Last modified: 2014-07-07 12:21:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T12643, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 10643 - Search within history of articles
Status: REOPENED
Product: MediaWiki
Classification: Unclassified
Component: History/Diffs
Version: unspecified
Hardware: All All
Importance: Lowest enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/wiki/Special:...
Duplicates: 13850 15019 24641 59620
Depends on:
Blocks: 20784
Reported: 2007-07-19 15:30 UTC by edupedro
Modified: 2014-07-07 12:21 UTC (History)
CC List: 7 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---



Description edupedro 2007-07-19 15:30:00 UTC
Hello:

I think it would be very useful to be able to search within the history of Wikipedia articles. That way, for example, one could avoid re-adding something that has already been discarded, with good reason, several times in past revisions.

This new feature could be a new namespace box placed next to the ones already available:
(Main)  Talk  User  User talk  Wikipedia  Wikipedia talk  Image  Image talk  MediaWiki  MediaWiki talk  Template  Template talk  Help  Help talk  Category  Category talk  Portal  Portal talk

I hope this new feature will soon be available in all the language versions of Wikipedia.

Thanks and regards
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-07-19 20:18:09 UTC
If not implemented with sufficient intelligence, this would increase the size of the search index by a factor of about 16 (for enwiki, as an example).  If done intelligently (only indexing deltas) I figure it would only be two to four times the size, but that would probably not be particularly easy.  Either way I don't foresee this happening soon.  Note that if implemented, this would largely obviate the need for bug 639.
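To make the difference between the two approaches above concrete, here is a minimal Python sketch. It is a toy model only, not how MediaWiki or any search backend actually works: the function names, the whitespace tokenizer, and the sample revisions are all hypothetical. It compares the number of index postings when every revision is indexed in full with the number when only the tokens newly introduced by each revision (the "deltas") are indexed.

```python
from collections import defaultdict


def tokenize(text):
    """Very naive tokenizer (assumption): lowercase, split on whitespace."""
    return set(text.lower().split())


def build_full_index(revisions):
    """Index every token of every revision: token -> set of revision ids."""
    index = defaultdict(set)
    for rev_id, text in revisions:
        for token in tokenize(text):
            index[token].add(rev_id)
    return index


def build_delta_index(revisions):
    """Index only the tokens a revision adds relative to the previous one."""
    index = defaultdict(set)
    previous = set()
    for rev_id, text in revisions:
        current = tokenize(text)
        for token in current - previous:
            index[token].add(rev_id)
        previous = current
    return index


def posting_count(index):
    """Total number of (token, revision) pairs stored in the index."""
    return sum(len(rev_ids) for rev_ids in index.values())


# Hypothetical page history, oldest revision first.
revisions = [
    (1, "anode is an electrode"),
    (2, "anode is an electrode through which current flows"),
    (3, "an anode is an electrode through which current flows"),
]

full = build_full_index(revisions)
delta = build_delta_index(revisions)
print(posting_count(full), posting_count(delta))
```

The delta index answers "which revision introduced this word?" cheaply, but answering "which revisions contained this word?" would also require recording removals, which is part of why the intelligent variant is not particularly easy.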
Comment 2 Brion Vibber 2007-10-01 13:15:54 UTC
Too expensive at present, but we'd love to have this eventually.
Comment 3 Brion Vibber 2008-04-29 20:21:12 UTC
*** Bug 13850 has been marked as a duplicate of this bug. ***
Comment 4 micheljull 2008-04-29 21:38:03 UTC
Thanks Brion, sorry for the duplicate. A possible way to make up for the lack of this feature until it's implemented occurs to me: a "download complete history (so many KB)" link that would let the user dump the whole history to their desktop as a (zipped?) HTML file concatenating all versions (ideally but not necessarily with diffs highlighted), which they could then search at leisure on their own computer.
Comment 5 Brion Vibber 2008-04-29 21:51:19 UTC
You can in fact download the full history of a given page via Special:Export. It's not necessarily super pretty, but should work.
Comment 6 micheljull 2008-04-30 11:07:28 UTC
Thanks Brion, just tried it, it worked, and indeed I was able to search the downloaded xml quite easily directly in Firefox, but unfortunately:

1/ it lists only revisions 1 to 100 (latest revision listed in Anode article is ~2 years old)

2/ it's not easy to access (had never used the toolbox before, took me some time to find that export page)

A full download, via a link in the article's history page labeled, e.g.:

Full history in xml format (*** revisions, *** kB)

would be wonderful.
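As a rough illustration of the workaround discussed in the last two comments, the following Python sketch searches every revision in a Special:Export dump for a string. It assumes the usual export layout of `<page>` elements containing `<revision>` elements with `<id>`, `<timestamp>` and `<text>` children; the sample document, its namespace URI, and the `search_history` helper are hand-written for this example and are not taken from a real export.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def search_history(xml_text, needle):
    """Return (revision id, timestamp) for each revision containing needle."""
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        if local(elem.tag) != "revision":
            continue
        # Collect the direct children of <revision> by local tag name.
        fields = {local(child.tag): (child.text or "") for child in elem}
        if needle.lower() in fields.get("text", "").lower():
            hits.append((fields.get("id"), fields.get("timestamp")))
    return hits


# Simplified, hand-written sample in the style of an export file.
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Anode</title>
    <revision>
      <id>1</id>
      <timestamp>2008-04-29T21:00:00Z</timestamp>
      <text>An anode is an electrode.</text>
    </revision>
    <revision>
      <id>2</id>
      <timestamp>2008-04-30T11:00:00Z</timestamp>
      <text>An anode is an electrode; compare cathode.</text>
    </revision>
  </page>
</mediawiki>"""

for rev_id, timestamp in search_history(SAMPLE, "cathode"):
    print(rev_id, timestamp)
```

This only finds what the export contains, so the 100-revision limit noted above still applies to the downloaded file.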
Comment 7 Brion Vibber 2008-08-04 06:00:33 UTC
*** Bug 15019 has been marked as a duplicate of this bug. ***
Comment 8 Anon Sricharoenchai 2009-09-24 10:58:48 UTC
(In reply to comment #1)
> If not implemented with sufficient intelligence, this would increase the size
> of the search index by a factor of about 16 (for enwiki, as an example).  If
> done intelligently (only indexing deltas) I figure it would only be two to four
> times the size, but that would probably not be particularly easy.  Either way I
> don't foresee this happening soon.  Note that if implemented, this would
> largely obviate the need for bug 639.
> 

We can simply implement the infrastructure in MediaWiki core and leave the indexing task to
a search extension such as Lucene-search.
This would let the search extension decide whether or not to provide historical search.

(In reply to comment #2)
> Too expensive at present, but we'd love to have this eventually.
> 

Is "too expensive" about the indexing task?
Would it also be too expensive to implement just the infrastructure?
Comment 9 Diederik van Liere 2011-11-29 20:35:07 UTC
The Diffindexer (https://github.com/whym/diffindexer) in combination with Wikihadoop (https://github.com/whym/wikihadoop) offers exactly this functionality.
Comment 10 Diederik van Liere 2011-11-30 00:33:19 UTC
*** Bug 24641 has been marked as a duplicate of this bug. ***
Comment 11 Andre Klapper 2012-12-12 13:32:02 UTC
[Removing RESOLVED LATER as discussed in
http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064240.html .
Reopening and setting priority to "Lowest".
For future reference, please use either RESOLVED WONTFIX (for issues that will
not be fixed), or simply set lowest priority. Thanks a lot!]
Comment 12 Nemo 2013-08-29 16:44:34 UTC
As you all know, this functionality is already offered on a per-page basis by WikiBlame: http://wikipedia.ramselehof.de/wikiblame.php
I'm not convinced that this is something for Special:Search and it's surely more closely related to the history topic, hence changing component.
Comment 13 Andre Klapper 2014-07-07 12:21:25 UTC
*** Bug 59620 has been marked as a duplicate of this bug. ***


