Last modified: 2014-07-07 12:21:25 UTC
Hello: I think it would be very useful to be able to search within the history of Wikipedia articles. This way, for example, one could avoid re-adding something that has already been reasonably discarded several times in past revisions. This new feature could be a new namespace box next to the ones already available: (Main), Talk, User, User talk, Wikipedia, Wikipedia talk, Image, Image talk, MediaWiki, MediaWiki talk, Template, Template talk, Help, Help talk, Category, Category talk, Portal, Portal talk. I hope this new feature will soon be available in all language versions of Wikipedia. Thanks and regards
If not implemented with sufficient intelligence, this would increase the size of the search index by a factor of about 16 (for enwiki, as an example). If done intelligently (only indexing deltas) I figure it would only be two to four times the size, but that would probably not be particularly easy. Either way I don't foresee this happening soon. Note that if implemented, this would largely obviate the need for bug 639.
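To illustrate what "only indexing deltas" could mean, here is a minimal sketch in Python. It is not how any MediaWiki search backend works; the function names (`tokenize`, `build_full_index`, `build_delta_index`, `posting_count`) and the naive word tokenizer are assumptions made for this example. It compares a full index, which records every token of every revision, with a delta index, which records a token only for the revisions that introduce it relative to the previous revision.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Return the set of lowercase word tokens (deliberately naive)."""
    return set(re.findall(r"\w+", text.lower()))


def build_full_index(revisions):
    """Map each token to every revision id whose text contains it.

    revisions: list of (revision_id, text) in chronological order.
    """
    index = defaultdict(list)
    for rev_id, text in revisions:
        for token in sorted(tokenize(text)):
            index[token].append(rev_id)
    return dict(index)


def build_delta_index(revisions):
    """Map each token to the revision ids that (re)introduced it.

    A token is recorded for a revision only if the previous revision
    did not contain it, so unchanged text is not indexed again.
    """
    index = defaultdict(list)
    previous = set()
    for rev_id, text in revisions:
        current = tokenize(text)
        for token in sorted(current - previous):
            index[token].append(rev_id)
        previous = current
    return dict(index)


def posting_count(index):
    """Total number of (token, revision) entries, a rough size measure."""
    return sum(len(rev_ids) for rev_ids in index.values())
```

The trade-off in this sketch is that the delta index answers "which revision introduced this word?" but not directly "which revisions contained it?"; answering the latter would also require recording removals. How much space a real delta index saves depends on how much text changes between revisions.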
Too expensive at present, but we'd love to have this eventually.
*** Bug 13850 has been marked as a duplicate of this bug. ***
Thanks Brion, sorry for the duplicate. A possible stopgap until this feature is implemented occurs to me: a "download complete history (so many KB)" link that dumps the whole history to the user's desktop as a (zipped?) HTML file concatenating all versions (ideally, but not necessarily, with diffs highlighted), which the user could then search at leisure on their own computer.
You can in fact download the full history of a given page via Special:Export. It's not necessarily super pretty, but should work.
Thanks Brion, just tried it and it worked. I was indeed able to search the downloaded XML quite easily directly in Firefox, but unfortunately:
1/ it lists only revisions 1 to 100 (the latest revision listed for the Anode article is ~2 years old)
2/ it's not easy to access (I had never used the toolbox before, and it took me some time to find that export page)
A full download, via a link in the article's history page labeled e.g. "Full history in XML format (*** revisions, *** kB)", would be wonderful.
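Searching a downloaded Special:Export file can also be scripted instead of done in a browser. The following is a minimal Python sketch, assuming the export contains `<revision>` elements with `<id>`, `<timestamp>` and `<text>` children; the function name `search_export` is hypothetical, and the code ignores the XML namespace so it does not depend on a particular export schema version.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an element tag, if present."""
    return tag.rsplit("}", 1)[-1]


def search_export(xml_text, needle):
    """Return (revision id, timestamp) for each revision containing needle.

    xml_text: contents of a Special:Export download, as a string.
    The match is a case-insensitive substring search on the revision text.
    """
    root = ET.fromstring(xml_text)
    hits = []
    for elem in root.iter():
        if local_name(elem.tag) != "revision":
            continue
        # Only direct children are read, so a contributor's nested <id>
        # is not confused with the revision's own <id>.
        fields = {local_name(child.tag): (child.text or "") for child in elem}
        if needle.lower() in fields.get("text", "").lower():
            hits.append((fields.get("id", ""), fields.get("timestamp", "")))
    return hits
```

This only searches the revisions that the export actually contains, so it is subject to the same revision limit mentioned above.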
*** Bug 15019 has been marked as a duplicate of this bug. ***
(In reply to comment #1)
> If not implemented with sufficient intelligence, this would increase the size
> of the search index by a factor of about 16 (for enwiki, as an example). If
> done intelligently (only indexing deltas) I figure it would only be two to four
> times the size, but that would probably not be particularly easy. Either way I
> don't foresee this happening soon. Note that if implemented, this would
> largely obviate the need for bug 639.

We could simply implement the infrastructure in core MediaWiki and leave the indexing task to a search extension such as Lucene-search. This would let the search extension decide whether or not to provide historical search.

(In reply to comment #2)
> Too expensive at present, but we'd love to have this eventually.

Does "too expensive" refer to the indexing task? Is it also too expensive to implement just the infrastructure?
The Diffindexer (https://github.com/whym/diffindexer) in combination with Wikihadoop (https://github.com/whym/wikihadoop) offers exactly this functionality.
*** Bug 24641 has been marked as a duplicate of this bug. ***
[Removing RESOLVED LATER as discussed in http://lists.wikimedia.org/pipermail/wikitech-l/2012-November/064240.html . Reopening and setting priority to "Lowest". For future reference, please use either RESOLVED WONTFIX (for issues that will not be fixed), or simply set lowest priority. Thanks a lot!]
As you all know, this functionality is already offered on a per-page basis by WikiBlame: http://wikipedia.ramselehof.de/wikiblame.php I'm not convinced that this is something for Special:Search and it's surely more closely related to the history topic, hence changing component.
*** Bug 59620 has been marked as a duplicate of this bug. ***