Last modified: 2009-05-04 23:42:49 UTC
If you search for a word without diacritics, the search results include matches for the same word with diacritics. Similarly, if you search for a phrase with a hyphen, the results include matches with an en dash. (No doubt there are various other similar rules, and this behaviour is very much desired.) However, when you get such a match, the matched text is not displayed in the list of search results: you get just a link to the relevant page, without the extract(s) from that page's text which you would normally see in the results list if the match were exact.
Can you give an example URL? I'm guessing this is on Wikipedia (or another Wikimedia site) and may be due to a current mismatch between how the Lucene backend matches words and how the front end matches them when highlighting results. If so, I believe this should improve when the next version of the Lucene backend, which supports doing the highlighting itself, rolls out.
Example URLs:
http://en.wikipedia.org/wiki/Special:Search?search=Banach-Steinhaus&fulltext=Search (first result returned is Stefan Banach, but text is missing because in that article the reference contains an en dash rather than a hyphen)
http://en.wikipedia.org/wiki/Special:Search?search=sniezycowy&fulltext=Search (two results returned, but text missing because the articles contain Sniezycowy with Polish diacritics)
This also happens for stemmed words, transliterations and words in different scripts (variants). As noted in #1, it is due to the mismatch between MediaWiki highlighting and backend functionality. It will be solved when we switch highlighting to the backend.
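To illustrate the mismatch described above, here is a minimal Python sketch. It is not the MediaWiki or Lucene code; the function names (`fold_with_offsets`, `exact_snippet`, `folded_snippet`) and the folding rules (lowercasing, stripping combining marks, mapping en/em dashes to a hyphen) are assumptions made for illustration only. It shows why an exact-match highlighter finds nothing to extract when the index matched a folded form, and how applying the same folding at highlight time recovers the extract.

```python
import unicodedata

# Assumed folding rule for this sketch: en dash and em dash count as a hyphen.
DASH_MAP = {"\u2013": "-", "\u2014": "-"}


def fold_with_offsets(text):
    """Return (folded_text, offsets); offsets[i] is the index in `text`
    of the original character that produced folded_text[i]."""
    folded = []
    offsets = []
    for i, ch in enumerate(text):
        ch = DASH_MAP.get(ch, ch)
        # NFD splits a letter such as "Ś" into "S" plus a combining mark.
        for part in unicodedata.normalize("NFD", ch):
            if unicodedata.combining(part):
                continue  # drop the diacritic
            for low in part.lower():
                folded.append(low)
                offsets.append(i)
    return "".join(folded), offsets


def exact_snippet(text, query, context=20):
    """Highlighter that only finds literal matches (the failing case)."""
    pos = text.find(query)
    if pos < 0:
        return None
    return text[max(0, pos - context): pos + len(query) + context]


def folded_snippet(text, query, context=20):
    """Highlighter that folds text and query the same way before matching."""
    folded_text, offsets = fold_with_offsets(text)
    folded_query, _ = fold_with_offsets(query)
    if not folded_query:
        return None
    pos = folded_text.find(folded_query)
    if pos < 0:
        return None
    # Map the folded match back to a span of the original text.
    start = offsets[pos]
    end = offsets[pos + len(folded_query) - 1] + 1
    return text[max(0, start - context): end + context]
```

With a page text of `"The Banach\u2013Steinhaus theorem"` and the query `"Banach-Steinhaus"`, `exact_snippet` returns `None` (the symptom reported here), while `folded_snippet` returns an extract from the original text with the en dash intact. Stemming and script variants would need the backend's own analysis chain, which is why doing the highlighting in the backend is the more general fix.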
Mass-closing open Lucene Search issues as WONTFIX because the Lucene Search extension was removed and replaced by MWSearch. Please set to REOPENED if the behaviour still exists with another component, and update the domain.
Mass REOPEN after discussion with Robert. Domain: Wikimedia/lucene-search-2. Assigned to maintainer.
We are now using our custom snippet-extraction backend on WMF wikis, so this doesn't happen any more.