Last modified: 2014-03-05 16:42:54 UTC
When I search for a word, the search engine fails to find the same word when it is preceded by an apostrophe, so a search for "apostrophe" doesn't find occurrences of "L'apostrophe". In French the apostrophe is not part of the word; it marks a contraction of "La apostrophe".

For example: https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=default&search=Arc-en-Ciel&fulltext=Search
"Arc-en-Ciel" doesn't match any "L'Arc-en-Ciel" (with L') directly.

Compare with a search for "L'Arc-en-Ciel": https://en.wikipedia.org/w/index.php?search=L%27Arc-en-Ciel&title=Special%3ASearch&fulltext=1

In French, as in English, the apostrophe should not be indexed as part of the word.

Note: this is the same bug as https://bugzilla.wikimedia.org/show_bug.cgi?id=9598 (old).
See also a different apostrophe usage in Ukrainian: https://bugzilla.wikimedia.org/show_bug.cgi?id=21002
This problem still exists in CirrusSearch. Migrating bug to correct queue.
The problem here is that the language rules are customized for the wiki's language. Elision is handled in French but not in English. I wonder how much harm it would do to just add it to English (and maybe other languages) as well. Here are the term prefixes that would be removed:

l' m' t' qu' n' s' j' d' c' jusqu' quoiqu' lorsqu' puisqu'

We wouldn't add it to the plain analyzer, so if you search for "l'avion" then "l'avion" will be worth more than "avion".
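To make the proposal concrete, here is a rough Python sketch of the elision stripping described above. It is not the CirrusSearch or Elasticsearch implementation; the names ELISIONS, strip_elision and index_terms are hypothetical, and treating the typographic apostrophe (U+2019) the same as the straight one is an assumption.

```python
import re

# Prefixes from the list above, without the trailing apostrophe.
ELISIONS = ("l", "m", "t", "qu", "n", "s", "j", "d", "c",
            "jusqu", "quoiqu", "lorsqu", "puisqu")

# A prefix followed by an apostrophe at the start of a token.
# Accepting U+2019 as well as "'" is an assumption of this sketch.
_ELISION_RE = re.compile(
    r"^(?:%s)['\u2019]" % "|".join(sorted(ELISIONS, key=len, reverse=True)),
    re.IGNORECASE,
)


def strip_elision(token):
    """Return the token without a leading elided prefix, if it has one."""
    stripped = _ELISION_RE.sub("", token, count=1)
    # Never reduce a token to the empty string.
    return stripped if stripped else token


def index_terms(token):
    """Hypothetical two-field view: the plain form keeps the prefix,
    the text form drops it, so an exact query can match both fields."""
    plain = token.lower()
    text = strip_elision(token).lower()
    return {"plain": plain, "text": text}
```

With this, index_terms("L'Arc-en-Ciel") gives "l'arc-en-ciel" for the plain field and "arc-en-ciel" for the text field. A query for "Arc-en-Ciel" would then match on the text field, while a query for "L'Arc-en-Ciel" would match on both, which is why the exact form would rank higher. English contractions such as "don't" are left alone because the apostrophe does not follow one of the listed prefixes.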