Last modified: 2013-12-02 12:00:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T11598, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 9598 - Apostrophes should not be indexed as part of words
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: lucene-search-2
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2007-04-16 22:35 UTC by Gildas
Modified: 2013-12-02 12:00 UTC (History)
Description Gildas 2007-04-16 22:35:48 UTC
Hello everybody!

I'm French, and I use MediaWiki for my own use.

I have a problem with it. For example, I made a page "la Manoeuvre d'ortolani" and wrote this text in it:

"la manoeuvre d'ortolani est destinée à tester la hanche des nourissons"

Well, OK, it's in French ;-)

The big problem is this: when I search for the word ORTOLANI, MediaWiki finds ... nothing, because for
MediaWiki "D'Ortolani" is ONE WORD, but in French "D'ortolani" is a contraction of TWO WORDS:
"DE + ORTOLANI" = "D'ORTOLANI"

I'm tired of modifying every page to work around this by writing:

"Manoeuvre d'ortolani (ortolani)"

Could you try to fix it?

Thank you.

Sorry, my English is ... really bad.
Comment 1 Antoine "hashar" Musso (WMF) 2007-05-04 20:25:36 UTC
The search engine uses MySQL's full-text search. According to
the documentation [1], the apostrophe ' and the underscore _ are treated
as part of a word.

The bug was fixed in MySQL 5.1.6 [2].

Marking bug as 

[1] http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html
[2] MySQL bug report : http://bugs.mysql.com/bug.php?id=14194
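
To illustrate the behaviour discussed in this report, here is a minimal sketch in Python. It is not MySQL's or Lucene's actual parser: both tokenizer functions and their regular expressions are hypothetical, written only to show why a search for "ortolani" misses when the apostrophe is treated as a word character.

```python
import re

# Hypothetical tokenizers for illustration only; neither reproduces
# the real MySQL full-text parser or the Lucene analyzer.

def tokenize_keeping_apostrophes(text):
    """Treat ' (and _, via \\w) as word characters: d'ortolani stays one token."""
    return [t.lower() for t in re.findall(r"[\w']+", text)]

def tokenize_splitting_apostrophes(text):
    """Treat ' as a separator: d'ortolani becomes the tokens d and ortolani."""
    return [t.lower() for t in re.findall(r"\w+", text)]

text = "la manoeuvre d'ortolani est destinée à tester la hanche des nourissons"

kept = tokenize_keeping_apostrophes(text)
split = tokenize_splitting_apostrophes(text)

# With the apostrophe kept, the index holds "d'ortolani" but not "ortolani".
print("ortolani" in kept)
# With the apostrophe as a separator, "ortolani" is indexed on its own.
print("ortolani" in split)
```

Under the first tokenizer a query for "ortolani" has no exact token to match, which is the symptom the reporter describes; under the second it does.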
Comment 2 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-05-06 00:52:52 UTC
That's MySQL fulltext.  Wikimedia doesn't use that, so I doubt the bug was filed
with that in mind.  This should be filed against Lucene, presumably.  I don't
know if this is fixable on our end, though?
Comment 3 Antoine "hashar" Musso (WMF) 2007-05-06 00:54:53 UTC
Lucene doesn't have this problem. Reclosing.
Comment 4 Andre Klapper 2013-03-26 11:24:34 UTC
[Merging "MediaWiki extensions/Lucene Search" into "Wikimedia/lucene-search2", see bug 46542. You can filter bugmail for: search-component-merge-20130326 ]
Comment 5 Akeron 2013-11-29 11:01:59 UTC
This bug is still present.

For example, a search for "Arc-en-Ciel" doesn't directly match any "L'Arc-en-Ciel" (with L'): https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=default&search=Arc-en-Ciel&fulltext=Search

Compare with a search for "L'Arc-en-Ciel": https://en.wikipedia.org/w/index.php?search=L%27Arc-en-Ciel&title=Special%3ASearch&fulltext=1

The "L'" should not be part of the word; it's a separate word.
Comment 6 Andre Klapper 2013-12-02 11:26:54 UTC
Akeron: Could you please file a new bug report? This one was closed six years ago, and the causes of the problem reappearing now are likely different. Thanks!
