Last modified: 2014-04-14 05:02:54 UTC
BUG MIGRATED FROM SOURCEFORGE
http://sourceforge.net/tracker/index.php?func=detail&aid=681366&group_id=34373&atid=411192
Originally submitted by Nobody/Anonymous - nobody 2003-02-06 01:18

Stopwords in English can be valid nontrivial words in other languages. Please allow searching for them! We cannot search for "an", "he", "me", etc. on the Polish Wikipedia. Nor can we search for "see also" and similar phrases, which were (unfortunately) left untranslated on many pages. --Youandme

------------------------- Additional comments ------------------------
Date: 2003-02-06 20:43
Sender: SF user vibber
When we upgrade mysql, I'll see if I can remove the stopword list. (It's a compile-time thing.)
-------------------------------------------------
Date: 2003-02-06 20:44
Sender: SF user vibber
When we upgrade mysql, I'll see if we can remove the stopword list. (It's a compiled-in thing, apparently.)
See also: http://meta.wikimedia.org/wiki/Stop_word_list http://en.wikipedia.org/wiki/Wikipedia:Common_words%2C_searching_for_which_is_not_possible
We dropped MySQL 3.x support with MediaWiki 1.6.
MySQL 4 and later still have a stopword list, though its behavior isn't as unpleasant as in previous versions. It would be nice if we could reliably disable it per table or something...
Yes, please override it with a customizable list of our own for users without Lucene search.
Created attachment 7143: MaxSem's slow patch
Best I could come up with, but still pretty slow: maintenance/rebuildtextindex.php runs 30% slower with it. I tested several solutions (one of them can be seen in the patch, commented out), but none of them had satisfactory performance. I therefore don't dare commit it to trunk. Leaving the patch here so that other folks can take a look at my approach.
*** Bug 25446 has been marked as a duplicate of this bug. ***
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
See http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html which says "To override the default stopword list, set the ft_stopword_file system variable. ... if you change the stopword file itself, you must rebuild your FULLTEXT indexes after making the changes and restarting the server. To rebuild the indexes in this case, it is sufficient to do a QUICK repair operation: REPAIR TABLE tbl_name QUICK;" So, while you can't "reliably disable it per table", you *can* disable it without compiling by setting ft_stopword_file to "", restarting, and then rebuilding the table.
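For anyone who wants to try the steps quoted above, here is a minimal sketch in Python that only generates the text to apply; it does not touch a server. It assumes the FULLTEXT index lives in MediaWiki's `searchindex` table with no table prefix, and that the option goes in the `[mysqld]` section of my.cnf. The helper names are made up for illustration and are not part of MediaWiki or MySQL.

```python
def mycnf_fragment() -> str:
    """Option-file lines that set ft_stopword_file to the empty string.

    Assumption: the setting belongs in the [mysqld] section; the server
    must be restarted after the change.
    """
    return "[mysqld]\nft_stopword_file = ''\n"


def repair_statements(tables):
    """Build one REPAIR TABLE ... QUICK statement per FULLTEXT table."""
    return [f"REPAIR TABLE `{name}` QUICK;" for name in tables]


if __name__ == "__main__":
    # Step 1: add this to my.cnf, then restart mysqld.
    print(mycnf_fragment())
    # Step 2: rebuild the FULLTEXT indexes (table name is an assumption).
    for stmt in repair_statements(["searchindex"]):
        print(stmt)
```

Wikis installed with a table prefix would need the prefixed table name passed in instead.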
(In reply to comment #8)
> So, while you can't "reliably disable it per table", you *can* disable it
> without compiling by setting ft_stopword_file to "", restarting, and then
> rebuilding the table.
A task for the installer?
Just checking: in the times of Cirrus Search, are MySQL's stopwords in English causing any trouble to searches in non-English wikis?
No, nothing like this from the SQL search implementation affects Cirrus' implementation.