Last modified: 2010-05-15 15:42:43 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T9726, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 7726 - Searches less than 4 characters long don't work
Product: MediaWiki
Classification: Unclassified
Component: Search
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: Andrew Garrett
Duplicates: 13423
Depends on:
Blocks: 42
Reported: 2006-10-27 12:14 UTC by Henry Cocozzoli
Modified: 2010-05-15 15:42 UTC (History)


Proposed patch (1.50 KB, patch)
2008-10-27 14:22 UTC, Andrew Garrett

Description Henry Cocozzoli 2006-10-27 12:14:40 UTC
Search for "files". Get a list of docs.
Search for "JPG". Get no results, even though "JPG" appears in the results of the previous search.

Wiki running on Ubuntu Linux 6.06 LTS
MediaWiki: 1.8.2 (r14403) 
PHP: 5.1.2 (apache2handler) 
MySQL: 5.0.22-Debian_0ubuntu6.06.2-log 
visited from
Retrieved from "http://calvin/wiki/index.php/Special:Version"

Accessed from Win XP SP2  IE 6 or 7.
Comment 1 Brion Vibber 2006-10-27 12:16:26 UTC
See the FAQ.
Comment 2 Brion Vibber 2008-03-18 20:37:23 UTC
*** Bug 13423 has been marked as a duplicate of this bug. ***
Comment 3 Brion Vibber 2008-03-19 00:39:03 UTC
Reopening this...

We could work around this by applying a transformation on input to bypass the server-wide length limit. Lame perhaps, but at least it would make things work for people. :)
Comment 4 Andrew Garrett 2008-10-27 14:22:07 UTC
Created attachment 5477
Proposed patch

Also cleans up the use of /e modifier (which isn't allowed in new code at the moment) for the rest of the function in question, splitting out into callback functions.

Essentially, it just prefixes any words 4 chars or shorter with 'SMALL', bringing them above the threshold.

We might want to consider looking for certain 'stop' words and disallowing those regardless...
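
The idea described above can be sketched roughly as follows. This is an illustration in Python, not the attached patch (which is PHP and is not reproduced in this report); the function name pad_short_words and the constants are hypothetical, and the "4 chars or shorter" threshold is taken from the comment as written.

```python
import re

PREFIX = "SMALL"    # marker named in the comment above
MAX_SHORT_LEN = 4   # "4 chars or shorter", as stated in the comment

# Matches whole words of 1..MAX_SHORT_LEN word characters.
SHORT_WORD = re.compile(r"\b\w{1,%d}\b" % MAX_SHORT_LEN)

def pad_short_words(text: str) -> str:
    """Prefix every short word with PREFIX so it exceeds the length limit."""
    return SHORT_WORD.sub(lambda m: PREFIX + m.group(0), text)

print(pad_short_words("JPG files"))
```

For the approach to work, the same transformation would presumably have to be applied both to the text written into the search index and to the user's query, so that the padded forms match each other.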
Comment 5 Brion Vibber 2008-11-25 02:45:18 UTC
Fixed in r43920 -- Short words are padded so they now get indexed. Yay!

Adapted part of Werdna's patch, with some additional cleanup:
* Using 'U00' to pad instead of 'SMALL' to reduce false positives (e.g. a search for "small*" could match both "Smallville" and "SMALLc")
* Checking server's ft_min_word_len variable to see if we need to do anything. This preserves index compatibility with existing installations which have customized their index length.
* Some further cleanup on redundant code -- just toss everything through lc() and be done with it :D
* Cleaned out some more evals in zh and yue classes :P
* Fixed yue class to call the parent adjustor properly
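
A rough sketch of the first three points above, in Python rather than the PHP of r43920 (whose code is not shown in this report). The function name normalize_for_index is hypothetical, and the server's ft_min_word_len value is passed in as a plain argument instead of being queried from MySQL. Whether the real fix places the pad before or after the word is not stated in this thread; the sketch assumes a prefix, as in the proposed patch.

```python
import re

PAD = "U00"  # pad string named in the comment above

def normalize_for_index(text: str, ft_min_word_len: int = 4) -> str:
    """Lower-case the text and pad words shorter than ft_min_word_len.

    ft_min_word_len mirrors the MySQL server variable of that name
    (default 4). Placing PAD in front of the word is an assumption.
    """
    out = text.lower()
    if ft_min_word_len <= 1:
        # Every word already meets the server's minimum; nothing to pad.
        return out
    n = ft_min_word_len - 1
    short_word = re.compile(r"\b\w{1,%d}\b" % n)
    return short_word.sub(lambda m: PAD + m.group(0), out)

print(normalize_for_index("JPG files"))
print(normalize_for_index("JPG files", ft_min_word_len=3))
```

With the default limit of 4, the three-letter word is padded and the five-letter word is left alone; with a customized limit of 3, nothing is padded, which is how checking the server variable keeps existing indexes compatible.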
Comment 6 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-03-02 15:14:34 UTC
The fix for this seems to have caused bug 17733.
