Last modified: 2010-05-15 15:50:55 UTC

Bug 10699 - full text search phrases
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Search
Version: 1.10.x
Hardware: All All
Importance: Normal enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody
Keywords: easy, patch, patch-need-review
Depends on:
Blocks:
Reported: 2007-07-25 19:32 UTC by Ben Lentz
Modified: 2010-05-15 15:50 UTC (History)

Attachments
hack to add double quote search phrase functionality (887 bytes, patch)
2007-07-25 19:32 UTC, Ben Lentz

Description Ben Lentz 2007-07-25 19:32:41 UTC
Created attachment 3944
hack to add double quote search phrase functionality

I have a pool of users who are hesitant to adopt our MediaWiki installation as a corporate document repository because of a problem with phrase searching. Our user community wants to enter search strings enclosed in double quotes to find multiple words that are adjacent to each other, rather than entering multiple search terms that may each appear separately in different parts of the document.

I was surprised to find, on the http://www.mediawiki.org/wiki/Help:Searching and http://en.wikipedia.org/wiki/Help:Searching pages, that this isn't a supported function:

Even if you enclose a phrase in quotes, the search looks for each word individually. e.g. if you enter "world war 2" it will return pages that contain "world" and "war" and "2".

Phrase: There is no method for searching for a phrase. Contrary to what you might expect, enclosing phrases in double quotation marks such as "can of tuna" will retrieve all pages containing "of" "tuna" and "can".

I was even more surprised when I realized that my installation had full text support in MySQL 4.1+, using the Boolean Full-Text Searches option. From http://dev.mysql.com/doc/refman/4.1/en/fulltext-boolean.html:

A phrase that is enclosed within double quote (‘"’) characters matches only rows that contain the phrase literally, as it was typed.

Through some digging, I found that MediaWiki is actually stripping the double-quote characters out, even though the user performing the search intended to have them in place and the underlying database search functions support it.
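
To illustrate the behavior described above, here is a minimal Python sketch (not MediaWiki's actual PHP code). The function names and the exact character classes are assumptions made for illustration only. It shows how a filter that strips double quotes turns a phrase into independent required words, while a filter that lets quotes through preserves the phrase for MySQL's boolean full-text mode.

```python
import re

def strip_all_punctuation(query: str) -> str:
    """Hypothetical over-strict filter: drops everything except word
    characters and whitespace, so double quotes are lost."""
    return re.sub(r"[^\w\s]", "", query)

def keep_phrase_quotes(query: str) -> str:
    """Hypothetical looser filter: same as above, but double quotes survive."""
    return re.sub(r'[^\w\s"]', "", query)

def to_boolean_query(filtered: str) -> str:
    """Prefix each bare word or quoted phrase with '+', which marks it as
    required in a MySQL boolean-mode full-text search string."""
    tokens = re.findall(r'"[^"]+"|\S+', filtered)
    return " ".join("+" + token for token in tokens)

user_input = '"can of tuna"'

# Quotes stripped: three separate required words, anywhere in the row.
words_only = to_boolean_query(strip_all_punctuation(user_input))

# Quotes kept: one required phrase, matched literally.
with_phrase = to_boolean_query(keep_phrase_quotes(user_input))
```

The resulting string would be passed as the search expression of a `MATCH ... AGAINST (... IN BOOLEAN MODE)` clause; the table and column names involved are left out here because they depend on the installation.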

I have hacked together a patch to make this work for the user community, but it's got some limitations (obvious upon review to those who know the code). If there's any chance that search phrases will be implemented in a future version of MediaWiki, please let me know. Our user base thinks they've got a real need for this feature.

Thanks in advance
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-07-25 19:44:39 UTC
Note that Wikipedia doesn't actually use MySQL fulltext search, it uses the Lucene extension.  The latest version of the LuceneSearch extension appears to support this: compare a search for 'lower learning' and '"lower learning"' on Wikipedia.  The former returns vastly more results, 7844 instead of 4.

This seems reasonable as a patch to the default fulltext search, but I don't know much of anything about MySQL fulltext search or our support for it, so I'll ask someone else to look at it.
Comment 2 Brion Vibber 2007-09-11 18:51:32 UTC
Fixed in r25794, along with bug 4021.

Phrase search was always meant to work, but never quite made it through the filtering stage, whoops. :P

The filter now uses an expanded set of chars, while the original set is still used when parsing through the query to get the regexes for result highlighting.
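
As a rough illustration of the highlighting step mentioned in this comment, the following Python sketch derives highlight regexes from a query, treating a quoted phrase as a single unit. It is not the code from r25794; the function names, the `<b>` markup, and the tokenizing rules are all assumptions.

```python
import re
from typing import List

def highlight_patterns(query: str) -> List[str]:
    """Return one regex source string per search term. A double-quoted
    phrase becomes a single pattern; bare words become one pattern each.
    (Hypothetical helper, for illustration only.)"""
    patterns = []
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        term = (phrase or word).strip('"').strip()
        if term:
            patterns.append(r"\b" + re.escape(term) + r"\b")
    return patterns

def highlight(text: str, query: str) -> str:
    """Wrap every match of any search term in <b>...</b>."""
    patterns = highlight_patterns(query)
    if not patterns:
        return text
    # A single combined regex avoids re-matching inside inserted markup.
    combined = re.compile("|".join(patterns), re.IGNORECASE)
    return combined.sub(lambda m: "<b>" + m.group(0) + "</b>", text)

phrase_result = highlight("A can of tuna and a can.", '"can of tuna"')
words_result = highlight("World War 2", "world war")
```

With the phrase query, only the adjacent words are highlighted as one span; the later standalone "can" is left alone, which mirrors the distinction the search itself is meant to make.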
