Last modified: 2014-04-13 22:39:50 UTC
The MySQL-based search engine used by default does not appear to sort results in
any meaningful way. I have written a small extension that extends SearchMySQL4
to sort by relevance (attachment follows), but the data set on my
personal test wiki is not well suited to testing it.
I'm a bit confused about MySQL fulltext search, and thus this extension may be
completely pointless. The relevant documentation is at
<http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html>. A few observations:
* SearchMySQL4 uses the IN BOOLEAN MODE modifier
(http://dev.mysql.com/doc/refman/4.1/en/fulltext-boolean.html). This appears to
cause MySQL to report the relevance as 1.0 for anything that matches, making my
patch pointless. The documentation confirms this behaviour: "They do not
automatically sort rows in order of decreasing relevance". This also confirms
the problem this bug report tries to address.
* After some testing, the way to get a weighted search result with boolean
matching appears to be this:
SELECT page_id, page_namespace, page_title,
MATCH(si_text) AGAINST('Quux') AS rank
FROM page, searchindex
WHERE page_id = si_page
AND MATCH(si_text) AGAINST('Quux' IN BOOLEAN MODE)
AND page_namespace IN (0)
ORDER BY rank DESC
* For some reason, though, this sometimes gives a rank of zero (but still a
boolean match) on entries that contain the search string (maybe a word-length
limit? That seems unlikely for the things I've tried). Consequently, not using
the BOOLEAN modifier at all causes some matches (the ones with rank 0) not to show.
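The approach from the observations above (filter with a boolean match, order by the natural-language relevance) can be sketched as a small query builder. This is only an illustration, not the attached extension: the function name `build_ranked_search` is hypothetical, the join between `page` and `searchindex` on `si_page` is an assumption about the MediaWiki schema, and the `%s` placeholders assume a Python MySQL driver that uses the "format" parameter style.

```python
def build_ranked_search(term, namespaces=(0,)):
    """Return (sql, params) for a boolean-filtered, relevance-ordered search.

    Hypothetical helper; table and column names are assumed from the
    MediaWiki schema (page, searchindex, si_page, si_text).
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")

    # One placeholder per namespace, so values are never interpolated directly.
    ns_placeholders = ", ".join(["%s"] * len(namespaces))

    sql = (
        "SELECT page_id, page_namespace, page_title, "
        # Natural-language MATCH yields a graded relevance score.
        # Note: `rank` is a reserved word in MySQL 8.0 and would need
        # backtick quoting or a different alias there.
        "MATCH(si_text) AGAINST(%s) AS rank "
        "FROM page, searchindex "
        "WHERE page_id = si_page "
        # Boolean MATCH decides which rows are returned at all.
        "AND MATCH(si_text) AGAINST(%s IN BOOLEAN MODE) "
        f"AND page_namespace IN ({ns_placeholders}) "
        "ORDER BY rank DESC"
    )
    # The search term is bound twice: once for ranking, once for filtering.
    params = (term, term) + tuple(namespaces)
    return sql, params


if __name__ == "__main__":
    query, args = build_ranked_search("Quux", (0,))
    print(query)
    print(args)
```

Because the boolean match does the filtering, rows whose natural-language rank is zero are still returned; they simply sort last. One possible cause of such zero ranks, going by the MySQL documentation, is that natural-language mode on MyISAM tables ignores words that occur in half or more of the rows, while boolean mode does not apply that threshold.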
As I said, I'm a bit confused, but this is probably worth looking into. The
search feature would be vastly more useful with decent ranking.
Created attachment 1761
extension modifying SearchMySQL4 to order by rank (not really functional, see initial comment)
Can this very old report now be viewed in the light of Cirrus Search?
Nope, Cirrus has nothing to do with the core database-backed search implementation. It's up to core to implement this if it's still desired.
It's not even a problem in Cirrus/MWSearch world at all.