Last modified: 2014-04-13 22:39:50 UTC
The MySQL-based search engine used by default does not appear to sort results in any meaningful way. I have written a small extension that extends SearchMySQL4 to sort by relevance (attachment follows), but the data set on my personal test wiki is not well suited to testing it. I'm a bit confused about MySQL fulltext search, and thus this extension may be completely pointless. The relevant documentation is at <http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html>.

A few observations:

* SearchMySQL4 uses the IN BOOLEAN MODE modifier (http://dev.mysql.com/doc/refman/4.1/en/fulltext-boolean.html). This appears to cause MySQL to report a relevance of 1.0 for anything that matches, making my patch pointless. The documentation confirms this behaviour: "They do not automatically sort rows in order of decreasing relevance". This also confirms the problem this bug report tries to address.

* After some testing, the way to get a weighted search result with boolean matching appears to be this:

    SELECT page_id, page_namespace, page_title, MATCH(si_text) AGAINST('Quux') as rank
    FROM `page`,`searchindex`
    WHERE page_id=si_page
      AND MATCH(si_text) AGAINST('Quux' IN BOOLEAN MODE)
      AND page_is_redirect=0
      AND page_namespace IN (0)
    ORDER BY rank DESC

* For some reason though, this sometimes gives a rank of zero (but still a boolean match) on entries that contain the search string (maybe a word length limit? That seems unlikely, though, for the things I've tried). Consequently, not using the BOOLEAN modifier at all causes some matches (the ones with rank 0) not to show.

As I said, I'm a bit confused, but this is probably worth looking into. The search feature would be vastly more useful with decent ranking.
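
For anyone who wants to try the query from the second observation with different search terms, it could be parameterized roughly like this. This is only a sketch, not what the attached extension does: the function name build_ranked_search_query is made up for illustration, the %s placeholders assume a Python DB-API driver that uses the "format" paramstyle, and the alias is renamed from rank to relevance purely as a choice for this sketch. The table and column names are the ones from the query above.

```python
def build_ranked_search_query(term, namespaces=(0,)):
    """Return (sql, params) for a boolean-filtered, relevance-ordered search.

    Hypothetical helper: the boolean-mode MATCH decides which rows match,
    while the plain (natural language) MATCH supplies the sort value.
    """
    if not namespaces:
        raise ValueError("at least one namespace is required")

    # One placeholder per namespace, e.g. "%s, %s" for two namespaces.
    ns_placeholders = ", ".join(["%s"] * len(namespaces))

    sql = (
        "SELECT page_id, page_namespace, page_title, "
        "MATCH(si_text) AGAINST(%s) AS relevance "
        "FROM page, searchindex "
        "WHERE page_id = si_page "
        "AND MATCH(si_text) AGAINST(%s IN BOOLEAN MODE) "
        "AND page_is_redirect = 0 "
        f"AND page_namespace IN ({ns_placeholders}) "
        "ORDER BY relevance DESC"
    )

    # The term is bound twice: once for ranking, once for the boolean filter.
    params = [term, term, *namespaces]
    return sql, params


if __name__ == "__main__":
    sql, params = build_ranked_search_query("Quux", (0,))
    print(sql)
    print(params)
```

Because the boolean match is what filters the rows, entries that get a rank of zero are still returned; they simply sort to the end of the result list.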
Created attachment 1761: extension modifying SearchMySQL4 to order by rank (not really functional, see initial comment)
Can this very old report now be reconsidered in light of Cirrus Search?
Nope, Cirrus has nothing to do with the core database-backed search implementation. It's up to core to implement this if it's still desired. It's not even a problem in Cirrus/MWSearch world at all.