Last modified: 2005-01-29 04:42:35 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T3368, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 1368 - Bad SQL syntax with discarded search terms [MySQL 3.x]
Bad SQL syntax with discarded search terms [MySQL 3.x]
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Search (Other open bugs)
unspecified
PC Windows XP
: Normal normal with 1 vote (vote)
: ---
Assigned To: Nobody - You can work on this!
:
: 1357 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-01-19 22:06 UTC by Vasco Alexandre da Silva Costa
Modified: 2005-01-29 04:42 UTC (History)
1 user (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Vasco Alexandre da Silva Costa 2005-01-19 22:06:19 UTC
Hello,

I am using mediawiki-1.4beta5 on MySQL 3.23.58.

When searching for some words, but curiously not all, I get this bug:

---
A database query syntax error has occurred. This may indicate a bug in the
software. The last attempted database query was:

    SELECT cur_id, cur_namespace, cur_title, cur_text FROM `cur`,`searchindex`
WHERE cur_id=si_page AND AND cur_namespace IN (0,11) LIMIT 0,20 

from within function "". MySQL returned error "1064: You have an error in your
SQL syntax near 'AND cur_namespace IN (0,11) LIMIT 0,20 ' at line 1 (localhost)".
---

I fixed this by changing this line in queryMain() at SearchMySQL3.php:
'WHERE cur_id=si_page AND ' . $match;

to:
'WHERE cur_id=si_page ' . $match;
Comment 1 Vasco Alexandre da Silva Costa 2005-01-23 02:19:29 UTC
(In reply to comment #0)
> I fixed this by changing this line in queryMain() at SearchMySQL3.php:
> 'WHERE cur_id=si_page AND ' . $match;
> 
> to:
> 'WHERE cur_id=si_page ' . $match;

Scratch that fix, it fixes some searches and breaks others. A proper fix needs
to be devised.
Comment 2 Brion Vibber 2005-01-23 02:21:33 UTC
Are the failures with words less than four letters or on the blacklist in FulltextStoplist.php?
Comment 3 Vasco Alexandre da Silva Costa 2005-01-23 18:26:22 UTC
(In reply to comment #2)
> Are the failures with words less than four letters or on the blacklist in
FulltextStoplist.php?

The problem word was six letters long (the actual query was "wonder" as in
Wonder of the World").
It is indeed in FulltextStoplist.php.

So, these words are dropped from searches? That seems a bit much (I mean, it has
words like "above" in it, what if I was searching for "Space Above and Beyond"?
only "space" isn't in that wordlist). Why not just stop the search after a
certain number of hits is found?

Thanks for the quick response.
Comment 4 Brion Vibber 2005-01-23 23:16:34 UTC
They are dropped from the search because the search would return 
*zero* results if they were included. The stop list and a (somewhat 
configurable) length limit is built into MySQL's full text search engine; 
if you search for a word on the stoplist or a too-short word no results 
are returned for your query.

Under MySQL 3.x we combine individual searches for multiple words 
with a boolean 'and' condition; this means searching for "the" and 
"beatles" will take 0 results for "the" and 1000 results for "beatles" and 
combine them to return... nothing. Thus words that we know will not 
match are stripped out prior to search, so the results for potentially 
matching terms will still be shown.

However there seems to be a problem with this stripping and the 
generated query is bogus, at least when you end up with _no_ terms to 
search for.

Under MySQL 4.x we use MySQL's expanded built-in boolean search 
mode which can ignore the blacklisted terms at its end.
Comment 5 Vasco Alexandre da Silva Costa 2005-01-24 15:41:36 UTC
>They are dropped from the search because the search would return 
>*zero* results if they were included. The stop list and a (somewhat 
>configurable) length limit is built into MySQL's full text search engine; 
>if you search for a word on the stoplist or a too-short word no results 
>are returned for your query.(In reply to comment #4)
> They are dropped from the search because the search would return 
> *zero* results if they were included. The stop list and a (somewhat 
> configurable) length limit is built into MySQL's full text search engine; 
> if you search for a word on the stoplist or a too-short word no results 
> are returned for your query.

So you are telling me that there is no way to search for these words in MySQL?
I wonder why it worked fine for that blacklisted word search when I made the change
in my first post. While if I remove that word from the blacklist, I get zero
results.
I do not remember having had prior user complaints when using 1.3.9 either.

Seems like a pretty inane MySQL feature. If I upgrade MySQL to version 4.x, will
it be
possible to actually search for these words and get results?
Comment 6 Brion Vibber 2005-01-28 00:40:14 UTC
Bug now fixed in CVS, valid queries will be issued. You still won't get results though. ;)

(In reply to comment #5)
> So you are telling me that there is no way to search for these words in MySQL?

You can if you remove them from the stop word list. (Also ignored are words appearing in over 50% of the searchable pages.)

> I wonder why it worked fine for that blacklisted word search when I made the change
> in my first post.

That modifed query would have returned every page in the database that matched the namespace check.

> While if I remove that word from the blacklist, I get zero results.

The stopword list is built into MySQL. Our check is to remove known stoplist words before we run the query, so that other 
queried words are able to return results. If you search for a stoplisted word anyway (by removing it from MediaWiki's copy of 
the stoplist), then MySQL will return no results.

> I do not remember having had prior user complaints when using 1.3.9 either.

1.3.9 does not contain this bug.

> Seems like a pretty inane MySQL feature. If I upgrade MySQL to version 4.x, will
> it be possible to actually search for these words and get results?

The boolean search in MySQL 4 may or may not adhere to the stopword list in the same fashion. MediaWiki does not attempt to 
enforce the list on its end when set for MySQL 4 because this is not required (other words searched still return results.) If 
MySQL itself is still not returning results for that word, then you must modify MySQL's stopword list. See the MySQL 
documentation for how to do this on dev.mysql.com.
Comment 7 Zigger 2005-01-29 04:42:35 UTC
*** Bug 1357 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links