Last modified: 2010-05-15 15:48:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia has migrated from Bugzilla to Phabricator. Bug reports should be created and updated in Wikimedia Phabricator instead. Please create an account in Phabricator and add your Bugzilla email address to it.
Wikimedia Bugzilla is read-only. If you try to edit or create any bug report in Bugzilla you will be shown an intentional error message.
In order to access the Phabricator task corresponding to a Bugzilla report, just remove "static-" from its URL.
You could still run searches in Bugzilla or access your list of votes but bug reports will obviously not be up-to-date in Bugzilla.
Bug 15229 - Postgres - Searches that include a number as a term always return no results
Postgres - Searches that include a number as a term always return no results
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
Search (Other open bugs)
1.9.x
PC OpenBSD
: Low minor (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks: postgres
  Show dependency treegraph
 
Reported: 2008-08-18 17:49 UTC by Reed Hedges
Modified: 2010-05-15 15:48 UTC (History)
1 user (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Reed Hedges 2008-08-18 17:49:27 UTC
I couldn't find anything about this in the MediaWiki manual, or in Bugzilla, in the Release Notes for 1.13, or in Google:

If I include a number (a numeral) in a search query, or a word that starts with a number, I always get 0 results, even though I know I have pages that have that number in them.  For example, if I have a wiki page about 64-bit PC architecture, I can search for "bit" and "pc" and "bit pc" and "pc bit" and see the page in the results, but if I search for "64 bit" or "64bit" or "64bit PC" or "64 bit architecture" or "pc 64 bit architecture" I always get 0 results.  

Thanks



    * MediaWiki: 1.9.1
    * PHP: 5.2.3 (apache)
    * PostgreSQL: PostgreSQL 8.2.4 on x86_64-unknown-openbsd4.2, compiled by GCC cc (GCC) 3.3.5 (propolice) 

Extensions

Parser hooks
ParserFunctions (version 1.1.1)	Enhance parser with logical functions	Tim Starling

Other
ConfirmEdit	Simple captcha implementation	Brion Vibber

Extension functions
wfSetupParserFunctions, ceSetup and efReCaptcha

Parser extension tags
<pre>

Parser function hooks
if, ifeq, switch, ifexist, ifexpr, iferror, expr, time, timel, rel2abs, titleparts, int, ns, urlencode, lcfirst, ucfirst, lc, uc, localurl, localurle, fullurl, fullurle, formatnum, grammar, plural, numberofpages, numberofusers, numberofarticles, numberoffiles, numberofadmins, language, padleft, padright, anchorencode, special, defaultsort and displaytitle

Hooks
Hook name	Subscribed by
AbortLogin	(ReCaptcha, confirmUserLogin)
AbortNewAccount	(ReCaptcha, confirmUserCreate)
EditFilter	(ReCaptcha, confirmEdit)
LanguageGetMagic	wfAddCustomVariableLang and wfParserFunctionsLanguageGetMagic
LoginAuthenticateAudit	(ReCaptcha, triggerUserLogin)
MagicWordMagicWords	wfAddCustomVariable
MagicWordwgVariableIDs	wfAddCustomVariableID
ParserClearState	(ExtParserFunctions, clearState)
ParserGetVariableValueSwitch	wfGetCustomVariable
UserCreateForm	(ReCaptcha, injectUserCreate)
UserLoginForm	(ReCaptcha, injectUserLogin)
Comment 1 Brion Vibber 2008-08-18 17:51:51 UTC

*** This bug has been marked as a duplicate of bug 42 ***
Comment 2 Brion Vibber 2008-08-20 17:23:59 UTC
Un-duping since this is actually a Postgres issue (different search engine backend).
Comment 3 Reed Hedges 2008-08-20 17:58:53 UTC
Looks like Wikipedia uses Extension:MWSearch and Lucene right?  So that would be a workaround for this bug (and bug 42?).  Are there any other search engines available other than Lucene and the basic SQL search?
Comment 4 Greg Sabino Mullane 2008-08-20 18:14:17 UTC
Works fine for me. Is there anything showing up in the Postgres error logs? You could also try running the search commands manually by looking at SearchPostgres.php or changing a query extracted from the Postgres logs. As an aside, you should upgrade to 8.2.9 (it's a quick binary drop-in).

Actually, the best debugging may be to set log_statement = 'all' in your postgresql.conf, HUP, and grab the exact query it's running when you try one of the no-result queries, then run it via psql to see what's going on. Try #postgresql on freenode if that doesn't yield anything obvious.
Comment 5 Reed Hedges 2008-08-20 19:44:34 UTC
I don't see anything that jumps out at me in the log file after setting log_statement = 'all', unless the to_tsquery procedure is doing something weird with the number. The server admin is going to upgrade everything (MediaWiki, PostgreSQL, even new hardware I think) and I'll see what happens then.
Comment 6 Reed Hedges 2008-08-25 19:21:01 UTC
Here's another clue:

I can use ">64" or "<64" in my query and it works (where pages contain the number 64)!  Do ">" or "<" have special meaning in a search?
Comment 7 Reed Hedges 2008-08-25 19:26:30 UTC
By the way, here's an excerpt from the PostgreSQL log when we use log_statement = 'all' and I search for "64 bit"

LOG:  statement: SELECT /* User::loadFromDatabase */  *  FROM mwuser  WHERE user_id = '4'  LIMIT 1 
LOG:  statement: SELECT /* User::loadFromDatabase */  ug_group  FROM user_groups  WHERE ug_user = '4' 
LOG:  statement: SELECT /* MediaWikiBagOStuff::_doquery */ value,exptime FROM objectcache WHERE keyname='wikidb:messages-hash'
LOG:  statement: SELECT /* MediaWikiBagOStuff::_doquery */ value,exptime FROM objectcache WHERE keyname='wikidb:messages'
LOG:  statement: SELECT /* Parser::replaceLinkHolders */ page_id, page_namespace, page_title FROM page WHERE (page_namespace=0 AND page_title IN('64_bit'))
LOG:  statement: SELECT /* Parser::replaceLinkHolders */ page_id, page_namespace, page_title FROM page WHERE (page_namespace=0 AND page_title IN('Contact_Us', 'Categories'))
LOG:  statement: SELECT page_id, page_namespace, page_title, old_text AS page_text, rank(titlevector, to_tsquery('default','+64&+bit')) AS rnk FROM page p, revision r, pagecontent c WHERE p.page_latest = r.rev_id AND r.rev_text_id = c.old_id AND titlevector @@ to_tsquery('default','+64&+bit') AND page_is_redirect = 0 AND page_namespace IN (0) ORDER BY rnk DESC, page_id DESC LIMIT 20  OFFSET 0 
LOG:  statement: SELECT page_id, page_namespace, page_title, old_text AS page_text, rank(titlevector, to_tsquery('default','+64&+bit')) AS rnk FROM page p, revision r, pagecontent c WHERE p.page_latest = r.rev_id AND r.rev_text_id = c.old_id AND textvector @@ to_tsquery('default','+64&+bit') AND page_is_redirect = 0 AND page_namespace IN (0) ORDER BY rnk DESC, page_id DESC LIMIT 20  OFFSET 0 
LOG:  statement: SELECT /* Job::pop */  *  FROM job   ORDER BY job_id LIMIT 1 
LOG:  statement: SELECT /* LinkBatch::doQuery */ page_id, page_namespace, page_title FROM page WHERE (page_namespace=2 AND page_title IN ('ReedHedges')) OR (page_namespace=3 AND page_title IN ('ReedHedges')) OR (page_namespace=0 AND page_title IN ('Search'))
LOG:  statement: SELECT /* User::checkNewtalk */  user_id  FROM user_newtalk  WHERE user_id = '4'  LIMIT 1 


And when I search for ">64 bit":

LOG:  statement: SELECT /* User::loadFromDatabase */  *  FROM mwuser  WHERE user_id = '4'  LIMIT 1 
LOG:  statement: SELECT /* User::loadFromDatabase */  ug_group  FROM user_groups  WHERE ug_user = '4' 
LOG:  statement: SELECT /* MediaWikiBagOStuff::_doquery */ value,exptime FROM objectcache WHERE keyname='wikidb:messages-hash'
LOG:  statement: SELECT /* MediaWikiBagOStuff::_doquery */ value,exptime FROM objectcache WHERE keyname='wikidb:messages'
LOG:  statement: SELECT /* Parser::replaceLinkHolders */ page_id, page_namespace, page_title FROM page WHERE (page_namespace=0 AND page_title IN('Contact_Us', 'Categories'))
LOG:  statement: SELECT page_id, page_namespace, page_title, old_text AS page_text, rank(titlevector, to_tsquery('default','>64&+bit')) AS rnk FROM page p, revision r, pagecontent c WHERE p.page_latest = r.rev_id AND r.rev_text_id = c.old_id AND titlevector @@ to_tsquery('default','>64&+bit') AND page_is_redirect = 0 AND page_namespace IN (0) ORDER BY rnk DESC, page_id DESC LIMIT 20  OFFSET 0 
LOG:  statement: SELECT page_id, page_namespace, page_title, old_text AS page_text, rank(titlevector, to_tsquery('default','>64&+bit')) AS rnk FROM page p, revision r, pagecontent c WHERE p.page_latest = r.rev_id AND r.rev_text_id = c.old_id AND textvector @@ to_tsquery('default','>64&+bit') AND page_is_redirect = 0 AND page_namespace IN (0) ORDER BY rnk DESC, page_id DESC LIMIT 20  OFFSET 0 
LOG:  statement: SELECT /* Revision::fetchRow */  page_namespace,page_title,page_latest,rev_id,rev_page,rev_text_id,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_timestamp,rev_deleted  FROM page,revision  WHERE (rev_id=page_latest) AND (page_id=rev_page) AND page_namespace = '0' AND page_title = 'Installing_Debian_Packages_on_64-bit_OS'  LIMIT 1 
LOG:  statement: SELECT /* Revision::loadText */  old_text,old_flags  FROM pagecontent  WHERE old_id = '1020'  LIMIT 1 
LOG:  statement: SELECT /* Revision::fetchRow */  page_namespace,page_title,page_latest,rev_id,rev_page,rev_text_id,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_timestamp,rev_deleted  FROM page,revision  WHERE (rev_id=page_latest) AND (page_id=rev_page) AND page_namespace = '0' AND page_title = 'Installing_Debian_Packages_on_64-bit_OS'  LIMIT 1 
LOG:  statement: SELECT /* Revision::loadText */  old_text,old_flags  FROM pagecontent  WHERE old_id = '1020'  LIMIT 1 
LOG:  statement: SELECT /* Revision::fetchRow */  page_namespace,page_title,page_latest,rev_id,rev_page,rev_text_id,rev_comment,rev_user_text,rev_user,rev_minor_edit,rev_timestamp,rev_deleted  FROM page,revision  WHERE (rev_id=page_latest) AND (page_id=rev_page) AND page_namespace = '0' AND page_title = 'RB_(SAY)_example_code'  LIMIT 1 
LOG:  statement: SELECT /* Revision::loadText */  old_text,old_flags  FROM pagecontent  WHERE old_id = '430'  LIMIT 1 
LOG:  statement: SELECT /* Job::pop */  *  FROM job   ORDER BY job_id LIMIT 1 
LOG:  statement: SELECT /* LinkBatch::doQuery */ page_id, page_namespace, page_title FROM page WHERE (page_namespace=2 AND page_title IN ('ReedHedges')) OR (page_namespace=3 AND page_title IN ('ReedHedges')) OR (page_namespace=0 AND page_title IN ('Search'))
LOG:  statement: SELECT /* User::checkNewtalk */  user_id  FROM user_newtalk  WHERE user_id = '4'  LIMIT 1 
Comment 8 Reed Hedges 2008-08-25 19:37:12 UTC
Note the different arguments to to_tsquery(): to_tsquery('default','+64&+bit') vs. to_tsquery('default','>64&+bit').
Comment 9 Reed Hedges 2008-09-15 20:00:42 UTC
I upgraded to MediaWiki 1.13.1 and the problem seems to have gone away.  (Curious though what fix between 1.9 and 1.13 did it though, couldn't find anything in the release notes for any of the intervening versions...)
Comment 10 Reed Hedges 2008-12-29 13:25:16 UTC
I am closing this report with WORKSFORME, since it *appears* to be fixed, though it would be good to find out what changed between MediaWiki 1.9 and 1.13.1 that would have fixed this problem, and include it in the release notes/change logs (or clarify the wording there if it is included).

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links