Last modified: 2011-03-13 18:05:45 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T18664, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 16664 - Updating searchindex locks up database
Updating searchindex locks up database
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Database (Other open bugs)
unspecified
All All
: Lowest major (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2008-12-16 04:00 UTC by Sean Colombo
Modified: 2011-03-13 18:05 UTC (History)
1 user (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Sean Colombo 2008-12-16 04:00:08 UTC
Recently, on our fairly high-traffic wiki, the searchindex table is getting locked during updating for long enough (and enough of the queries are on this table) that the whole system comes to a halt.

There were no bug-reports or relevant google results or anyone who seemed to recognize the problem in IRC, so I'll attempt to give what little info I have:

The configuration I'm running is a squid server in front of two apache servers which connect to a master and slave database (each on separate boxes).  There are about 800,000 rows in wiki_page and the total database is around 9.5gigs (if that matters?).  When the problem arises, "SHOW PROCESSLIST" on the master database shows hundreds of rows, most of which are "Locked" and waiting to update or read from the searchindex table.

Here is an excerpt from "SHOW PROCESSLIST":
<pre>
| 28465390 | username    | pedlfaster.pedlr.com:45909 | lyricwiki | Query       |   281 | Locked                                                         | SELECT /*  70.179.32.165 */ page_id, page_namespace, page_title FROM `wiki_page`,`wiki_searchindex`  | 
| 28465985 | username    | pedlfaster.pedlr.com:46128 | lyricwiki | Query       |   269 | Locked                                                         | SELECT /*  189.242.7.203 */ page_id, page_namespace, page_title FROM `wiki_page`,`wiki_searchindex`  | 
| 28466171 | username    | pedlfaster.pedlr.com:46205 | lyricwiki | Query       |   196 | Locked                                                         | REPLACE /* SearchMySQL::update AUserOfTheSite */ INTO `wiki_searchindex` (si_page,si_title,si_text) VA | 
| 28466541 | username    | pedlfaster.pedlr.com:46298 | lyricwiki | Query       |   254 | Locked                                                         | SELECT /*  189.242.7.203 */ page_id, page_namespace, page_title FROM `wiki_page`,`wiki_searchindex`  | 
| 28467720 | username    | pedlfaster.pedlr.com:46554 | lyricwiki | Query       |   229 | Locked                                                         | SELECT /*  190.47.145.47 */ page_id, page_namespace, page_title FROM `wiki_page`,`wiki_searchindex`  | 
| 28467861 | username    | pedlfaster.pedlr.com:46589 | lyricwiki | Query       |   226 | Locked                                                         | SELECT /*  66.53.228.221 */ page_id, page_namespace, page_title FROM `wiki_page`,`wiki_searchindex`  | 
| 28467905 | username    | pedlfaster.pedlr.com:46612 | lyricwiki | Query       |   225 | Locked                                                         | SELECT /*  82.34.209.240 */ page_id, page_namespace, page_title FROM `wiki_page`,`wiki_searchindex`  | 
</pre>

At the time that the "SHOW PROCESSLIST" was issued, there were 303 rows returned (all but about 7 appeared to be waiting on the searchindex table) and the site was unaccessable from either Apache or through the Squid (Squid was returning the message along the lines of "this page can't be accessed").

I know this isn't much to go on.  I would be happy to give more information if there are questions.
Comment 1 Brion Vibber 2008-12-18 19:12:20 UTC
Indeed, this sort of thing is why we don't use the MySQL search backend on high-traffic sites like Wikipedia.

Consider an alternate search backend such as the Lucene (MWSearch) or Sphinx search extensions.

Alternately, you can disable online updates and run the index updates in a batch during off-peak hours.
Comment 2 Sean Colombo 2009-01-01 19:46:21 UTC
Thanks for the info.  We'll switch to Lucene.

Since the slowness is just inherent in the way it is done by default (& it's a reasonable default) and that special-cases need to either do the batch update or switch to custom search engines it sounds like a "WONTFIX" is appropriate here.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links