Last modified: 2013-06-05 07:08:55 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T3138, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 1138 - ipblocks table ipb_timestamp index added; locking of whole table for update
ipblocks table ipb_timestamp index added; locking of whole table for update
Product: MediaWiki
Classification: Unclassified
General/Unknown (Other open bugs)
All All
: Normal normal with 1 vote (vote)
: ---
Assigned To: Nicolas Weeger
: patch
Depends on:
  Show dependency treegraph
Reported: 2004-12-20 20:41 UTC by Jamesday
Modified: 2013-06-05 07:08 UTC (History)
2 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---

Change block check behaviour (7.94 KB, patch)
2005-01-24 22:34 UTC, Nicolas Weeger

Description Jamesday 2004-12-20 20:41:50 UTC
After seeing a large number of queries for all records in ipblocks with order
ipb_timestamp desc I today added the index ipb_timestamp (ipb_timestamp desc) to
all databases on the Wikimedia servers. This changes the query from one using no
index and a filesort to get the order to one using the new index. The queries
were happening for both 1.4 and pre-1.4 projects. Block::enumBans is the query
comment on the 1.4 wikis. Seems likely to be of assistance for other mediawiki

In addition, the ipblocks in 1.4 showed a FOR UPDATE query which wanted every
record in the table. Is it really necessary to get an exclusive read and write
lock on the whole table? Maybe do it in ipb_address like '1%' then like '2%' and
so on to like '9%' chunks, so most lock checks can continue while some are being
updated? Perhaps only after a check without for update shows more than 50 blocks
Comment 1 Jamesday 2005-01-18 10:17:01 UTC
This one matters because stupid crawlers read the edit page and can stress the
master. Should ideally ask a slave when loading the edit page, with no lock
needed, then check the master only when the edit is being saved.

Occasionally this change will let a block last too long (the slave check, if the
slave is lagged) or let someone who has just been blocked type in an edit and
not save it, but those costs seem reasonable enough to avoid hurting the master.
Comment 2 Jamesday 2005-01-18 10:40:43 UTC
For Ryo, desired behavior is:

When fetching edit page:
1. check memcached to see if the user is shown as blocked. If not blocked, give
them the edit page.

When saving the edit:
1. Check memcached to see if they are blocked now
2. if not blocked, check a slave to see if they are blocked
3. if still not bloocked, check the master to see if blocked there, and save the
edit if not blocked.

This way a crawler which loads the edit page will only cause a check of
memcached, but we'll stll get accurate block results to prevent the edit from
being saved if memcached happens to be out of date becaue of the race condition
Tim mentioned. We check the slave before the master because it's much cheaper to
check slaves and save the master work - it's easier to add more slave power than
master power.

Memcached can be loaded from a slave because we won't care if it is a little bit
out of date - we'll catch that when the user tries to edit.
Comment 3 Nicolas Weeger 2005-01-21 09:18:59 UTC
After discussion with Tim, here's the behaviour i'll implement:
* when editing/previewing a page, check with slave
* when saving the page, check with master

So no more memcache checking.

I'd also suggest getting rid of block checking when simply previewing the page -
may be less user-friendly to learn you're blocked only when saving, though.
Comment 4 Nicolas Weeger 2005-01-24 22:34:41 UTC
Created attachment 220 [details]
Change block check behaviour

Modified BlockCache, EditPage and User.

The behaviour is now:
* when previewing a page, don't check the blocked status.
* when requesting edition page (click on edit link), check blocked status
against slave.
* when saving the page, check blocked status against first slave, then if not
blocked against master.

Memcache is not used anymore.
Comment 5 Zigger 2005-07-10 13:21:05 UTC
This was added to CVS in February 2005 by 'ryo_saeba', and is in 1.5.
Comment 6 Tim Starling 2013-06-05 07:08:55 UTC
For the record, the relevant performance optimisation was accidentally removed in r24562 (August 2007). Since then, edit page requests have checked blocked status against the master DB, with no memcached. We have indexes now, and the FOR UPDATE clause was removed, so there was no disaster. But it is the third most common master query according to WMF profiling, if you don't count BEGIN.

Note You need to log in before you can comment on or make changes to this bug.