Last modified: 2010-05-15 15:33:07 UTC
Here are the current most significant issues for the Wikimedia database servers, in approximate order of importance. To qualify for this list it must be something I've noticed as a significant issue on the Wikimedia servers. If you're after smallish tasks which can help performance, you'll probably find some here...

1. Making queries at the master instead of a slave when the operation is not an update and it is not essential to have current data (getting the most recent history or watchlist or recent changes is not essential). If it is not an update, it probably doesn't belong on the master. There are a few exceptions, like showing the article history to the person who just edited it, or showing an article from the master after a brief master_pos_wait at the slave still failed to find it at all. Anything but a complete failure should stay at the slave, because the usual slave failure scenario leaves _all_ slaves with out-of-date data, and the master can't handle the whole load of the site: going to the master would take the site down. Tim Starling has been doing some work to help selectivity.

2. Search is using an inefficient query. A more efficient one (which sends data in the query to avoid a join which makes it far, far slower) is in one of the bug reports.

3. Watchlists and any other query which needs the title of an article, because the current schema includes the text in cur. This is being changed, so these aren't worth wondering about now.

4. Use of LIMIT for paging where there may be tens of thousands of results to be skipped to get to the right place. This applies to article histories, user contributions, lists of what is in a category, and any other such place. If an item can have tens of thousands of members, it must have efficient paging of some sort.

5. Not using reporting summaries. The query cache can't cache queries against tables like cur, which are changed regularly, for long, because the first change removes the query from the cache. But many semi-static queries use that live data even when the source is a report whose data is produced in batch runs, such as the allpages special report. If those used a static reporting table for article titles, the query cache could be used, because the table would only be updated periodically and the queries wouldn't be flushed from the cache regularly. For allpages, using the live data is actually a bug, because new entries can be added which cause missing entries once the number of articles in a range exceeds 500. Often those who are not logged in just don't care about the latest data and don't need to be shown, say, the latest list of members of a category, when one from an hour ago is available and could be served from a cachable reporting table.

6. Including fields which are changed far more often than the rest of the record in the main tables. Things like cur_touched in cur or the new patrolled field in watchlist. This decreases the cachability of the data and increases the disk write load, one of the medium-term (6 months+) performance limits. MySQL stores data in rows, so forcing a larger row update than necessary is usually a bad idea. Task-specific tables and joins are good...:)

7. Updating fields in "hot" tables long after the original record was written. The rc_old_id field is an example of this. It increases the lock contention rate for the table. Try to put data updated later into a different table. This also unnecessarily increases the disk write load and decreases the number of records which can be cached.
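The routing rule in item 1 can be sketched as a small helper. This is a minimal illustration of the decision only; the function and parameter names are made up here and are not MediaWiki's actual load-balancer API.

```python
# Sketch of read routing for a replicated database setup.
# Names (choose_connection, get_master, get_slave) are illustrative.

def choose_connection(is_update, needs_current_data, get_master, get_slave):
    """Return a connection: the master only when strictly required.

    Writes and freshness-critical reads (e.g. showing an edit back to
    the editor) go to the master; everything else stays on a possibly
    lagged slave so the master is not overloaded.
    """
    if is_update or needs_current_data:
        return get_master()
    return get_slave()

# e.g. rendering a watchlist for an ordinary page view stays on a slave:
conn = choose_connection(False, False, lambda: "master", lambda: "slave")
```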
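Item 4 is the classic LIMIT/OFFSET problem: the server must walk and discard every skipped row. A "seek" (keyset) query avoids that by remembering the last key shown. The sketch below demonstrates the two forms with SQLite and an invented revision table, not the real MediaWiki schema.

```python
import sqlite3

# Illustrative table: 10,000 revisions of one page.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, page INTEGER)")
conn.executemany("INSERT INTO revision VALUES (?, 1)",
                 [(i,) for i in range(1, 10001)])

# OFFSET paging: the engine walks and discards 9000 rows first.
slow = conn.execute(
    "SELECT rev_id FROM revision WHERE page = 1 "
    "ORDER BY rev_id LIMIT 50 OFFSET 9000").fetchall()

# Keyset paging: start after the last rev_id already shown; the
# primary-key index jumps straight to the right place.
last_seen = 9000
fast = conn.execute(
    "SELECT rev_id FROM revision WHERE page = 1 AND rev_id > ? "
    "ORDER BY rev_id LIMIT 50", (last_seen,)).fetchall()

assert slow == fast  # same page of results, very different cost
```

The trade-off is that "jump to page N" links are harder with keyset paging; next/previous links carrying the boundary key work naturally.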
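Item 5 amounts to serving read traffic from a batch-refreshed table instead of the live one. A sketch, with invented table names standing in for something like the allpages report:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cur (cur_id INTEGER PRIMARY KEY, cur_title TEXT)")
# Static reporting table: readers query only this.
conn.execute("CREATE TABLE allpages_report (cur_title TEXT PRIMARY KEY)")

def refresh_report(conn):
    """Rebuild the report in one periodic batch run. Between runs the
    table never changes, so queries against it stay in the query cache
    instead of being flushed by every edit to cur."""
    conn.execute("DELETE FROM allpages_report")
    conn.execute("INSERT INTO allpages_report "
                 "SELECT cur_title FROM cur ORDER BY cur_title")

conn.executemany("INSERT INTO cur (cur_title) VALUES (?)",
                 [("Apple",), ("Banana",), ("Cherry",)])
refresh_report(conn)
titles = [t for (t,) in conn.execute(
    "SELECT cur_title FROM allpages_report ORDER BY cur_title")]
```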
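Items 6 and 7 both point at the same remedy: move frequently- or late-updated fields into a narrow side table keyed by the same id, so each update rewrites a tiny row and leaves the big row (and its cache entry) alone. A schema sketch with illustrative names; the real cur table had many more columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Instead of keeping a hot timestamp column inside the wide cur row ...
conn.execute("CREATE TABLE cur (cur_id INTEGER PRIMARY KEY, "
             "cur_title TEXT, cur_text TEXT)")

# ... keep it in a narrow task-specific table joined on cur_id.
conn.execute("CREATE TABLE cur_touched (cur_id INTEGER PRIMARY KEY, "
             "touched TEXT)")

conn.execute("INSERT INTO cur VALUES (1, 'Example', 'long article text ...')")
conn.execute("INSERT INTO cur_touched VALUES (1, '20040101000000')")

# A touch now updates only the narrow table; the wide row is untouched.
conn.execute("UPDATE cur_touched SET touched = '20040102000000' "
             "WHERE cur_id = 1")

row = conn.execute(
    "SELECT c.cur_title, t.touched FROM cur c "
    "JOIN cur_touched t ON t.cur_id = c.cur_id").fetchone()
```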
Jamesday, is that one fixed? :o)
Closing this as INVALID. It really seems that this would've been best to post to a mailing list originally. It claims to be a tracking bug but lists no dependencies or blocks. :-) Additionally, at this point, the information here is outdated, inaccurate, and not serving any particular purpose that I can see.