Special:SupportedLanguages had to be disabled on Wikimedia wikis (https://gerrit.wikimedia.org/r/78660) due to database queries taking over 90,000 seconds. It is not clear which query or why. Perhaps a missing index?
The query would have been of the form:

SELECT user_name, user_editcount, MAX(rev_timestamp) AS lastedit
FROM user, revision
WHERE user_id = rev_user
GROUP BY user_name

MAX() is optimised in various contexts, but I guess this isn't one of them. EXPLAIN indicates that this query scans the entire revision table, which would explain the long query times.

This ungrouped query is efficient:

SELECT MAX(rev_timestamp)
FROM revision, user
WHERE rev_user = user_id AND user_name = 'Tim Starling';

So I guess it is the grouping that stops it from hitting a special-case optimisation of MAX(). The traditional way to retrieve a row from the end of a range, which does not rely on MAX() optimisations, is with ORDER BY and LIMIT:

SELECT rev_timestamp
FROM revision, user
WHERE rev_user = user_id AND user_name = 'Tim Starling'
ORDER BY rev_timestamp DESC
LIMIT 1

But even if that were done, it would still be extremely inefficient and would not work. All the special page wants to do is display statistics about the translators listed on e.g. [[Portal:Fr/translators]] for French translators, and there are no such translator lists on Commons, so actually it is trying to calculate statistics about nobody in order to display an empty table. In order to display that empty table, it is fetching edit count statistics for all 3 million users on the wiki. Even if it managed to get the queries done, the memcached set would fail due to the value size being larger than 1 MB.

Obviously, it should do a single ungrouped query like the one above for each translator, not for every user on the wiki.
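For concreteness, a sketch of that per-translator pattern (the user name is a placeholder, not one taken from a portal page):

-- Run once per translator listed on e.g. [[Portal:Fr/translators]]
SELECT user_editcount
FROM user
WHERE user_name = 'ExampleTranslator';

SELECT MAX(rev_timestamp) AS lastedit
FROM revision, user
WHERE rev_user = user_id
  AND user_name = 'ExampleTranslator';

With only a handful of translators listed per portal, the work is bounded by the number of listed names rather than by the 3 million users on the wiki.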
Change 105736 had a related patch set uploaded by Nikerabbit: Optimize Special:SupportedLanguages https://gerrit.wikimedia.org/r/105736
Even the solution without GROUP BY could be slow for very active users. The index used is probably always revision user_timestamp (rev_user, rev_timestamp). Since rev_timestamp is in second place, MAX() must still do a range scan on the user's portion of the btree.

As we're already proposing to query separately for each user, pulling the user data out first in bulk and removing the join in the individual user queries would be predictable and faster:

SELECT user_id, user_editcount FROM user WHERE user_name IN (...);

foreach $user_id:
    SELECT MAX(rev_timestamp) FROM revision WHERE rev_user = $user_id;

EXPLAIN for that per-user query:

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
        Extra: Select tables optimized away
1 row in set (0.29 sec)

Each user needs a single Handler_read_key hit on the revision user_timestamp (rev_user, rev_timestamp) index regardless of how many revisions they have authored. Worth considering.
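To sanity-check the handler counters for that per-user query, something like this in a MySQL session should do (the literal user_id is a placeholder):

FLUSH STATUS;
SELECT MAX(rev_timestamp) FROM revision WHERE rev_user = 12345;
SHOW SESSION STATUS LIKE 'Handler_read%';
-- With the "Select tables optimized away" plan, this should show a single
-- key read and no long run of Handler_read_next.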
Actually, s/range scan/ref access/, but still potentially thousands of Handler_read_next.
It is actually doing individual queries now, though joining with the user table, as I didn't have user_id available without more refactoring. I'm happy to do more refactoring to make it faster, but would like to avoid building dependent patchsets.
Change 105736 merged by jenkins-bot: Optimize Special:SupportedLanguages https://gerrit.wikimedia.org/r/105736
This is not an enhancement; it's a bug, because it completely prevents the feature from being used on Wikimedia wikis.
On translatewiki.net, the time for a purge went from 60+ s (ending in a timeout) to about 30 s (actually completing). http://www.webpagetest.org/result/140123_M7_FB4/ http://www.webpagetest.org/result/140124_8E_8RV/ Is the performance of the query used on the other wikis (i.e. without translator lists in portals) proportional to this? Maybe to re-enable it we only need to be sure the really-worst case (90,000 s queries) is eliminated.
I was discussing this with Siebrand and we were planning to drop our special case for using portal pages, but currently that seems to throw us over the 60 s limit again, so further work is needed. On the other hand, this is not at the top of my todo list.
*** Bug 59497 has been marked as a duplicate of this bug. ***