Last modified: 2010-05-15 15:33:00 UTC
This uses an ORDER BY and LIMIT combination which forces retrieval of every record which matches the where before the LIMIT can be used. Changing ORDER BY value DESC to ORDER BY rc_timestamp DESC uses the index and causes only the number of records needed to reach the limit count to be retrieved, much more efficient. Run time difference is from 10 or more seconds now to about 0.01 seconds with the change. If there's objection to using timestamp when there's a chance that timestamp may be duplicated, use rc_timestamp, rc_cur_id and add rc_cur_id to the new_name_timestamp index and I'll change the index on the live Wikimedia sites. For unchanged sites it'll be at least no worse and may still be faster for them. The piece to watch for in what follows is "using filesort". That's the giveaway that the index isn't being used for the limit. Current query: EXPLAIN /* QueryPage::doFeed */ SELECT 'Newpages' as type, rc_namespace AS namespace, rc_title AS title, rc_cur_id AS value, rc_user AS user, rc_user_text AS user_text, rc_comment as comment, rc_timestamp AS timestamp, '0' as usepatrol, rc_patrolled AS patrolled, rc_id AS rcid, length (cur_text) as length, cur_text as text FROM `recentchanges`,`cur` WHERE rc_cur_id=cur_id AND rc_new=1 AND rc_namespace=0 AND cur_is_redirect=0 ORDER BY value DESC LIMIT 0,50\G *************************** 1. row *************************** table: recentchanges type: ref possible_keys: rc_namespace_title,rc_cur_id,new_name_timestamp key: new_name_timestamp key_len: 2 ref: const,const rows: 15962 Extra: Using where; Using filesort *************************** 2. row *************************** table: cur type: eq_ref possible_keys: cur_id key: cur_id key_len: 4 ref: recentchanges.rc_cur_id rows: 1 Extra: Using where Suggested change: EXPLAIN /* QueryPage::doFeed */ SELECT 'Newpages' as type, rc_namespace AS namespace, rc_title AS title, rc_cur_id AS value, rc_user AS user, rc_user_text AS user_text, rc_comment as comment, rc_timestamp AS timestamp, '0' as usepatrol, rc_patrolled AS patrolled, rc_id AS rcid, length (cur_text) as length, cur_text as text FROM `recentchanges`,`cur` WHERE rc_cur_id=cur_id AND rc_new=1 AND rc_namespace=0 AND cur_is_redirect=0 ORDER BY rc_timestamp DESC LIMIT 0,50\G *************************** 1. row *************************** table: recentchanges type: ref possible_keys: rc_namespace_title,rc_cur_id,new_name_timestamp key: new_name_timestamp key_len: 2 ref: const,const rows: 15984 Extra: Using where *************************** 2. row *************************** table: cur type: eq_ref possible_keys: cur_id key: cur_id key_len: 4 ref: recentchanges.rc_cur_id rows: 1 Extra: Using where And adding the rc_cur_id after the timestamp shows it back to using a fielsort with the current index but it won't if rc_cur_id is added to it: EXPLAIN /* QueryPage::doFeed */ SELECT 'Newpages' as type, rc_namespace AS namespace, rc_title AS title, rc_cur_id AS value, rc_user AS user, rc_user_text AS user_text, rc_comment as comment, rc_timestamp AS timestamp, '0' as usepatrol, rc_patrolled AS patrolled, rc_id AS rcid, length (cur_text) as length, cur_text as text FROM `recentchanges`,`cur` WHERE rc_cur_id=cur_id AND rc_new=1 AND rc_namespace=0 AND cur_is_redirect=0 ORDER BY rc_timestamp, rc_cur_id DESC LIMIT 0,50\G *************************** 1. row *************************** table: recentchanges type: ref possible_keys: rc_namespace_title,rc_cur_id,new_name_timestamp key: new_name_timestamp key_len: 2 ref: const,const rows: 16950 Extra: Using where; Using filesort *************************** 2. row *************************** table: cur type: eq_ref possible_keys: cur_id key: cur_id key_len: 4 ref: recentchanges.rc_cur_id rows: 1 Extra: Using where The changed query had a run time of 0.01 seconds. Making the original query immediately after that (still with caching benefit from the fast form) took 5.91 seconds. I noticed the query when I saw this in mytop: 7414082 wikiuser ialrazi:36569 enwiki 10 Query /* QueryPage::doFeed */ SELECT 'Newpages' as type, rc_namespace AS namespace, rc_title AS title, rc_cur_id AS val
Uses timestamp paging now (for a while)