Last modified: 2010-05-15 15:33:00 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T3481, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 1481 - QueryPage::doFeed SELECT 'Newpages' as type ...
QueryPage::doFeed SELECT 'Newpages' as type ...
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Special pages (Other open bugs)
1.4.x
All All
: Normal normal with 1 vote (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks: feeds
  Show dependency treegraph
 
Reported: 2005-02-06 20:13 UTC by Jamesday
Modified: 2010-05-15 15:33 UTC (History)
1 user (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Jamesday 2005-02-06 20:13:18 UTC
This uses an ORDER BY and LIMIT combination which forces 
retrieval of every record which matches the where before the 
LIMIT can be used. Changing ORDER BY value DESC to ORDER BY 
rc_timestamp DESC uses the index and causes only the number 
of records needed to reach the limit count to be retrieved, 
much more efficient. Run time difference is from 10 or more 
seconds now to about 0.01 seconds with the change.

If there's objection to using timestamp when there's a 
chance that timestamp may be duplicated, use rc_timestamp, 
rc_cur_id and add rc_cur_id to the new_name_timestamp index 
and I'll change the index on the live Wikimedia sites. For 
unchanged sites it'll be at least no worse and may still be 
faster for them.

The piece to watch for in what follows is "using filesort". 
That's the giveaway that the index isn't being used for the 
limit.

Current query:

 EXPLAIN /* QueryPage::doFeed */ SELECT 'Newpages' as type, 
rc_namespace AS namespace, rc_title AS title, rc_cur_id AS 
value, rc_user AS user, rc_user_text AS user_text, 
rc_comment as comment, rc_timestamp AS timestamp, '0' as 
usepatrol, rc_patrolled AS patrolled, rc_id AS rcid, length
(cur_text) as length, cur_text as text FROM 
`recentchanges`,`cur` WHERE rc_cur_id=cur_id AND rc_new=1 
AND rc_namespace=0 AND cur_is_redirect=0 ORDER BY value DESC 
LIMIT 0,50\G
*************************** 1. row 
***************************
        table: recentchanges
         type: ref
possible_keys: 
rc_namespace_title,rc_cur_id,new_name_timestamp
          key: new_name_timestamp
      key_len: 2
          ref: const,const
         rows: 15962
        Extra: Using where; Using filesort
*************************** 2. row 
***************************
        table: cur
         type: eq_ref
possible_keys: cur_id
          key: cur_id
      key_len: 4
          ref: recentchanges.rc_cur_id
         rows: 1
        Extra: Using where

Suggested change:

EXPLAIN /* QueryPage::doFeed */ SELECT 'Newpages' as type, 
rc_namespace AS namespace, rc_title AS title, rc_cur_id AS 
value, rc_user AS user, rc_user_text AS user_text, 
rc_comment as comment, rc_timestamp AS timestamp, '0' as 
usepatrol, rc_patrolled AS patrolled, rc_id AS rcid, length
(cur_text) as length, cur_text as text FROM 
`recentchanges`,`cur` WHERE rc_cur_id=cur_id AND rc_new=1 
AND rc_namespace=0 AND cur_is_redirect=0 ORDER BY 
rc_timestamp DESC LIMIT 0,50\G
*************************** 1. row 
***************************
        table: recentchanges
         type: ref
possible_keys: 
rc_namespace_title,rc_cur_id,new_name_timestamp
          key: new_name_timestamp
      key_len: 2
          ref: const,const
         rows: 15984
        Extra: Using where
*************************** 2. row 
***************************
        table: cur
         type: eq_ref
possible_keys: cur_id
          key: cur_id
      key_len: 4
          ref: recentchanges.rc_cur_id
         rows: 1
        Extra: Using where

And adding the rc_cur_id after the timestamp shows it back 
to using a fielsort with the current index but it won't if 
rc_cur_id is added to it:

EXPLAIN /* QueryPage::doFeed */ SELECT 'Newpages' as type, 
rc_namespace AS namespace, rc_title AS title, rc_cur_id AS 
value, rc_user AS user, rc_user_text AS user_text, 
rc_comment as comment, rc_timestamp AS timestamp, '0' as 
usepatrol, rc_patrolled AS patrolled, rc_id AS rcid, length
(cur_text) as length, cur_text as text FROM 
`recentchanges`,`cur` WHERE rc_cur_id=cur_id AND rc_new=1 
AND rc_namespace=0 AND cur_is_redirect=0 ORDER BY 
rc_timestamp, rc_cur_id DESC LIMIT 0,50\G
*************************** 1. row 
***************************
        table: recentchanges
         type: ref
possible_keys: 
rc_namespace_title,rc_cur_id,new_name_timestamp
          key: new_name_timestamp
      key_len: 2
          ref: const,const
         rows: 16950
        Extra: Using where; Using filesort
*************************** 2. row 
***************************
        table: cur
         type: eq_ref
possible_keys: cur_id
          key: cur_id
      key_len: 4
          ref: recentchanges.rc_cur_id
         rows: 1
        Extra: Using where

The changed query had a run time of 0.01 seconds. Making the 
original query immediately after that (still with caching 
benefit from the fast form) took 5.91 seconds. I noticed the 
query when I saw this in mytop:

 7414082  wikiuser   ialrazi:36569     enwiki        10  
Query /* QueryPage::doFeed */ SELECT 'Newpages' as type, 
rc_namespace AS namespace, rc_title AS title, rc_cur_id AS 
val
Comment 1 Aaron Schulz 2008-09-16 11:24:24 UTC
Uses timestamp paging now (for a while)

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links