Last modified: 2009-02-04 23:21:39 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T19215, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 17215 - API: Fatal error with ucprop=patrolled
API: Fatal error with ucprop=patrolled
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
API (Other open bugs)
unspecified
All All
: Normal normal (vote)
: ---
Assigned To: Roan Kattouw
http://en.wikipedia.org/w/api.php?act...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-01-28 23:06 UTC by Gurch
Modified: 2009-02-04 23:21 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Gurch 2009-01-28 23:06:18 UTC
http://en.wikipedia.org/w/api.php?action=query&list=usercontribs&ucuser=Gurch&ucprop=patrolled

gives

<api>
<error code="internal_api_error_DBQueryError" info="Database query error">

#0 /usr/local/apache/common-local/php-1.5/includes/db/Database.php(591): Database->reportQueryError('The SELECT woul...', 1104, 'SELECT  rev_tim...', 'ApiQueryContrib...', false)
#1 /usr/local/apache/common-local/php-1.5/includes/db/Database.php(1001): Database->query('SELECT  rev_tim...', 'ApiQueryContrib...')
#2 /usr/local/apache/common-local/php-1.5/includes/api/ApiQueryBase.php(222): Database->select(Array, Array, Array, 'ApiQueryContrib...', Array, Array)
#3 /usr/local/apache/common-local/php-1.5/includes/api/ApiQueryUserContributions.php(83): ApiQueryBase->select('ApiQueryContrib...')
#4 /usr/local/apache/common-local/php-1.5/includes/api/ApiQuery.php(213): ApiQueryContributions->execute()
#5 /usr/local/apache/common-local/php-1.5/includes/api/ApiMain.php(427): ApiQuery->execute()
#6 /usr/local/apache/common-local/php-1.5/includes/api/ApiMain.php(260): ApiMain->executeAction()
#7 /usr/local/apache/common-local/php-1.5/includes/api/ApiMain.php(244): ApiMain->executeActionWithErrorHandling()
#8 /usr/local/apache/common-local/php-1.5/api.php(77): ApiMain->execute()
#9 /usr/local/apache/common-local/live-1.5/api.php(3): require('/usr/local/apac...')
#10 {main}

</error>
</api>
Comment 1 Bryan Tong Minh 2009-01-28 23:21:24 UTC
The error 1104 is: ERROR 1104: The SELECT would examine more rows than MAX_JOIN_SIZE.

Obviously a join that is going wrong somewhere.
Comment 2 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-01-28 23:27:53 UTC
What's the query that would be run?
Comment 3 Roan Kattouw 2009-01-29 16:36:58 UTC
(In reply to comment #2)
> What's the query that would be run?
> 

SELECT /* ApiQueryContributions::execute Catrope */  rev_timestamp,page_namespace,page_title,rev_user_text,rc_patrolled
FROM `page`,`revision` FORCE INDEX (usertext_timestamp)
LEFT JOIN `recentchanges` ON ((rc_this_oldid=rev_id))
WHERE (page_id=rev_page)
AND rev_deleted = '0'
AND rev_user_text = 'Gurch'
ORDER BY rev_user_text DESC, rev_timestamp DESC
LIMIT 11

I personally don't see how this produces huge joins, as we're joining on unique indices, but then I've been known to be completely wrong about DB-related stuff before :D
Comment 4 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-01-29 16:50:03 UTC
This is the explain I get on the toolserver:

mysql> EXPLAIN SELECT   rev_timestamp,page_namespace,page_title,rev_user_text,rc_patrolled FROM `page`,`revision` LEFT JOIN `recentchanges` ON ((rc_this_oldid=rev_id)) WHERE (page_id=rev_page) AND rev_deleted = '0' AND rev_user_text = 'Gurch' ORDER BY rev_user_text DESC, rev_timestamp DESC LIMIT 11;
+----+-------------+---------------+--------+-------------------------------------------+--------------------+---------+--------------------------+---------+----------------------------------------------+
| id | select_type | table         | type   | possible_keys                             | key                | key_len | ref                      | rows    | Extra                                        |
+----+-------------+---------------+--------+-------------------------------------------+--------------------+---------+--------------------------+---------+----------------------------------------------+
|  1 | SIMPLE      | revision      | ref    | PRIMARY,page_timestamp,usertext_timestamp | usertext_timestamp | 257     | const                    |  187372 | Using where; Using temporary; Using filesort | 
|  1 | SIMPLE      | recentchanges | ALL    | NULL                                      | NULL               | NULL    | NULL                     | 6666524 |                                              | 
|  1 | SIMPLE      | page          | eq_ref | PRIMARY                                   | PRIMARY            | 4       | enwiki.revision.rev_page |       1 |                                              | 
+----+-------------+---------------+--------+-------------------------------------------+--------------------+---------+--------------------------+---------+----------------------------------------------+
3 rows in set (0.01 sec)

Note join type ALL, key NULL.  I only have views on the toolserver, not the actual tables, so I can't try forcing rc_patrolling as the index to use for recentchanges there (nor can I use the FORCE INDEX that was already in the query).  When I run it on localhost, it chooses rc_patrolling for the recentchanges join.
Comment 5 Roan Kattouw 2009-01-29 17:33:00 UTC
(In reply to comment #4)
> Note join type ALL, key NULL.  I only have views on the toolserver, not the
> actual tables, so I can't try forcing rc_patrolling as the index to use for
> recentchanges there (nor can I use the FORCE INDEX that was already in the
> query).
Why not?

> When I run it on localhost, it chooses rc_patrolling for the
> recentchanges join.
> 

For me too, except that on one of my two computers, I don't *have* an rc_patrolling index, while on the other one I do. The index also isn't listed in tables.sql, and there's no updater patch that adds it. Where did my index go? :O :D

Back on topic: maybe adding FORCE INDEX(rc_patrolling) will fix this, provided that index is present on the WMF DB servers. You also seem to indicate that the effect of this FORCE INDEX can't be tested on the toolserver for some reason.
Comment 6 Roan Kattouw 2009-01-29 17:40:56 UTC
(In reply to comment #5)
> The index also isn't listed
> in tables.sql, and there's no updater patch that adds it. Where did my index
> go? :O :D
> 
Deeper research (thanks to TortoiseSVN's search facilities) turned up r25527 in which Domas removed the rc_patrolling index (added by Greg in r24699) for no apparent reason and without adding a remover in updaters.inc .
Comment 7 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-01-29 17:44:23 UTC
(In reply to comment #5)
> Why not?

FORCE INDEX doesn't work on views, and non-root toolserver users only have access to views.

> For me too, except that on one of my two computers, I don't *have* an
> rc_patrolling index, while on the other one I do. The index also isn't listed
> in tables.sql, and there's no updater patch that adds it. Where did my index
> go? :O :D

Um, yeah, that would explain it.  See r25527.  The index doesn't exist.  I've dropped it from localhost so I don't get confused in the future.  I don't see any good way to get pages' patrolled status, as long as it's only in RC -- why isn't it in revision as well, again?  Grr.

(simulpost)

(In reply to comment #6)
> Deeper research (thanks to TortoiseSVN's search facilities) turned up r25527 in
> which Domas removed the rc_patrolling index (added by Greg in r24699) for no
> apparent reason and without adding a remover in updaters.inc .

The reason was almost certainly just because having unneeded indexes is slow.  Is there any compelling justification for keeping this?  What actually would need to use it other than some API queries?
Comment 8 Roan Kattouw 2009-01-29 17:52:25 UTC
(In reply to comment #7)
> The reason was almost certainly just because having unneeded indexes is slow. 
> Is there any compelling justification for keeping this?  What actually would
> need to use it other than some API queries?
> 

It makes it possible to find the recentchanges entries for a set of revisions/revids (either with a join on the revision table or with a WHERE clause), which is generic enough to warrant an index IMO, even if it's just an index on rc_this_oldid.
Comment 9 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-01-29 17:54:35 UTC
That's not an access pattern RC is supposed to support, though.  It's really just supposed to summarize other tables.  If you have the revision id's already, all relevant info should be in other tables like revision and page.  The problem is, for some reason rc_patrolled is not.
Comment 10 Roan Kattouw 2009-01-29 17:55:37 UTC
(In reply to comment #9)
> That's not an access pattern RC is supposed to support, though.  It's really
> just supposed to summarize other tables.  If you have the revision id's
> already, all relevant info should be in other tables like revision and page. 
> The problem is, for some reason rc_patrolled is not.
> 

Exactly, same story with the bot flag. So we either need to move/copy the patrolled flag elsewhere, or add that index.
Comment 11 Gurch 2009-02-04 13:03:40 UTC
(In reply to comment #10)
> Exactly, same story with the bot flag. So we either need to move/copy the
> patrolled flag elsewhere, or add that index.

In the meantime would it be wise to remove 'patrolled' as an option for ucprop and/or handle this error more gracefully, since it's unlikely to start working again by iteslf?
Comment 12 Roan Kattouw 2009-02-04 13:16:41 UTC
(In reply to comment #11)
> In the meantime would it be wise to remove 'patrolled' as an option for ucprop
> and/or handle this error more gracefully, since it's unlikely to start working
> again by iteslf?
> 

Temporarily disabled in r46795
Comment 13 Roan Kattouw 2009-02-04 23:21:39 UTC
Re-enabled in r46825 with a smart JOIN on the timestamp first (indexed) and the revid second (not indexed, but usually a no-op or a very small set since two revs with the same timestamp by the same user are pretty rare). Thanks to Tim for the tip.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links