Last modified: 2008-10-24 22:35:31 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T18076, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 16076 - Multiple ranges can be specified in backlinks query, also implicit equality propagation used
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: API
Version: 1.14.x
Hardware: All All
Importance: Normal major
Target Milestone: ---
Assigned To: Roan Kattouw
URL: http://p.defau.lt/?jn_bId8zq_f3pcb3ug...
Reported: 2008-10-23 15:32 UTC by Domas Mituzas
Modified: 2008-10-24 22:35 UTC (History)


Description Domas Mituzas 2008-10-23 15:32:52 UTC
First of all, page_id is not in the pagelinks table index; pl_from should be used as the predicate (I assume people enjoyed the 5.0 behavior here ;-)

Also, if multiple pl_namespace/pl_title values are specified, it should not be possible to limit by pl_from/page_id: the index read is split into multiple reads at the pl_namespace/pl_title level, so an efficient range read from the index is not possible.
Comment 1 Roan Kattouw 2008-10-23 15:48:09 UTC
Could you be a little more clear in your suggestions for improvement? From the first paragraph I deduced that page_id>=123 should be pl_from>=123, which I assume is what you mean by "implicit equality propagation" in the summary.

As to the second paragraph: does this mean that if multiple values for (pl_namespace, pl_title) are queried, the page_id>=123 (or pl_from>=123) clause shouldn't be there?
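
A minimal sketch of the first point, for readers following along. It uses SQLite from Python as a stand-in for MySQL, so it only shows that the two predicates select the same rows; it says nothing about how MySQL's optimizer would execute them. The simplified table layout, the index name pl_target_idx, and the sample rows are assumptions made for illustration, not the real MediaWiki schema.

```python
import sqlite3

# In-memory database with a simplified, hypothetical version of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
    CREATE TABLE pagelinks (pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT);
    -- Assumed index shape: the link target first, then the linking page.
    CREATE INDEX pl_target_idx ON pagelinks (pl_namespace, pl_title, pl_from);
""")
conn.executemany(
    "INSERT INTO page VALUES (?, ?)",
    [(1, "P1"), (2, "P2"), (3, "P3"), (4, "P4")],
)
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?, ?)",
    [(1, 0, "B"), (2, 0, "B"), (3, 0, "B"), (4, 0, "C")],
)

QUERY = """
    SELECT pl_from FROM pagelinks JOIN page ON pl_from = page_id
    WHERE pl_namespace = 0 AND pl_title = 'B' AND {column} >= ?
    ORDER BY pl_from
"""

def backlinks(column, start):
    """Return linking page IDs, range-limited on the given column."""
    if column not in ("page_id", "pl_from"):
        raise ValueError("unexpected column name")
    sql = QUERY.format(column=column)
    return [row[0] for row in conn.execute(sql, (start,))]

# Because the join condition is pl_from = page_id, both predicates
# select the same rows; only pl_from is part of the pagelinks index.
via_page_id = backlinks("page_id", 2)
via_pl_from = backlinks("pl_from", 2)
print(via_page_id, via_pl_from)
```

Since the join forces pl_from and page_id to be equal, writing the range condition on pl_from loses nothing and names a column that the pagelinks index actually contains.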
Comment 2 Roan Kattouw 2008-10-24 22:35:31 UTC
Fixed in r42494 and r42512.

For your entertainment: it turns out the filesort that happened when pl_from>=123 was set and multiple (pl_namespace, pl_title) pairs were queried actually pointed me to a more fundamental bug, which involved list=backlinks&blredirect dropping results under certain conditions. The lesson here is that because we have sane indices on the pagelinks table, non-indexed (and therefore inefficient) queries are usually buggy. I never expected this database performance hell to actually fix my bugs for me...

About the "certain conditions": assume B and C are redirects to A, D and E link to B and F links to C. Also, E's pageID is larger than D's, while F's is smaller than D's. If bllimit is set in such a way that the result is cut off after D (i.e. D is the last result), the continued query (with query-continue) will list E but not F, while F should be listed. I know this is complex like hell and it took me about 20 minutes to come up with this example. The moral of this story is that you should always build your queries around indices and never throw in unindexed stuff unless you really know what it does.
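
The scenario above can be modelled in a few lines of Python. This is a toy model only, not the MediaWiki code fixed in r42494 and r42512: the page IDs (F=1, D=2, E=3), the row ordering by (redirect title, pl_from), and both continuation functions are assumptions chosen to reproduce the described symptom.

```python
# Hypothetical page IDs matching the example: F < D < E.
PAGE_IDS = {"D": 2, "E": 3, "F": 1}
NAMES = {page_id: name for name, page_id in PAGE_IDS.items()}

# One row per link as (redirect title, pl_from), sorted the way an index
# on (pl_title, pl_from) would return them: D and E under B, then F under C.
ROWS = sorted([
    ("B", PAGE_IDS["D"]),
    ("B", PAGE_IDS["E"]),
    ("C", PAGE_IDS["F"]),
])

def first_batch(limit):
    """Return the first `limit` rows, as a bllimit-style cutoff would."""
    return ROWS[:limit]

def continue_by_pl_from_only(last_row):
    """Flawed continuation: filters on pl_from alone, across all titles."""
    _, last_from = last_row
    return [row for row in ROWS if row[1] > last_from]

def continue_by_full_key(last_row):
    """Continuation on the whole sort key (title, pl_from)."""
    return [row for row in ROWS if row > last_row]

def names(rows):
    """Translate rows back to the page names used in the example."""
    return [NAMES[pl_from] for _, pl_from in rows]

cutoff = first_batch(1)[-1]  # the result is cut off after D
print(names(continue_by_pl_from_only(cutoff)))  # E only; F is dropped
print(names(continue_by_full_key(cutoff)))      # E and F
```

In this model F is lost because its page ID is smaller than D's, so a condition on pl_from alone filters it out even though it sorts after D in the result order.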
