Last modified: 2007-12-16 21:14:51 UTC
When the API enumerates the pages in a given category, it needs a way to resume the query. The sortkey provides a good point from which to continue, but it has one drawback: more than one page may have the same sortkey, which can lead to the following bug scenario:
Assume there are 20 pages in a category, pages 10 and 11 have identical sortkeys, and the user's query requests 10 pages at a time. The sortkey to continue from would be the value of #11, but since it is the same as that of #10, #10 will be returned twice: in both the first and the second result set. This might even result in an infinite loop: requesting one item at a time would reach #10 and never advance to #11.
Solution: sort by sortkey, then by cl_from, and store both the sortkey and cl_from as the starting point for the next request.
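To make the scenario and the proposed fix concrete, here is a minimal in-memory sketch in Python. It is not the API's actual code: the `ROWS` data, `fetch_page`, `enumerate_all`, and the `max_requests` safety cap are all hypothetical names invented for illustration. Rows are `(cl_sortkey, cl_from)` pairs, and pages 10 and 11 are given the same sortkey.

```python
from typing import List, Optional, Tuple

Row = Tuple[str, int]  # (cl_sortkey, cl_from), hypothetical sample data

# 20 pages; page 11 deliberately shares its sortkey with page 10.
ROWS: List[Row] = sorted(
    ("K10" if n == 11 else f"K{n:02d}", n) for n in range(1, 21)
)


def fetch_page(
    rows: List[Row], start: Optional[Row], limit: int, use_cl_from: bool
) -> Tuple[List[Row], Optional[Row]]:
    """Return one page plus the row to continue from (None when done)."""
    if start is None:
        candidates = rows
    elif use_cl_from:
        # Fixed behaviour: resume from the exact (sortkey, cl_from) pair.
        candidates = [r for r in rows if r >= start]
    else:
        # Buggy behaviour: resume from the sortkey alone.
        candidates = [r for r in rows if r[0] >= start[0]]
    page = candidates[:limit]
    next_start = candidates[limit] if len(candidates) > limit else None
    return page, next_start


def enumerate_all(
    rows: List[Row], limit: int, use_cl_from: bool, max_requests: int = 100
) -> List[int]:
    """Collect cl_from values across requests; the cap guards against loops."""
    seen: List[int] = []
    start: Optional[Row] = None
    for _ in range(max_requests):
        page, start = fetch_page(rows, start, limit, use_cl_from)
        seen.extend(cl_from for _, cl_from in page)
        if start is None:
            break
    return seen
```

With `use_cl_from=False` and a limit of 10, page 10 shows up in both result sets; with a limit of 1 the loop keeps returning page 10 until the request cap is hit. With `use_cl_from=True` each page is returned exactly once for either limit.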
To optimize query execution, the cl_sortkey index needs to be modified by appending cl_from as its last column:
ALTER TABLE `wikidb`.`categorylinks`
  DROP INDEX `cl_sortkey`,
  ADD INDEX `cl_sortkey` USING BTREE (`cl_to`, `cl_sortkey`, `cl_from`),
  ENGINE = MyISAM;
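As a rough illustration of how a resume query could use the extended index, here is a sketch that runs against an in-memory SQLite database instead of MySQL. The sample data, the `Example` category name, and the `fetch_page` helper are assumptions made for this example, not the code checked in for this bug.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_sortkey TEXT)"
)
# Same column order as the index proposed above.
conn.execute(
    "CREATE INDEX cl_sortkey ON categorylinks (cl_to, cl_sortkey, cl_from)"
)
# Hypothetical data: 20 pages, pages 10 and 11 share the sortkey 'K10'.
conn.executemany(
    "INSERT INTO categorylinks VALUES (?, ?, ?)",
    [(n, "Example", "K10" if n == 11 else f"K{n:02d}") for n in range(1, 21)],
)


def fetch_page(conn, category, start, limit):
    """Return (rows, next_start); start is None or a (sortkey, cl_from) pair."""
    sql = "SELECT cl_sortkey, cl_from FROM categorylinks WHERE cl_to = ?"
    params = [category]
    if start is not None:
        # Resume at the exact (sortkey, cl_from) position, not just the sortkey.
        sql += " AND (cl_sortkey > ? OR (cl_sortkey = ? AND cl_from >= ?))"
        params += [start[0], start[0], start[1]]
    sql += " ORDER BY cl_sortkey, cl_from LIMIT ?"
    params.append(limit + 1)  # one extra row becomes the continue point
    rows = conn.execute(sql, params).fetchall()
    next_start = rows[limit] if len(rows) > limit else None
    return rows[:limit], next_start
```

Fetching `limit + 1` rows is one simple way to learn the next starting point in the same query; the ordering by `cl_sortkey, cl_from` within a single `cl_to` matches the column order of the index.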
Checked in the schema update in r23016. Server update is pending.
Another check-in, r23228: SQL table scripts.
This seems to contribute to the common http://bugzilla.wikimedia.org/show_bug.cgi?id=4445 bug (index key too long). See my comments there. I've "reopened" this bug but apologise if this was not the correct course of action. Best wishes.
Not relevant, that's a separate issue. Re-resolving.