Last modified: 2012-04-16 09:15:47 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T30020, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 28020 - Category sort key (cl_sortkey_prefix) set to wrong value when a page is moved and stays that way
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Categories
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: 27339
Reported: 2011-03-13 05:47 UTC by Gustronico
Modified: 2012-04-16 09:15 UTC (History)

Description Gustronico 2011-03-13 05:47:07 UTC
Please excuse me for my bad English. Feel free to re-write this entry.
On eswiki we have a maintenance category that uses timestamps as sortkeys, so that articles with older timestamps appear first. It worked fine until a few days ago, when some changes were made (see this comment by Bawolff [https://bugzilla.wikimedia.org/show_bug.cgi?id=4912#c19]). Since then, we have experienced two different issues:

1) Articles that have been moved during the last three days are completely mis-sorted: they appear at the very end of the category, under their initial letters. It looks like their timestamp sortkeys have been lost and only their page names are used.

2) This issue can be easily fixed, but I'll describe it anyway, FYI:
Articles with 8-digit timestamps YYYYMMDD (usually tagged manually by users) and those with full 14-digit timestamps YYYYMMDDHHMMSS (usually added by bots) are no longer sorted correctly relative to each other. In addition, we created some "label" pages to mark month and year boundaries, keyed with 6-digit and 4-digit timestamps. Right now, these labels appear at the end of their periods rather than at the beginning. I plan to fix this issue by right-padding all timestamps with zeros up to 14 digits.

3) Yes, a third issue appeared: pages with purely numeric page names are also mis-sorted. Our "label" pages that mark years are titled Wikipedia:2007, Wikipedia:2008 and so on, and are manually given the sortkeys 2007, 2008 and so on. These pages seem to suffer from another kind of sort error: they all appear together, after all timestamped pages but before the moved ones. The issue disappears when a letter (not a digit) is added to the sortkey, e.g. 2007x.
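
The zero-padding planned in issue 2 can be sketched as follows. This is only an illustration: the helper name pad_timestamp is hypothetical, and the comparison shown is plain string ordering in Python, which is not necessarily how the wiki collates sortkeys. The point is that once every key has the same 14-digit width, string order and chronological order coincide, and a year or month label sorts at the start of its period.

```python
def pad_timestamp(ts: str, width: int = 14) -> str:
    """Right-pad a digit-only timestamp with zeros to a fixed width.

    Hypothetical helper for illustration; not part of MediaWiki.
    """
    if not ts.isdigit() or len(ts) > width:
        raise ValueError(f"not a timestamp of at most {width} digits: {ts!r}")
    return ts.ljust(width, "0")


# A bot timestamp (14 digits), a month label (6), a manual tag (8), a year label (4).
keys = ["20110308153000", "201103", "20110308", "2011"]

# After padding, every key is 14 digits wide, so sorting the strings
# sorts them chronologically, with the labels first in their period.
padded = sorted(pad_timestamp(k) for k in keys)
for key in padded:
    print(key)
```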
Comment 1 Bawolff (Brian Wolff) 2011-03-13 05:50:04 UTC
Which category is this?
Comment 3 Bawolff (Brian Wolff) 2011-03-13 06:52:34 UTC
Hmm, I'm fairly confused about what's going on. For some reason it's not recognizing the custom sortkey [[Categoría:Wikipedia:Wikificar| 20110308]] correctly ( http://es.wikipedia.org/w/api.php?action=query&titles=H%C3%A9ctor%20Palomares%20Medina&prop=categories&clprop=sortkey seems to indicate a {{defaultsortkey:}} is overriding it, but defaultsortkey shouldn't override a sortkey specified per category, and the other examples in http://es.wikipedia.org/w/api.php?action=query&list=categorymembers&cmtitle=category:Wikipedia:Wikificar&cmsort=sortkey&cmprop=sortkey&cmstartsortkey=%203 aren't being overridden by a {{DEFAULTSORT}}).

However, the parser seems to extract it fine ( http://es.wikipedia.org/w/api.php?action=parse&page=H%C3%A9ctor%20Palomares%20Medina&prop=categories ), and null edits/purges don't fix it.

So for some reason it's just throwing out the custom sortkeys for no apparent reason whatsoever.
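
For anyone who wants to repeat these checks on other pages, here is a minimal sketch that builds the same two api.php query URLs. The function names are hypothetical; the endpoint and the parameters (prop=categories, clprop=sortkey, list=categorymembers, cmsort, cmprop, cmstartsortkey) are the ones used in the links above. The sketch only constructs URLs and does not contact the wiki.

```python
from urllib.parse import urlencode

# Endpoint taken from the links in this comment.
API = "http://es.wikipedia.org/w/api.php"


def page_sortkeys_url(title: str) -> str:
    """URL listing a page's categories together with their stored sortkeys."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "categories",
        "clprop": "sortkey",
    }
    return f"{API}?{urlencode(params)}"


def category_members_url(category: str, start_sortkey: str) -> str:
    """URL listing category members in sortkey order from a given sortkey."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmsort": "sortkey",
        "cmprop": "sortkey",
        "cmstartsortkey": start_sortkey,
    }
    return f"{API}?{urlencode(params)}"


print(page_sortkeys_url("Héctor Palomares Medina"))
print(category_members_url("category:Wikipedia:Wikificar", " 3"))
```

Note that urlencode writes spaces as "+" rather than "%20"; both forms are valid in a query string.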
Comment 4 Gustronico 2011-03-13 07:51:48 UTC
Héctor Palomares Medina has a forced {{DEFAULTSORT}} included via {{BD|1958||Palomares Medina, Hector}}, but the other examples do not have a {{DEFAULTSORT}} at all. What *all* of them do have in common is that they have been recently moved. While I was looking at this issue, a page suddenly appeared in the lettered sections at the end of the category. Then I noticed it had just been moved.
Comment 5 Gustronico 2011-03-13 21:48:21 UTC
I've just noticed that this issue is present in all recently moved articles, in all categories, not just in timestamp-sortkeyed ones. Actually, the page name (or {{DEFAULTSORT}} if it exists) is overriding all custom sortkeys in all recently moved pages. Null edits and purges don't fix it, and major edits do nothing either. Only removing or changing the categorization tag itself, saving, and then re-inserting it restores the custom sortkey.
Comment 6 Bawolff (Brian Wolff) 2011-03-13 23:39:56 UTC
Just to clarify, this is every article that gets moved, not just some of them?

Will creating an article, with [[category:Foo| 20100503]] on it, then moving that article to a new title always cause the issue to be observed, or just sometimes?
Comment 7 Bawolff (Brian Wolff) 2011-03-14 00:41:19 UTC
I think it has something to do with category sortkeys getting associated with the prefix of the category name that comes before the space(?!). I am able to reproduce this on trunk using the following steps:

Steps to reproduce (on trunk):
Template cat containing:
[[category:Published byme]]
[[category:Published| {{{1}}}]]

Page foo containing:
{{cat|210}}

After that, move foo to a new title.

Page foo is now listed both in category:Published and Category:Published_byme with the sortkey " 210". Expected behaviour is for it to be listed with that sortkey only in category:Published. Null edits do not fix it, but adding and removing the template does. Adding another category after the fact also does not fix the sortkeys on the older categories.
Comment 8 Bawolff (Brian Wolff) 2011-03-14 01:52:13 UTC
Ok I think I found the issue:
*Moving a page sets cl_sortkey_prefix on all of its categorylinks to whatever the value was on the first one it got out of the db.
*When doing linksupdate, we only check cl_sortkey_prefix; we don't check cl_sortkey. If these get out of sync, such that cl_sortkey_prefix is right but cl_sortkey is not, this never gets corrected short of adding and removing the category.
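
The two points above can be modelled with a toy sketch. This is not MediaWiki code: the class, the function names, the rule used to build a sortkey, and the assumption about which row comes out of the db first are all hypothetical and exist only to show the two behaviours described.

```python
from dataclasses import dataclass


@dataclass
class CatLink:
    category: str
    prefix: str   # stands in for cl_sortkey_prefix
    sortkey: str  # stands in for cl_sortkey


def make_sortkey(prefix, title):
    # Toy rule (assumption): use the custom prefix if set, else the title.
    return prefix if prefix else title


def buggy_move(rows):
    # First point: every row receives the prefix of the first row fetched.
    first = rows[0].prefix
    for row in rows:
        row.prefix = first


def links_update(rows, parsed, title, check_sortkey=False):
    """Rewrite rows that look stale; return the categories rewritten.

    With check_sortkey=False only the prefix is compared (second point).
    """
    rewritten = []
    for row in rows:
        want_prefix = parsed[row.category]
        want_key = make_sortkey(want_prefix, title)
        stale = row.prefix != want_prefix
        if check_sortkey and row.sortkey != want_key:
            stale = True
        if stale:
            row.prefix, row.sortkey = want_prefix, want_key
            rewritten.append(row.category)
    return rewritten


# Point 1: assuming the "Published" row is fetched first, both rows
# end up with its prefix after the move.
moved = [CatLink("Published", " 210", " 210"), CatLink("Published_byme", "", "Foo")]
buggy_move(moved)
print([row.prefix for row in moved])

# Point 2: the prefix is right but the sortkey is stale. Comparing only
# the prefix leaves the row untouched; comparing both repairs it.
row = CatLink("Published", " 210", "Foo")
print(links_update([row], {"Published": " 210"}, "Bar"))
print(links_update([row], {"Published": " 210"}, "Bar", check_sortkey=True))
```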
Comment 9 Bawolff (Brian Wolff) 2011-03-14 02:22:40 UTC
Fixed in r83866. (This won't fix pages this has already happened to; it will only stop it from happening again. To fix pages that it has already happened to, you have to manually remove and re-add the category (or the template containing the category). For clarification, "fixed in r83866" means fixed in the code repo. It might take a little while before the fix is deployed to Wikimedia, but since it's a 1.17 regression, probably not too long.)
Comment 10 Bawolff (Brian Wolff) 2011-03-14 02:45:47 UTC
Sorry, forgot you have multiple issues here:

For issue 2 - This appears to be caused by adding the X to the end of the sortkey. In my testing, a page named project:2010 with the sortkey " 2010" will sort at the beginning of the things with a sortkey starting with " 2010...". Adding an X will make it sort at the end of the 2010 section, since X comes after all digits in sort order (same for the month pages).
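
A quick way to see the effect described here, assuming plain code-point ordering as a stand-in for the wiki's collation (an assumption for illustration only; the sample sortkeys are made up):

```python
# A year label, a dated article, and the same label with an X appended.
keys = [" 2010X", " 20100503", " 2010"]

# The bare " 2010" is a prefix of the other keys, so it sorts first.
# "X" has a higher code point than any digit, so " 2010X" sorts last.
ordered = sorted(keys)
print(ordered)
```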

For issue #3 - I can't reproduce. [[Project:2010]] with [[category:catname| 2010]] on it sorts in the expected position.
Comment 11 Gustronico 2011-03-14 03:20:21 UTC
Good job Bawolff! I'm not a programmer and comment 8 is too complicated for me to understand, but I think it will be deployed soon. I've also found this issue in articles with "normal" non-numerical sortkeys, and without leading spaces.

In my issue #2 there is no x at all. I only put one x, as a test, on a single label page. The mis-sort occurs between pages tagged manually (8 digits), pages tagged by bots (14 digits) and label pages with 4 and 6 digits. Nevertheless, I'll study it more closely with the api.php queries you showed me in comment 3. I didn't know about them.

Do you remember this?
http://es.wikipedia.org/w/index.php?title=Usuario_Discusi%C3%B3n:Gustronico&diff=prev&oldid=31805957
