Last modified: 2009-06-15 20:33:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and links other than those displaying bug reports and their history might be broken. See T21197, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 19197 - Capital letters are always sorted first
Status: RESOLVED DUPLICATE of bug 164
Product: MediaWiki
Classification: Unclassified
Component: Categories
Version: 1.16.x
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2009-06-14 22:34 UTC by Drilnoth
Modified: 2009-06-15 20:33 UTC

Description Drilnoth 2009-06-14 22:34:54 UTC
Currently, all uppercase letters are sorted in categories before all lowercase letters. For example, in http://en.wikipedia.org/wiki/Category:Bo-Bo_locomotives , the article "VR Class Sr2" is listed before "Victorian Railways E class (electric)". This is especially problematic in categories where abbreviations such as "SSX" or "NBA" are commonly used. Logically, uppercase letters should sort the same as lowercase letters. I understand that this happens because category sorting uses Unicode ordering, but would it be possible to (essentially) say that "A = a", so that they sort correctly?

Current guidelines on this issue at http://en.wikipedia.org/wiki/Wikipedia:Categorization#Using_sort_keys would imply that most articles should have a DEFAULTSORT key in order to fix this, but there is resistance to having DEFAULTSORTs which really shouldn't be needed.
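
The ordering described in this report can be reproduced outside MediaWiki. The following is a minimal Python sketch, not MediaWiki code: it uses the two article titles quoted above and assumes that Python's default string comparison (by code point) is a fair stand-in for the category ordering, with `str.casefold` standing in for the requested "A = a" behaviour.

```python
# Titles taken from the report. Plain sorted() compares code points, so every
# uppercase ASCII letter (65-90) precedes every lowercase one (97-122).
titles = ["Victorian Railways E class (electric)", "VR Class Sr2"]

# Code-point order: "VR..." comes first because "R" (82) < "i" (105).
codepoint_order = sorted(titles)

# Case-insensitive order: compare case-folded copies, keep the original titles.
caseless_order = sorted(titles, key=str.casefold)

print(codepoint_order)
print(caseless_order)
```

With the case-folded key, "Victorian Railways E class (electric)" is listed before "VR Class Sr2", which is the order the reporter expects.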
Comment 1 Alex Z. 2009-06-14 23:32:15 UTC
I believe the problem is that the sortkey is sorted as binary, so capital letters will come before lowercase letters. Sorting with a UTF-8 collation would fix it, but Wikimedia is still using MySQL 4, which I don't believe supports that. Other than upgrading to MySQL 5, this could be partly fixed by forcing sortkeys to lower case before saving them to the database, but that would possibly break other things.
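
The workaround suggested in this comment, lowercasing the sortkey before it is stored so that a binary comparison no longer separates cases, might look roughly like the sketch below. It is an illustration only: `make_sortkey` is a hypothetical helper, not a MediaWiki function; sorting Python `bytes` stands in for the database's binary comparison; and "Samba" is an invented title added to contrast with the abbreviation "SSX".

```python
def make_sortkey(title: str) -> bytes:
    """Hypothetical helper: lowercase the title, then encode the bytes to store."""
    return title.lower().encode("utf-8")


titles = ["VR Class Sr2", "Victorian Railways E class (electric)", "SSX", "Samba"]

# Binary comparison of the raw UTF-8 bytes: uppercase sorts before lowercase.
binary_order = sorted(titles, key=lambda t: t.encode("utf-8"))

# Binary comparison of the lowercased keys; the original title is kept
# alongside the key so that display is unaffected.
rows = sorted((make_sortkey(t), t) for t in titles)
lowercased_order = [title for _key, title in rows]

print(binary_order)
print(lowercased_order)
```

The sketch only covers the ordering itself. It does not address whatever "other things" the comment expects to break, and simple lowercasing is not a full substitute for a language-aware collation.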
Comment 2 Drilnoth 2009-06-15 01:28:40 UTC
Gotcha... I'm guessing that MySQL 5 would be way too big a jump at this point, right?
Comment 3 Chad H. 2009-06-15 20:24:12 UTC
Is this a dupe of something? Bug 164 comes to mind.
Comment 4 Happy-melon 2009-06-15 20:33:07 UTC
It's a "sort by something other than Unicode code point" bug, so yes, I'd say so.

(In reply to comment #2)
> Gotcha... I'm guessing that MySQL 5 would be way too big a jump at this point,
> right?
> 

It's in the works.  It's been in the works for a while.  It will probably still be in the works for a while to come :D
Comment 5 Happy-melon 2009-06-15 20:33:20 UTC

*** This bug has been marked as a duplicate of bug 164 ***
