Last modified: 2013-06-18 16:22:13 UTC
Set $wgCategoryCollation to 'uca-sv' on Swedish Wikipedia and rebuild category sort keys
Needs community notification and discussion. Split off from bug 29788.
Assuming this change will sort ABC...XYZÅÄÖ correctly, there is nothing left to decide: there have been multiple community discussions on Swedish Wikipedia agreeing that the old sort order was a bug needing fixing. We are just waiting for the bug to be fixed (first bug 164, then bug 29788, and now this bug). Sorting diacritics (other than ÅÄÖ) better (ÁÀÂ... as variants of A, ÇČ.. as variants of C, and so on) is a bonus, but is already worked around in many cases with DEFAULTSORT. Unless there are very strange changes in the sorting of other characters, there should be no opposition to changing this setting.
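To make the expected order concrete, here is a minimal Python sketch of the behavior described above: Å, Ä and Ö sort after Z as letters of their own, while other accented letters are treated as variants of their base letter. This is only an illustration, not how 'uca-sv' is implemented (MediaWiki delegates to ICU collation); the names SWEDISH_ALPHABET and swedish_sort_key are hypothetical.

```python
import unicodedata

# Swedish alphabet order: A-Z followed by Å, Ä, Ö as distinct letters.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}


def base_letters(ch):
    """Return the base letter(s) used for primary comparison of one character."""
    ch = ch.lower()
    if ch in RANK:
        # Å, Ä, Ö are kept as-is; they must not be reduced to A or O.
        return ch
    # Other accented letters: decompose and drop the combining marks,
    # so that Á, À, Â compare as A and Ç, Č compare as C.
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def swedish_sort_key(text):
    """Build a sort key: primary letter ranks first, original text as tie-breaker."""
    primary = []
    # NFC first, so that "A" + combining ring is seen as the single letter Å.
    for ch in unicodedata.normalize("NFC", text):
        for b in base_letters(ch):
            if b in RANK:
                primary.append((1, RANK[b]))
            else:
                # Anything that is not a letter of the alphabet sorts before letters.
                primary.append((0, ord(b)))
    return (primary, text)


names = ["Örebro", "Zorn", "Ängelholm", "Åre", "Ábel", "Adam", "Čapek", "Carl"]
print(sorted(names))                        # code point order: Ä before Å
print(sorted(names, key=swedish_sort_key))  # Swedish order: Å, Ä, Ö after Z
```

With plain code point ordering, Ä lands before Å, which is the "ÄÅÖ" symptom reported on Swedish Wikipedia.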
I created a testwiki for you at http://users.v-lo.krakow.pl/~matmarex/testwiki-sv/ to verify that the behavior is indeed correct. Feel free to use it for testing and link it on-wiki, but be aware that I'll probably kill it once the change is performed. Please link some of the discussions (preferably ones with results clearly indicated with yes/no icons :) ) so I can suggest the change with a clear conscience ;) This will have to wait at least for the deployment of 1.21wmf11 anyway: https://www.mediawiki.org/wiki/MediaWiki_1.21/Roadmap And yes, that's exactly what this change will do.
And if there are no such discussions, it would be nice to hold one, even if it's just a formality. I am not a WMF employee, but their policy is clear – a configuration change (especially one that is this disruptive) can only be made if there's obvious consensus. There's no hurry, especially since this change can only be made after MW 1.21wmf11 is deployed on March 13. Here's a very similar voting/discussion I created on pl.wikipedia, regarding the same change, but for Polish: short explanation, voting and comment with yes/no icons. You can link the testwiki I created there. https://pl.wikipedia.org/wiki/Wikipedia:PR#Zmiana_konfiguracji_.E2.80.93_w.C5.82.C4.85czenie_poprawnego_sortowania_artyku.C5.82.C3.B3w_na_stronach_kategorii
The sorting looks good at the test wiki. Thanks for making that available. The discussions on Swedish Wikipedia are mostly someone asking "Why are Å and Ä in wrong order?" and someone else answering "It's a bug", then maybe followed by discussing whether anything can be or is being done to fix it. Fixing an obvious bug is not the kind of discussion that would get long lists of supporters (also, Swedish Wikipedia generally avoids votes with icons). One such discussion is [[sv:Wikipedia:Wikipediafrågor/Arkiv/2011#ABC...ÄÅÖ]], which ended with submitting bug 29788 "Sort Swedish letters ÅÄÖ correctly on Swedish Wikipedia". How the servers are set up technically to achieve this is better decided by Wikimedia technical staff than by a Wikipedia user vote. But if a vote is needed I am sure it can be arranged.
I think this can proceed without another vote; the community has made it pretty clear they do want it :) Ib357adba.
Community was notified and agrees to this change at the local Village Pump: [[sv:Wikipedia:Bybrunnen#Svensk_sorteringsordning_i_kategorier_.28.C3.85.C3.84.C3.96.2C_inte_.C3.84.C3.85.C3.96.29]] Unfortunately there is a problem with words starting between "Th" and "Tö". They are sorted in the right order. But they get sorted under letter "Þ", and not under letter "T" like words between "T" and "Tg". I think sorting letter "Þ" as "th" is fine, but then it should go under a letter "T" heading.
This would probably not happen if bug 43740 was fixed. (Thorn is expanded to "th", which ideally would get removed during the prune primary collision step, but doesn't.) In the meantime, we should probably have a system for removing certain elements from the big list of first-letters for certain collations (the opposite of the current $firstLetters array, which adds elements to the big list).
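The heading problem can be illustrated with a small, simplified model in Python. It assumes that a page is placed under the last first-letter heading whose sort key is not greater than the page's own key, and it uses plain lowercase strings in place of real ICU sort keys. The names simplified_key and heading_for, and the short heading list, are hypothetical and exist only for this sketch.

```python
import bisect


def simplified_key(text):
    """Stand-in for a collation sort key: lowercase, with thorn expanded to "th"."""
    return text.lower().replace("þ", "th")


def heading_for(title, heading_letters):
    """Return the heading letter whose key is the closest one at or below the title's key."""
    # Pair each heading with its key and sort by key, as a lookup table would be.
    headings = sorted((simplified_key(letter), letter) for letter in heading_letters)
    keys = [key for key, _ in headings]
    index = bisect.bisect_right(keys, simplified_key(title)) - 1
    return headings[max(index, 0)][1]


# With thorn in the list, its key "th" sits between "t" and "u",
# so every title from "Th..." to the end of T is captured by it.
with_thorn = ["S", "T", "Þ", "U"]
print(heading_for("Tegnér", with_thorn))  # T
print(heading_for("Thomas", with_thorn))  # Þ

# Removing thorn from the heading list keeps the order of titles unchanged
# but puts the "Th..." titles back under T.
without_thorn = ["S", "T", "U"]
print(heading_for("Thomas", without_thorn))  # T
```

In this model the titles are still sorted correctly in both cases; only the heading they appear under changes, which matches the symptom reported above.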
Submitted I57e07a20 to fix this. Deployed on my testwiki, where it seems to solve the issue. Once it's merged and deployed on sv.wiki, the following has to be done to fix the categories:
* remove the entry for first-letters from the object cache
* re-run updateCollation.php with a --force argument
Only purging first-letters:sv (or I suppose the full key would be svwiki:first-letters:sv) from memcache after merging this change is necessary. Re-running the script should not be needed.
mysql:wikiadmin@db1034 [svwiki]> select count(cl_collation), cl_collation from categorylinks group by cl_collation ;
+---------------------+--------------+
| count(cl_collation) | cl_collation |
+---------------------+--------------+
|             4556745 | uca-sv       |
+---------------------+--------------+
1 row in set (11.04 sec)
(In reply to comment #8)
> Once it's merged and deployed on sv.wiki, the following has to be done to fix
> the categories:
> * remove the entry for first-letters from the object cache
> * re-run updateCollation.php with a --force argument

For the latter, I think we might want to hold off (if possible). Tim is/was going to do some server-side ICU upgrades, which will then require the script to be re-run with --force for all the wikis on uca-*.
Reedy tried to do some cache purging on IRC today and failed. I have no idea what is going on, and it seems neither has he. :) Worst case, we'll just have to wait a week for the cache to expire and hopefully it'll start working properly by itself. Sorry about the mess. Example category page with thorn ('Þ') visible: https://sv.wikipedia.org/wiki/Kategori:Svenska_kokboksförfattare?action=purge
Maybe CACHE_ANYTHING goes to a different cache than the one that was being purged(?)
So it seems like we finally managed to purge the right servers. I'm marking this as RESOLVED FIXED. (If any category pages are still looking funny, they just need action=purge.)