Last modified: 2013-06-18 16:22:13 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T47446, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 45446 - Set $wgCategoryCollation to 'uca-sv' on Swedish Wikipedia and rebuild category sort keys
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All All
Importance: Unprioritized enhancement
Assigned To: Bartosz Dziewoński
Keywords: shell
Depends on:
Blocks: collations
Reported: 2013-02-26 20:29 UTC by Bartosz Dziewoński
Modified: 2013-06-18 16:22 UTC (History)
Description Bartosz Dziewoński 2013-02-26 20:29:43 UTC
Set $wgCategoryCollation to 'uca-sv' on Swedish Wikipedia and rebuild category sort keys

Needs community notification and discussion.

Split off from bug 29788.
Comment 1 Lejonel 2013-03-01 17:40:37 UTC
Assuming this change will sort ABC...XYZÅÄÖ correctly, there have been multiple community discussions on Swedish Wikipedia agreeing that the old sort order was a bug that needed fixing. We are just waiting for the bug to be fixed (first bug 164, then bug 29788, and now this bug). Sorting diacritics (other than ÅÄÖ) better (ÁÀÂ... as variants of A, ÇČ... as variants of C, and so on) is a bonus, but is already worked around in many cases with DEFAULTSORT. Unless there are very strange changes in the sorting of other characters, there should be no opposition to changing this setting.
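(Illustration, not part of the original comment.) A minimal Python sketch of the difference being discussed: plain code-point ordering puts Ä before Å, while the Swedish alphabet ends in Å, Ä, Ö. The alphabet string, the `swedish_key` helper and the sample titles are assumptions made for this example only; the real 'uca-sv' collation is computed by ICU and handles far more than this toy ranking.

```python
# Toy model: rank letters by their position in the Swedish alphabet.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
RANK = {ch: i for i, ch in enumerate(SWEDISH_ALPHABET)}

def swedish_key(title):
    """Hypothetical sort key; characters outside the alphabet sort last."""
    return [RANK.get(ch, len(SWEDISH_ALPHABET)) for ch in title.lower()]

titles = ["Örebro", "Älmhult", "Zinkgruvan", "Åre"]

# Code-point order: Ä (U+00C4) sorts before Å (U+00C5).
by_code_point = sorted(titles)

# Swedish order: ... Z, Å, Ä, Ö.
by_swedish = sorted(titles, key=swedish_key)

print(by_code_point)
print(by_swedish)
```

In this sketch only the relative order of Å and Ä differs between the two results, which is the ÄÅÖ versus ÅÄÖ complaint raised in the community discussions.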
Comment 2 Bartosz Dziewoński 2013-03-01 18:01:18 UTC
I created a testwiki for you at http://users.v-lo.krakow.pl/~matmarex/testwiki-sv/ to verify that the behavior is indeed correct. Feel free to use it for testing and link it on-wiki, but be aware that I'll probably kill it once the change is performed.

Please link some of the discussions (preferably ones with results clearly indicated with yes/no icons :) ) so I can suggest the change with a clear conscience ;) This will have to wait at least for the deployment of 1.21wmf11 anyway: https://www.mediawiki.org/wiki/MediaWiki_1.21/Roadmap

And yes, that's exactly what this change will do.
Comment 3 Bartosz Dziewoński 2013-03-02 15:14:45 UTC
And if there are no such discussions, it would be nice to hold one, even if it's just a formality. I am not a WMF employee, but their policy is clear – a configuration change (especially one that is this disruptive) can only be made if there's obvious consensus.

There's no hurry, especially since this change can only be made after MW 1.21wmf11 is deployed on March 13.

Here's a very similar voting/discussion I created on pl.wikipedia, regarding the same change, but for Polish: short explanation, voting and comment with yes/no icons. You can link the testwiki I created there.

https://pl.wikipedia.org/wiki/Wikipedia:PR#Zmiana_konfiguracji_.E2.80.93_w.C5.82.C4.85czenie_poprawnego_sortowania_artyku.C5.82.C3.B3w_na_stronach_kategorii
Comment 4 Lejonel 2013-03-02 19:55:22 UTC
The sorting looks good at the test wiki. Thanks for making that available.

The discussions on Swedish Wikipedia are mostly someone asking "Why are Å and Ä in the wrong order?" and someone else answering "It's a bug", then maybe followed by discussion of whether anything can be or is being done to fix it.  Fixing an obvious bug is not the kind of discussion that would get long lists of supporters (also, Swedish Wikipedia generally avoids votes with icons).  One such discussion is [[sv:Wikipedia:Wikipediafrågor/Arkiv/2011#ABC...ÄÅÖ]], which ended with the submission of bug 29788 "Sort Swedish letters ÅÄÖ correctly on Swedish Wikipedia".  How the servers are set up technically to achieve this is better decided by Wikimedia technical staff than by a Wikipedia user vote.  But if a vote is needed, I am sure it can be arranged.
Comment 5 Bartosz Dziewoński 2013-03-08 19:56:52 UTC
I think this can proceed without another vote; the community has made it pretty clear they do want it :) Ib357adba.
Comment 6 Lejonel 2013-03-11 17:29:12 UTC
Community was notified and agrees to this change at the local Village Pump: [[sv:Wikipedia:Bybrunnen#Svensk_sorteringsordning_i_kategorier_.28.C3.85.C3.84.C3.96.2C_inte_.C3.84.C3.85.C3.96.29]]

Unfortunately there is a problem with words starting between "Th" and "Tö". They are sorted in the right order. But they get sorted under letter "Þ", and not under letter "T" like words between "T" and "Tg".

I think sorting letter "Þ" as "th" is fine, but then it should go under a letter "T" heading.
Comment 7 Bawolff (Brian Wolff) 2013-03-11 17:52:53 UTC
This would probably not happen if bug 43740 were fixed. (Thorn is expanded to "th", which ideally would get removed during the prune primary collision step, but it doesn't.)

In the meantime, we should probably have a system for removing certain elements from the big list of first-letters for certain collations (the opposite of the current $firstLetters array, which adds elements to the big list).
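(Illustration, not part of the original comment.) A rough Python sketch of why the thorn heading captures words from "Th" onward, and why removing it from the first-letter list helps. It assumes a simplified model in which each page goes under the last heading whose sort key is not greater than the page's own key. `sort_key` and `heading_for` are hypothetical helpers for this example, not MediaWiki functions, and the toy key only lowercases and expands thorn to "th".

```python
import bisect

def sort_key(text):
    # Toy stand-in for a collation key: lowercase, with thorn expanded to "th".
    return text.lower().replace("þ", "th")

def heading_for(word, headings):
    """Pick the last heading whose key is <= the word's key (simplified model)."""
    ordered = sorted(headings, key=sort_key)
    keys = [sort_key(h) for h in ordered]
    index = bisect.bisect_right(keys, sort_key(word)) - 1
    return ordered[max(index, 0)]

with_thorn = ["S", "T", "Þ", "U"]
without_thorn = ["S", "T", "U"]

print(heading_for("Tavla", with_thorn))    # before "th", stays under T
print(heading_for("Thor", with_thorn))     # at or after "th", lands under Þ
print(heading_for("Thor", without_thorn))  # with Þ removed, back under T
```

Because the heading Þ gets the key "th", it sits between T and U and picks up everything from "Th" onward, while words before "Th" stay under T. Dropping Þ from the heading list leaves the sort order untouched and only changes which heading those words appear under.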
Comment 8 Bartosz Dziewoński 2013-03-11 21:31:31 UTC
Submitted I57e07a20 to fix this. Deployed on my testwiki, where it seems to solve the issue.

Once it's merged and deployed on sv.wiki, the following has to be done to fix the categories:
* remove the entry for first-letters from the object cache
* re-run updateCollation.php with a --force argument
Comment 9 Bawolff (Brian Wolff) 2013-03-11 22:05:26 UTC
Only purging first-letters:sv (or I suppose the full key would be svwiki:first-letters:sv) from memcache after merging this change is necessary. Re-running the script should not be needed.
Comment 10 Sam Reed (reedy) 2013-03-12 17:23:59 UTC
mysql:wikiadmin@db1034 [svwiki]> select count(cl_collation), cl_collation from categorylinks group by cl_collation ;
+---------------------+--------------+
| count(cl_collation) | cl_collation |
+---------------------+--------------+
|             4556745 | uca-sv       |
+---------------------+--------------+
1 row in set (11.04 sec)
Comment 11 Sam Reed (reedy) 2013-03-12 17:26:09 UTC
(In reply to comment #8)
> Once it's merged and deployed on sv.wiki, the following has to be done to fix
> the categories:
> * remove the entry for first-letters from the object cache
> * re-run updateCollation.php with a --force argument


For the latter, I think we might want to hold off (if possible). Tim is/was going to do some server-side ICU upgrades, which will then require the script to be re-run with --force for all the wikis on uca-*.
Comment 12 Bartosz Dziewoński 2013-03-12 20:21:29 UTC
Reedy tried to do some cache purging on IRC today and failed. I have no idea what is going on, and it seems neither does he. :)

Worst case, we'll just have to wait a week for the cache to expire and hopefully it'll start working properly by itself. Sorry about the mess.

Example category page with thorn ('Þ') visible: https://sv.wikipedia.org/wiki/Kategori:Svenska_kokboksförfattare?action=purge
Comment 13 Bawolff (Brian Wolff) 2013-03-12 20:30:21 UTC
Maybe CACHE_ANYTHING goes to a different cache than the one that was being purged(?)
Comment 14 Bartosz Dziewoński 2013-03-12 22:50:21 UTC
So it seems like we finally managed to purge the right servers. I'm marking this as RESOLVED FIXED.

(If any category pages are still looking funny, they just need action=purge.)
