Last modified: 2014-11-17 09:46:08 UTC

Bug 45611 - Make the "uca-xx" category collations the default? (selectable directly in the installer?)
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Categories
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody
Keywords: design, i18n
Depends on:
Blocks: 30673
Reported: 2013-03-01 18:10 UTC by Bartosz Dziewoński
Modified: 2014-11-17 09:46 UTC (History)

Description Bartosz Dziewoński 2013-03-01 18:10:42 UTC
The "uca-xx" category collations should probably be the defaults (for the languages that have them available), or should be selectable in the installer.

I'm really not sure what would be the best way to do this, or if it's even a good idea. It probably is, though.
Comment 1 Bartosz Dziewoński 2013-03-01 18:13:48 UTC
CC-ing everyone who filed/commented on the bugs about deploying uca-xx collations on Wikimedia wikis. Any input, guys/gals? :)
Comment 2 Bawolff (Brian Wolff) 2013-03-01 19:30:24 UTC
I would go with making them selectable in the installer (maybe even the default). However, there are two issues I would like to see fixed before doing that:

# The issue with prefix collisions in the first-letters-root.ser file (there is a bug for this).
# I think we should probably add a cl_collation_version field to the categorylinks table. If someone upgrades PHP, the UCA version changes and everything breaks. At the very least, update.php should fix this. At the moment one needs to run updateCollation.php --force. The --force should not be necessary (plus, a null edit to a page should fix category links that are broken in such a fashion).

Note: we probably don't want this to be tied to the wiki's content language, as that may break on upgrade from an older MediaWiki (on the other hand, update.php would fix this), and people probably don't expect that changing the language code would cause such a disruptive change.
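
A minimal sketch of the version check proposed above, assuming a hypothetical cl_collation_version column stored next to the existing cl_collation value. The class, the function name, and the version strings are made up for illustration; this is not MediaWiki code, only one way the "is this row stale?" test could look.

```python
from dataclasses import dataclass


@dataclass
class CategoryLink:
    page: str
    sortkey: str
    collation: str          # e.g. "uca-pl", as stored in cl_collation
    collation_version: str  # hypothetical cl_collation_version column


def stale_links(links, current_collation, current_version):
    """Return the links whose stored collation name or version differs
    from what the wiki is currently configured to use."""
    return [
        link
        for link in links
        if link.collation != current_collation
        or link.collation_version != current_version
    ]


links = [
    CategoryLink("Apple", "key-a", "uca-pl", "6.0"),
    CategoryLink("Eagle", "key-e", "uca-pl", "5.2"),     # written before an upgrade
    CategoryLink("Zebra", "key-z", "uppercase", "n/a"),  # written under the old collation
]

needs_rekey = stale_links(links, current_collation="uca-pl", current_version="6.0")
print([link.page for link in needs_rekey])
```

With a check like this, an update script would not need a --force flag: rows carrying an outdated version would be picked up the same way rows with a different collation name are.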
Comment 3 Bartosz Dziewoński 2013-03-01 19:36:33 UTC
(In reply to comment #2)
> # the issue with prefix collisions in the first-letters-root.ser file (there
> is a bug for this)

Bug 43740.


(In reply to comment #2)
> # I think we should probably add a cl_collation_version field to categorylinks
> table. If someone upgrades php the uca version changes and everything breaks.
> At the very least update.php should fix this. Atm one needs to run
> updateCollation.php --force. The force should not be necessary (plus a null
> edit to a page should fix category links that are broken in such a fashion)

Good point. I wonder how this relates to the 'chinese-collation' branch and Liangent's support for using multiple collations at once? (bug 44667)


> Note: we probably don't want this to be tied to the wiki's content lang as that
> may break on upgrade from older mw (otoh update.php would fix this) and people
> probably don't expect that changing the lang code would cause such a disruptive
> change.

As above, this might actually sort of "just work" if we support using multiple collations at once. If we can't get it to, though, then yes, it can't depend on $wgLanguageCode.
Comment 4 Anatoliy Goncharov 2013-03-01 20:22:13 UTC
Well, but which languages need a collation change? Is it needed for English or Russian? And do we have collations for all languages?
Comment 5 Bartosz Dziewoński 2013-03-01 20:31:25 UTC
(In reply to comment #4)
> Well, but which languages need a collation change? Is it needed for
> English or Russian?

Sort of. While the "native" letters for both of these sort correctly by default (with the 'uppercase' collation), their accented variants are placed at the very end of category page listings, which might be undesirable.

I know that the English Wikipedia uses {{DEFAULTSORT:}} hacks to enforce behavior similar to what the UCA collations do (by sorting by the article title with all accents removed); I don't know what is done in other languages.
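
To illustrate the difference, here is a rough sketch using only the Python standard library. It assumes that comparing uppercased titles by code point is a fair stand-in for the 'uppercase' collation, and that stripping combining marks approximates a hand-written DEFAULTSORT key. Real UCA collation does much more than remove accents, so treat this as an approximation; the titles are made-up examples.

```python
import unicodedata


def uppercase_key(title: str) -> str:
    # Stand-in for the 'uppercase' collation: uppercase, then compare by code point.
    return title.upper()


def accent_stripped_key(title: str) -> str:
    # Decompose to NFD, drop combining marks, then uppercase,
    # roughly what a manual DEFAULTSORT key without accents achieves.
    decomposed = unicodedata.normalize("NFD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


# "\u00c9" is the precomposed letter E with acute accent.
titles = ["Zebra", "\u00c9mile Zola", "Apple", "Eagle", "\u00c9clair"]

by_uppercase = sorted(titles, key=uppercase_key)
by_stripped = sorted(titles, key=accent_stripped_key)

print(by_uppercase)  # accented titles land after "Zebra"
print(by_stripped)   # accented titles sort next to "Eagle"
```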


> And do we have collations for all languages?

No (not yet :) ). 67 languages are supported now, including, I think, all major European ones; there's a list at the bottom of [[mw:Manual:$wgCategoryCollation]]. More could be added if only someone did the work, and Liangent is working on collations for Chinese.
Comment 6 Liangent 2013-03-02 04:17:38 UTC
(In reply to comment #3)
> (In reply to comment #2)
> > # I think we should probably add a cl_collation_version field to categorylinks
> > table. If someone upgrades php the uca version changes and everything breaks.
> > At the very least update.php should fix this. Atm one needs to run
> > updateCollation.php --force. The force should not be necessary (plus a null
> > edit to a page should fix category links that are broken in such a fashion)
> 
> Good point. I wonder how this relates to the 'chinese-collation' branch and
> Liangent's support for using multiple collations at once? (bug 44667)

So if cl_collation_version looks outdated, we update the sortkey automatically for that entry? I guess updating sortkeys partially (only for the entries we're reading) would break category pages even more until sysadmins run updateCollation.php to update them fully.
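
A toy sketch of that concern, with two made-up key functions standing in for two collation versions (key_v1 and key_v2 are hypothetical and not real UCA sort keys). Each version on its own orders the titles the same way, but a half-migrated table sorted by its stored keys does not:

```python
def key_v1(title: str) -> str:
    # Pretend "old version" sort key.
    return title.upper()


def key_v2(title: str) -> str:
    # Pretend "new version" sort key; not comparable with v1 keys.
    return title.casefold()


titles = ["Apple", "banana", "Cherry", "date"]

# Every row starts with a v1 key.
stored = {title: key_v1(title) for title in titles}

# Lazily rekey only the rows that happened to be read.
for title in ("banana", "date"):
    stored[title] = key_v2(title)

full_v1 = sorted(titles, key=key_v1)
full_v2 = sorted(titles, key=key_v2)
mixed = sorted(titles, key=lambda title: stored[title])

print(full_v1)
print(full_v2)
print(mixed)  # differs from both fully consistent orders
```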

(In reply to comment #5)
> (In reply to comment #4)
> > And do we have collations for all languages?
> 
> No (not yet :) ). 67 languages are supported now (including I think all major
> European ones, there's a list at the bottom of
> [[mw:Manual:$wgCategoryCollation]]; more could be added if only someone did
> this), and Liangent is working on collations for Chinese.

It's almost done (except for what can't easily be done now because of an external dependency) and pending review.
