Last modified: 2014-11-20 09:18:30 UTC

Bug 28397 - Allow collation to be specified per category
Status: PATCH_TO_REVIEW
Product: MediaWiki
Classification: Unclassified
Component: Categories
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 15 votes
Target Milestone: ---
Assigned To: Liangent
Depends on: 35378 44667
Blocks: 30673 30996
Reported: 2011-04-02 20:45 UTC by Bawolff (Brian Wolff)
Modified: 2014-11-20 09:18 UTC (History)
CC List: 8 users

Description Bawolff (Brian Wolff) 2011-04-02 20:45:25 UTC
It was suggested on IRC by Skalman12 (and in some other places, presumably bug 164, but I don't feel like reading the 5 billion comments on it) that it should be possible to override the collation on a per-category basis. The Wiktionaries in particular would find it useful.

It's not entirely clear whether this is a WONTFIX, but I think it's possible.

As a further aside, this won't be very useful until some future time when we actually have multiple useful collations.

Suggested ways of doing this that I've heard so far:
*Global config variable that lists the specific categories that use a different collation.

Downside: That's something very specific to put in LocalSettings.php. It'd probably be a long list, and it would probably change often enough that the number of shell requests would be annoying. A maintenance script would also have to be run each time the list changes.

*Magic word/parser function: {{#usecollation:uca-german}} on a category page.
Upside: Wiki users can easily specify which collation to use. Could store the relevant info in the page_props table. Could refresh the sorting of the category members with the job queue. Seems like the wiki way of doing such things, rather similar to how hidden categories work.

Downside: While the categories are being re-sorted, the category page becomes kind of borked. Doing this to [[category:Living people]] would be a good vandalism target, so we'd probably want to limit collation changes to admins so people don't abuse it.

*Put config info in a system message (MediaWiki namespace page).
Upside: Wiki users can configure it themselves. Limited to admins, so people can't do anything stupid.

Downside: Not as easy to trigger the category re-sort jobs. Not as clear to the end user why category x is sorted differently from category y. And configuration in system messages is kind of evil.

*Use a special page (similar to how we do page protection). Could have a link in the toolbox for sufficiently privileged users to "Change category sorting".

Upside: Kind of a nicer UI. Could present a list of valid choices to the user, instead of expecting them to know, along with help info about the various choices. Can have a separate right for changing collations.

Downside: Slightly more complex to implement; would require a new db table to manage the info. Also, to the average user wondering why this category sorts differently than others, it's not as obvious as with the parser function method, since nothing is different in the page source (although we could have a notice similar to that of page protection perhaps; not sure if that'd entirely make sense).

---
I personally think the special page approach is the best way to do this (assuming that we do this at all).

From a backend perspective, what would need to be done (I think, anyway):

*Collation::singleton would have to be changed to accept an argument saying which category it is for. It would probably need a more appropriate name if it's no longer a singleton.
*Collation would probably need a static method to map category names to collation names, so we can fill out the cl_collation field of the categorylinks table properly.
*Would need to implement support in the job queue to fix the cl_sortkey field when we change the collation for a category. Probably not that hard, since we have a maintenance script that does something close to that already. The relevant maintenance script expects everything to use the same collation name, if I recall, so that'd also have to be changed.
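
The three backend steps above can be sketched roughly as follows. This is Python rather than MediaWiki's PHP, and it is only an illustration of the shape of the change, not the actual Collation class: the function names, the in-memory override table, and the two toy collations are all assumptions made up for this sketch. Only the cl_sortkey and cl_collation field names come from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Toy collations for illustration only: each maps a page title to a sort key.
COLLATIONS: Dict[str, Callable[[str], str]] = {
    "uppercase": lambda title: title.upper(),
    "identity": lambda title: title,
}
DEFAULT_COLLATION = "uppercase"

# Hypothetical store of per-category overrides (category name -> collation name).
category_collation: Dict[str, str] = {}


def collation_name_for(category: str) -> str:
    """Map a category name to a collation name (what cl_collation would hold)."""
    return category_collation.get(category, DEFAULT_COLLATION)


def collation_for(category: str) -> Callable[[str], str]:
    """Stand-in for a singleton that now takes the category as an argument."""
    return COLLATIONS[collation_name_for(category)]


@dataclass
class CategoryLink:
    cl_to: str            # category name
    page_title: str
    cl_sortkey: str = ""
    cl_collation: str = ""


def resort_category(links: List[CategoryLink], category: str) -> int:
    """What a re-sort job might do: refresh rows whose collation is stale."""
    name = collation_name_for(category)
    make_key = collation_for(category)
    updated = 0
    for link in links:
        if link.cl_to == category and link.cl_collation != name:
            link.cl_sortkey = make_key(link.page_title)
            link.cl_collation = name
            updated += 1
    return updated
```

Because each row records the collation name it was keyed with, the job only has to touch rows whose cl_collation no longer matches the category's current setting, so running it a second time is a no-op.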

Thoughts?
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2011-04-03 22:12:51 UTC
The system was designed so that this would be possible without schema changes, by the request of Wiktionary users, although the feature itself was left out of the initial version.  I'd say you want to have a magic word that only works on category pages and sets a page_props row.  To prevent DoS, very large categories can be protected or FlaggedRevs-ed, like very large templates are.  The steps you describe otherwise seem basically right.
Comment 2 Liangent 2012-03-21 03:15:09 UTC
I'm going to do it, after my multiple collation support is done. I plan to add a {{DEFAULTCOLLATION:}} magic word for category pages.
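
For illustration, here is a rough Python sketch of what handling such a magic word could look like: only honour it on category pages, validate the name, and record it as a page property. The exact wikitext syntax, the "defaultcollation" property name, the set of known collation names, and both helper functions are assumptions made up for this sketch; the real parser hook in MediaWiki (PHP) would look different.

```python
import re
from typing import Dict, Optional, Set

# Assumed syntax: {{DEFAULTCOLLATION:name}} anywhere in the page text.
MAGIC_WORD = re.compile(r"\{\{\s*DEFAULTCOLLATION\s*:\s*([^{}|]*?)\s*\}\}")

# Illustrative list of collation names the wiki is assumed to support.
KNOWN_COLLATIONS: Set[str] = {"uppercase", "identity", "uca-default"}


def extract_collation(title: str, wikitext: str) -> Optional[str]:
    """Return the requested collation, or None if it should be ignored."""
    if not title.startswith("Category:"):
        return None  # the magic word only works on category pages
    match = MAGIC_WORD.search(wikitext)
    if match is None:
        return None
    name = match.group(1)
    return name if name in KNOWN_COLLATIONS else None


def save_page_props(
    page_props: Dict[str, Dict[str, str]], title: str, wikitext: str
) -> None:
    """Mimic setting (or clearing) a page_props row when a page is saved."""
    name = extract_collation(title, wikitext)
    props = page_props.setdefault(title, {})
    if name is None:
        props.pop("defaultcollation", None)
    else:
        props["defaultcollation"] = name
```

Removing the magic word from the page clears the property again, which is the point at which a re-sort of the category members back to the wiki default would need to be queued.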
Comment 3 Liangent 2012-03-22 06:46:48 UTC
Change-Id: I2836aa4a63c146c2d40a0495a1fd58f0575196ff
Comment 4 CodeCat 2013-03-21 14:24:49 UTC
I'm reopening this because the above patch hasn't actually been implemented.

This is absolutely a must-have feature for multilingual projects like Wiktionary. Currently, we have to use rather hackish methods like sort keys to handle collation, but this method is flawed. It doesn't account for the different orders of letters in different languages (like ö in Swedish versus Turkish), nor does it handle languages where sequences of several characters are treated as distinct collation headings (like the digraphs of Hungarian).
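
To make the two problems concrete, here is a small Python sketch that sorts by a per-language alphabet and treats multi-character letters as single units. It is a deliberately simplified model, not how MediaWiki or ICU implement collation: it expects lowercase input, ignores secondary (accent and case) differences, and the ALPHABETS table and sort_key function are names made up for this example.

```python
from typing import Dict, List, Tuple

# Alphabet order per language; Hungarian includes its digraphs and trigraph.
ALPHABETS: Dict[str, List[str]] = {
    "sv": list("abcdefghijklmnopqrstuvwxyzåäö"),
    "tr": list("abcçdefgğhıijklmnoöprsştuüvyz"),
    "hu": [
        "a", "á", "b", "c", "cs", "d", "dz", "dzs", "e", "é", "f", "g", "gy",
        "h", "i", "í", "j", "k", "l", "ly", "m", "n", "ny", "o", "ó", "ö", "ő",
        "p", "q", "r", "s", "sz", "t", "ty", "u", "ú", "ü", "ű", "v", "w", "x",
        "y", "z", "zs",
    ],
}


def sort_key(word: str, lang: str) -> Tuple[int, ...]:
    """Turn a lowercase word into a tuple of letter ranks for one language."""
    alphabet = ALPHABETS[lang]
    rank = {letter: index for index, letter in enumerate(alphabet)}
    longest = max(len(letter) for letter in alphabet)
    key = []
    pos = 0
    while pos < len(word):
        # Prefer the longest letter that matches here, so "cs" beats "c".
        for size in range(longest, 0, -1):
            chunk = word[pos:pos + size]
            if chunk in rank:
                key.append(rank[chunk])
                pos += len(chunk)
                break
        else:
            # Unknown characters sort after the alphabet, by code point.
            key.append(len(alphabet) + ord(word[pos]))
            pos += 1
    return tuple(key)
```

With this, sorting ["öl", "zon", "ost"] gives ["ost", "zon", "öl"] under "sv" (ö is the last letter) but ["ost", "öl", "zon"] under "tr" (ö comes right after o). Under "hu", ["cukor", "csak", "cérna"] sorts as ["cérna", "cukor", "csak"], because "cs" is a separate letter after "c". A single sort key per page cannot express all of these orders at once.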
Comment 5 Gerrit Notification Bot 2014-05-09 15:20:04 UTC
Change 132437 had a related patch set uploaded by Reedy:
(bug 28397) New magic word "{{DEFAULTCOLLATION:}}" to specify the default collation to use for a category

https://gerrit.wikimedia.org/r/132437
Comment 6 Gerrit Notification Bot 2014-05-09 15:20:43 UTC
Change 27526 restored by Reedy:
(bug 28397) New magic word "{{DEFAULTCOLLATION:}}" to specify the default collation to use for a category

https://gerrit.wikimedia.org/r/27526
Comment 7 Gerrit Notification Bot 2014-05-09 15:20:57 UTC
Change 132437 abandoned by Reedy:
(bug 28397) New magic word "{{DEFAULTCOLLATION:}}" to specify the default collation to use for a category

https://gerrit.wikimedia.org/r/132437
Comment 8 Gerrit Notification Bot 2014-05-09 21:49:31 UTC
Change 27526 had a related patch set uploaded by Bartosz Dziewoński:
New magic word "{{DEFAULTCOLLATION:}}" to specify the default collation to use for a category

https://gerrit.wikimedia.org/r/27526
