Last modified: 2014-10-11 11:58:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from links that display bug reports and their history, links might be broken. See T46667, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 44667 - Review, merge and deploy chinese-collation branch
Status: NEW
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Liangent
Depends on:
Blocks: 28397 30672 30673 30996 35378 69417 70819
Reported: 2013-02-05 06:28 UTC by Tim Starling
Modified: 2014-10-11 11:58 UTC (History)
CC List: 16 users

Description Tim Starling 2013-02-05 06:28:46 UTC
Chinese collation is complex, not least because different Chinese-speaking regions have different customary collations. The KangXi order favoured by Unicode standards is rarely used in any region, except in dictionaries.

Liangent has prepared a core branch which allows multiple category collations to coexist on a single wiki, with selectable sort order on category pages. I helped develop the architecture.

One of the collations considered essential for Chinese wikis is a Latin sort of the pinyin transliteration. We will need to upgrade to ICU 4.8 to support this collation.

This bug tracks the tasks needed for merge of the chinese-collation branch and the deployment of a suitable multi-collation configuration on the Chinese Wikipedia.
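
As a rough illustration of several collations coexisting, the following Python sketch stores one sort key per collation for each category member and picks the order at display time. Everything in it is hypothetical: the collation names, the tiny `PINYIN` table and the function names are made up for the example and are not taken from the chinese-collation branch, which relies on ICU rather than a hand-written table.

```python
# Hypothetical sketch: names and data are illustrative, not from the branch.

# Tiny hand-written pinyin table; a real deployment would use ICU instead.
PINYIN = {
    "北京": "beijing",
    "成都": "chengdu",
    "广州": "guangzhou",
    "上海": "shanghai",
}


def identity_key(title: str) -> str:
    """Sort by raw code points (the title itself is the key)."""
    return title


def pinyin_key(title: str) -> str:
    """Sort by the Latin spelling of the pinyin transliteration."""
    # Fall back to the title when no transliteration is known.
    return PINYIN.get(title, title)


# Registry of collations that coexist on one wiki.
COLLATIONS = {"identity": identity_key, "pinyin": pinyin_key}


def build_sortkeys(titles):
    """Precompute one sort key per (title, collation) pair."""
    return {
        title: {name: make_key(title) for name, make_key in COLLATIONS.items()}
        for title in titles
    }


def sorted_members(sortkeys, collation):
    """Return titles ordered by the stored key for the chosen collation."""
    return sorted(sortkeys, key=lambda title: sortkeys[title][collation])


if __name__ == "__main__":
    keys = build_sortkeys(["上海", "广州", "北京", "成都"])
    print(sorted_members(keys, "identity"))
    print(sorted_members(keys, "pinyin"))
```

Keeping a key per collation means a category page can switch order without recomputing anything at view time, at the cost of storing more keys per member.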
Comment 1 Liangent 2013-02-09 12:42:43 UTC
I94056ca2
Comment 2 Rob Lanphier 2013-04-15 21:52:59 UTC
Tagging with "design" keyword, since there's a small dropdown that might use a little love.
Comment 3 Greg Grossmeier 2013-06-14 19:41:10 UTC
According to Jared Zimmerman this morning, Pau did a design review of this recently and everything seemed ok (or, was going to be ok, or similar). Is that right, Liangent?
Comment 4 Liangent 2013-06-15 05:37:12 UTC
(In reply to comment #3)
> According to Jared Zimmerman this morning, Pau did a design review of this
> recently and everything seemed ok (or, was going to be ok, or similar). Is that
> right, Liangent?

Right
Comment 5 Greg Grossmeier 2013-06-15 23:39:40 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > According to Jared Zimmerman this morning, Pau did a design review of this
> > recently and everything seemed ok (or, was going to be ok, or similar). Is that
> > right, Liangent?
> 
> Right

Great, thanks for confirming, Liangent.

I'd like this bug to have the needed next steps in it; could you tell me what you think they are from your end, Liangent (and anyone else who sees this bug mail)?

Would love to know what needs to be prioritized.
Comment 6 Sam Reed (reedy) 2013-06-27 21:53:57 UTC
We have libicu48 version 4.8.1.1-3, so presumably nothing else needs doing to that extent.

(In reply to comment #5)
> Would love to know what needs to be prioritized.

The merge commit needs the various merge conflicts fixing, and also rebasing again onto core, as it's at least 10 weeks old, if not 18 weeks or so
Comment 7 Greg Grossmeier 2013-06-27 23:20:49 UTC
(In reply to comment #6)
> We have libicu48 version 4.8.1.1-3, so presumably nothing else needs doing to
> that extent.
> 
> (In reply to comment #5)
> > Would love to know what needs to be prioritized.
> 
> The merge commit needs the various merge conflicts fixing, and also rebasing
> again onto core, as it's at least 10 weeks old, if not 18 weeks or so

Alright, then I'm assuming Pau is no longer actively working on this, so assigning to Liangent but that's only because they own the merge. Would love someone else on this CC: list to take a look at that merge.
Comment 8 Bawolff (Brian Wolff) 2013-07-11 16:35:44 UTC
I took a quick look through the code (I was using it to make a prototype of a feature idea: http://tools.wmflabs.org/bawolff/whichisbetter ). It works well. A couple of things I noticed, though (note that I did not read the code in depth):

*In Title::moveTo, the code seems to assume the cl_sortkey_prefix is the same for all collations. I do not think this is the case.
*When running update.php, the script runs updateCollation.php before doing the schema changes from your code, instead of after. (Really, I think it should run after all extension schema changes, in case someone abuses the Collation framework to make a collation that depends on a schema change.) Arguably this issue was here before your code.
*From a language perspective, I think using the phrase "Sorting method" instead of "collation" for the message 'category-collation' would be better and less jargony. Some of the collation names ( 'Identity' ) are a bit jargony as well, but I guess that can't really be helped. We can't exactly use the word 'alphabetical', since they're all alphabetical.
*On category pages, <label for="mw-collation-select">Sorting method:</label> should have an id or class attribute so people could style it easily. Additionally, I think it might look better with the CSS vertical-align: bottom.

I'd submit Gerrit patches for some of these, but I'm unclear on how to do that, or whether I should, given that I don't really understand how long-term feature branches in Gerrit are supposed to work. Should I just submit new patches to the chinese-collation branch?
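
The update-ordering point in the list above can be pictured as a two-phase runner in which anything that depends on the final schema runs only after every schema change. This is a minimal Python sketch under that assumption; the step names and the `SCHEMA` / `POST_SCHEMA` phases are hypothetical and do not mirror how update.php is actually structured.

```python
# Hypothetical sketch: step names and phases are illustrative only and do not
# mirror MediaWiki's actual updater.
SCHEMA = 0       # core and extension schema changes
POST_SCHEMA = 1  # maintenance that relies on the final schema


def run_updates(steps):
    """Run (phase, name, action) steps, all schema changes first.

    sorted() is stable, so registration order is kept within a phase.
    """
    executed = []
    for phase, name, action in sorted(steps, key=lambda step: step[0]):
        action()
        executed.append(name)
    return executed


if __name__ == "__main__":
    steps = [
        (POST_SCHEMA, "recompute collation sort keys", lambda: None),
        (SCHEMA, "core schema change", lambda: None),
        (SCHEMA, "extension schema change", lambda: None),
    ]
    print(run_updates(steps))
```

Even though the collation step is registered first here, it runs last, which is the ordering suggested for a collation that might depend on a schema change.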
Comment 9 Bawolff (Brian Wolff) 2013-07-11 17:13:01 UTC
> *On category pages, <label for="mw-collation-select">Sorting method:</label>
> should have an id or class attribute so people could style it easily.

Actually, I guess it's pretty easy to style via #mw-collation-selector label.
Comment 10 Greg Grossmeier 2013-07-11 17:19:02 UTC
For completeness's sake: Liangent will be meeting up with the WMF Language team at Wikimania this year to go over what needs to be done/etc for this to go out. Please feel free to continue working on this before then, but there is no set deploy target date until after Wikimania.
Comment 11 Siebrand Mazeland 2013-07-11 18:59:17 UTC
Pasting feedback that was given by Pau Giner on 2013-05-24 after a request from Tim Starling.

"I made a review of the UI and provided some design ideas to solve potential issues. I'm not familiar with Chinese nor Chinese collation methods, so feel free to correct me if I made any wrong assumption in my analysis:

* The use of a technical linguistic term such as "collation", although correct, may be confusing to regular users. "Sorting" seems a more common term that will allow us to unify sorting-related options (more on this later).

* The control breaks the heading layout in the current position. The line of the heading appears broken. To avoid this, I would move the selector below the heading line since the action affects the elements below the header.

* Current ordering is communicated by the list itself, so we may consider making the selector more compact (e.g., using an icon with a clarification tooltip).

* Not sure if this was considered, but if there is a collation method that is most commonly used, it should be the default. It may also be interesting to remember which collation method the user selected last and use it as the default for that user.

I know that the specific purpose of the extension is to support Chinese collation, but my concern is that when combining many different extensions the resulting UI gets inconsistently crowded, making it hard to access the great functionality provided by each individual extension.

To avoid this, I would propose to create a unified entry point for sorting-like functionality that can be used consistently at different parts of the UI. I made a quick mockup to illustrate the idea: http://i.imgur.com/1uZD8nF.png"
Comment 12 Bawolff (Brian Wolff) 2013-07-11 20:59:18 UTC
>* The control breaks the heading layout in the current position. The line of
>the heading appears broken. To avoid this, I would move the selector below the
>heading line since the action affects the elements below the header.

Just as a note, it affects the elements below the next 3 headings, not just the heading it is beside.


>Not sure if this was considered, but if there is a collation method that is
>most commonly used it should be the default. It may also be interesting to
>remember which is the collation method the user selected last and use it as the
>default for the user.

Given that Liangent added a user preference for the preferred sorting method, it seems like a good idea for a change of sorting method on a category page to also update that preference. My only possible worry is that, when {{DEFAULTCOLLATION:...}} is specified, the interaction between remembering the user's last choice and the collation being overridden on a per-page basis might be unclear to the user. But I think that's a minor concern.
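
To make the interaction concrete, here is a minimal Python sketch of one possible precedence between the selector, a per-page default and the saved preference. The precedence itself is an assumption made for illustration, not the behaviour of the branch, and the function name, parameter names and collation names ("stroke", "pinyin", "identity") are hypothetical.

```python
# Hypothetical sketch: the precedence below is an assumption, not the
# branch's actual behaviour.
from typing import Optional, Set


def resolve_collation(
    selected: Optional[str],
    page_default: Optional[str],
    user_preference: Optional[str],
    site_default: str,
    available: Set[str],
) -> str:
    """Pick the collation for one category page view.

    Assumed precedence: explicit choice in the selector, then the per-page
    default, then the saved user preference, then the site default.
    """
    for candidate in (selected, page_default, user_preference):
        # Skip unset values and names that are not configured on the wiki.
        if candidate is not None and candidate in available:
            return candidate
    return site_default


if __name__ == "__main__":
    available = {"pinyin", "stroke", "identity"}
    # A user who last chose "pinyin" visits a page with a "stroke" default.
    print(resolve_collation(None, "stroke", "pinyin", "identity", available))
```

Under this assumed ordering, a user whose remembered choice is "pinyin" still sees a page with a per-page default in that page's order, which is the kind of surprise described above.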
Comment 13 Bartosz Dziewoński 2013-09-27 08:22:43 UTC
What's the progress on this?
Comment 15 Bartosz Dziewoński 2014-05-09 15:51:59 UTC
Reedy has just refurbished https://gerrit.wikimedia.org/r/#/c/87288/ which cherry-picks the first commit from the branch onto master, and I think he's working on the following patches.

Tim, any chance of technical/performance review from you? :)
