Last modified: 2013-05-14 14:29:43 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T50097, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 48097 - Request Sorting Thai Wikipedia (and sister projects) with UCA
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: wmf-deployment
Hardware: All All
Importance: Normal enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Depends on:
Blocks: 30996
Reported: 2013-05-05 08:29 UTC by Octra Bond
Modified: 2013-05-14 14:29 UTC (History)
6 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
sorting test on my localhost (plz don't care the category name) (4.01 KB, image/png)
2013-05-05 08:56 UTC, Octra Bond
prior incorrect sorting result (3.85 KB, image/png)
2013-05-05 09:01 UTC, Octra Bond

Description Octra Bond 2013-05-05 08:29:27 UTC
Category pages on Thai Wikipedia are currently sorted incorrectly:
pages are grouped only by their very first character.
However, Thai has leading vowels (เ แ โ ใ ไ) that are written before the consonant they belong to,
so a word such as "เดียว" should be listed under "ด", not under "เ".

In addition, tone marks are treated as primary sort keys,
which scrambles the order:
for example, "ด้วย" comes after "ดิน", which is incorrect.

Some time ago, we adopted a workaround using DEFAULTSORT.
However, it has to be added to every page we create,
which is a burden on our users.

Since MediaWiki 1.17, there has been a proper way to sort these pages:
$wgCategoryCollation was introduced to solve this problem.

UCA (the Unicode Collation Algorithm) collates Thai characters well (and other scripts too):
leading vowels are sorted after their consonants,
and tone marks are sorted at the secondary level.
http://www.unicode.org/charts/uca/

I would like to request that this setting be added to Thai Wikipedia (and the other Thai sister projects):

$wgCategoryCollation = 'uca-default';

This will solve a long-standing issue for us.
(Please don't forget to run the update script.)

I have tried this on my localhost and it works well:
"ด้วย" → "ดิน" → "เดียว" are correctly sorted under "ด" without DEFAULTSORT.
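To illustrate the two rules described above (leading vowels sort after their consonant, tone marks count only at a secondary level), here is a rough Python sketch. It is not the UCA and not what MediaWiki or ICU actually run; the names thai_sort_key, LEADING_VOWELS and TONE_MARKS are made up for this example, and it simply relies on code point order inside the Thai block as an approximation of dictionary order.

```python
# Simplified illustration only: NOT the real UCA / ICU implementation.
# All names here (LEADING_VOWELS, TONE_MARKS, thai_sort_key) are hypothetical.

LEADING_VOWELS = set("เแโใไ")                    # U+0E40..U+0E44
TONE_MARKS = set("\u0e48\u0e49\u0e4a\u0e4b")      # mai ek, mai tho, mai tri, mai chattawa


def thai_sort_key(word):
    """Return a (primary, secondary) key approximating the two rules above."""
    chars = list(word)
    i = 0
    # Rule 1: move each leading vowel behind the character that follows it,
    # so "เดียว" is compared as if it started with "ด".
    while i < len(chars) - 1:
        if chars[i] in LEADING_VOWELS:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
            i += 2
        else:
            i += 1
    # Rule 2: tone marks are dropped from the primary key and only
    # break ties through the secondary key.
    primary = "".join(c for c in chars if c not in TONE_MARKS)
    secondary = "".join(c for c in chars if c in TONE_MARKS)
    return (primary, secondary)


words = ["เดียว", "ดิน", "ด้วย"]
by_code_point = sorted(words)                 # tone mark and leading vowel compared as ordinary letters
by_sketch = sorted(words, key=thai_sort_key)  # order requested in this report
print(by_code_point)
print(by_sketch)
```

With these three words, plain code point sorting puts "ดิน" before "ด้วย", while the sketch gives "ด้วย", "ดิน", "เดียว", the same order as in the localhost test.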
Comment 1 Octra Bond 2013-05-05 08:56:41 UTC
Created attachment 12261 [details]
sorting test on my localhost (plz don't care the category name)
Comment 2 Octra Bond 2013-05-05 09:01:47 UTC
Created attachment 12262 [details]
prior incorrect sorting result
Comment 3 Gerrit Notification Bot 2013-05-14 14:07:16 UTC
Related URL: https://gerrit.wikimedia.org/r/63661 (Gerrit Change I850aac8ca4da89b5e3c27178b26c3d98da02b235)
Comment 4 Sam Reed (reedy) 2013-05-14 14:29:43 UTC
All done
