Last modified: 2014-11-17 10:35:35 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T4867, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 2867 - Interwiki lists sort in phonetic, site-defined order
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Interface
Version: unspecified
Hardware: All All
Importance: Low enhancement with 62 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
http://meta.wikimedia.org/wiki/Interw...
Duplicates: 15990 28156
Depends on:
Blocks: 40760 41348
 
Reported: 2005-07-15 07:07 UTC by Yuri Astrakhan
Modified: 2014-11-17 10:35 UTC (History)
16 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Patch to allow sorting customization (per language) (11.97 KB, patch)
2005-07-24 01:12 UTC, Yuri Astrakhan

Description Yuri Astrakhan 2005-07-15 07:07:06 UTC
Many site admins have talked about making the bot framework order interwiki
lists according to each site's requirements. I think this should be a feature
of MediaWiki instead. A site admin should be able to specify that, for the
Scandinavian site that she manages, all other Scandinavian links are
listed first, with the remaining links ordered alphabetically. (See
http://en.wikipedia.org/wiki/User_talk:Yurik#Nynorsk_Wikipedia )

On top of the default configuration, the user should be able to override those
settings. For example, when I browse the Russian Wikipedia, I would like English
and French listed first.
Comment 1 Yuri Astrakhan 2005-07-24 01:12:38 UTC
Created attachment 740 [details]
Patch to allow sorting customization (per language)

Attached are the proposed changes to skin.php, language.php, and names.php. With
this patch (tested on my own MediaWiki installation), each language can be
customized to make any languages appear first in the interwiki list.
Comment 2 Olve 2005-07-24 04:19:19 UTC
Great work, Yuri!

Will the interwiki links chosen to be on top be repeated in the general list, or are they “used up”? Is that (to repeat or not to
repeat) configurable in your solution? In either case, this seems to be a solid step forward!
-Olve ( http://nn.wikipedia.org/wiki/Brukar:Olve )
Comment 3 Yuri Astrakhan 2005-07-24 08:20:54 UTC
First, why this patch was created (from an IRC discussion):

1) interwikis should be sorted in a "phonetic" order - this way the Finnish
language, Suomi (code 'fi'), goes after Russian ('ru')
2) some languages are closely related - like Slavic or Scandinavian. They want
their sister sites to be listed first
3) some sites may prefer to have English as their first choice simply because of
it being the lingua franca
4) this opens the path for user-level sorting - I, as a user, would like to keep
languages that I know at the top, instead of looking through 50 different links


The iw links are not repeated - if you want 'en' to appear at the top of the
list, it will not be included a second time in the regular phonetic order.
Otherwise, imagine a page that has only a few common links: all of a sudden
they would all be duplicated.
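
The behaviour described above (preferred languages first, the rest in "phonetic" order, nothing listed twice) can be sketched roughly as follows. This is only an illustration in Python, not the attached patch: the function name `order_interwiki` and the `PHONETIC_KEY` table of romanized names are assumptions made up for the example.

```python
# Hypothetical sort keys: a romanized form of each language's own name.
# These values are illustrative only and are not taken from MediaWiki.
PHONETIC_KEY = {
    "en": "english",
    "fi": "suomi",
    "fr": "francais",
    "ru": "russkij",
    "sv": "svenska",
}


def order_interwiki(codes, preferred=()):
    """Return language codes with preferred ones first and no duplicates."""
    present = set(codes)
    # Preferred codes keep their configured order, but only when the page
    # actually links to them; dict.fromkeys drops repeated entries.
    head = list(dict.fromkeys(c for c in preferred if c in present))
    used = set(head)
    # Everything else follows, sorted by the "phonetic" key. Codes missing
    # from the table fall back to sorting by the code itself.
    tail = sorted(
        (c for c in present if c not in used),
        key=lambda c: PHONETIC_KEY.get(c, c),
    )
    return head + tail


print(order_interwiki(["fi", "ru", "en", "fr", "sv"], preferred=["sv", "en"]))
```

With `sv` and `en` preferred, the remaining links come out as `fr`, `ru`, `fi`, so Suomi lands after Russian as described in point 1.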
Comment 4 Olve 2005-07-24 08:30:25 UTC
(In reply to comment #3)
OK -- the question arose locally on nn:, so I had to ask... :)
It seems to me that the advantages of this new solution
outweigh the disadvantages by far, so I am all for it.
-- Olve
Comment 5 Yuri Astrakhan 2005-07-29 18:21:39 UTC
The patch should be fixed to allow site administrators to change local ordering
without asking devs to change the language_XX.php file, similar to the way
localization and other items can be done.
Comment 6 Olve 2005-10-16 20:50:06 UTC
Twenty votes for this bug and no reaction from any developer except the one who posted the bug... (Thank you, Yuri!)  
I hope that something can start happening here soon!
Comment 7 Roland Bär 2005-11-16 08:46:30 UTC
I think they are ignoring it because they probably think it would hurt
performance.
Many interwiki links are submitted using the Python Wikipedia Robot Framework
http://sourceforge.net/projects/pywikipediabot/
We should probably fix this there, in interwiki.py. Also, we could create a
machine-readable convention page, Wikipedia:interwikiconv, to describe
those conventions.
Comment 8 Minh Nguyễn 2005-11-16 19:17:40 UTC
Two popular conventions are detailed in the comma-separated string format at
[[m:Interwiki sorting order]].
Comment 9 Yuri Astrakhan 2005-11-17 21:20:01 UTC
1) Performance - sorting a small array of strings is not very expensive compared with database and
bandwidth limitations. With the current state of CPU power, it's really negligible. Other processing,
such as the many regular expressions and the date parsing needed to change time zones, is substantially
more CPU intensive.
2) Interwiki bot - there is already code in there that sorts interwikis. The missing part is the ability for
individual site admins to easily alter the list.
3) Implementing this in MediaWiki will allow per-user customizations (I would like to see the languages I know
first!)
4) I think the reason for no activity is two-fold:
  A) The existing patch does not allow for dynamic customizations using the Special:AllMessages page (that
needs to be rewritten).
  B) (speculating) The English version tends to have much higher priority than other sites, and this is clearly an
internationalization issue.
Comment 10 Jon Harald Søby 2006-03-18 10:54:35 UTC
So, will this patch be added? It would be extremely useful (and I know a lot of
Wikipedias are eager to get this).
Comment 11 Yuri Astrakhan 2006-03-18 16:27:00 UTC
It's too low on the developers' priority list :(, so more complaining has to be done to get the vote count up, as well as asking them about
it on the IRC channel at irc://irc.freenode.net/mediawiki .
Comment 12 Rob Church 2006-03-18 16:33:53 UTC
Alright, hold up. First of all; yes, we have priorities. No, we don't ignore
patches. I'm going to review it right now, in fact. Performance MIGHT be an
issue, as might the effects on caching, so those will need to be taken into
consideration. Plus a user above mentioned something about localisation, so I
need to check that's been done properly. But don't make blanket, "they don't
care" statements, because it's patently false.
Comment 13 Yuri Astrakhan 2006-03-18 16:47:12 UTC
Rob, no one is accusing developers of not caring: the fact that Wikipedia is still up and improving fast is a testament to
that! What is at issue is bringing what's important to the various users of Wikipedia to the developers' attention, to show which
features are of higher value and which might be delayed. That is why vote counts, and telling site admins about a possible
solution that may (or may not) be helpful, are important.

The localization I mentioned above is not about localization as such, but about changing settings through the Special:AllMessages
page rather than by modifying the language.php file (the way the patch currently does it). The code should be changed to make
the Special:AllMessages method truly usable.

Thank you for your help with the issue!
--Yuri
Comment 14 Brion Vibber 2006-03-18 22:05:16 UTC
Didn't I wontfix this already? Order should be consistent across all languages to 
aid in navigation, *not* "site-specific".
Comment 15 Jorunn 2006-03-18 22:30:38 UTC
Any list of 200+ language names in 200+ different languages is *not* aiding 
navigation. The list should be user configurable. 
Comment 16 Olve 2006-03-19 00:56:37 UTC
Brion:

Yes, I do believe you "wontfix"ed this one. Errare humanum est... ;-)

As many people have already pointed out here, it is important for many
wikipedias (especially in smaller and/or localised languages) that the display
order is one that draws out those languages that are actually of use to the
"average" reader of that language.

An additional reason for having this system is that it will make sorting
independent of code order on the edit page. In practice, this means:

# The input order can be strictly alphabetical by code. This is a vast advantage
for interwiki fixers, since they don't have to know the specific policy of each
wikipedia.
# The displayed interwikis can be arranged according to local needs:
## Alphabetically by language name. This order is convenient for wikipedias in
the largest international languages, such as the English, Spanish, Portuguese,
Chinese, French and Arabic ones.
## Local or related languages first, then others alphabetically by language
name. This is extremely practical for closely-related language clusters like
Swedish/Danish/Bokmål/Nynorsk, Serbian/Croatian/Bosnian, Czech/Slovakian,
Hindi/Urdu/Panjabi, as well as for many of the smaller languages within,
''e.g.'', the Germanic and Romance language groups. 
## The locally best-understood international language/s first (''e.g.''
English, others; French, Spanish, others; or French, Portuguese, others). This
order is particularly helpful for wikipedias in languages which are in their
initial stage of building a (or any) encyclopaedia and which have stronger links
to this/these international language/s than to local/related languages.

Please take this request seriously (no "wontfix", in other words) even though
the matter seems unimportant to the wikipedias you work on. There are plenty of
other wikipedias that would benefit greatly from such a project!

Respectfully,

Olve
Comment 17 Bjarte Sorensen 2006-03-19 10:57:04 UTC
Seconded,

Bjarte Sorensen
Comment 18 Trond Trosterud 2006-03-19 11:52:25 UTC
Yes, it is important (I am active on the nn, se and fi wikipedias, and the average reader of these wikipedias is capable of reading 5 other 
wikipedias. We thus want these neighbouring ones to be listed first). Trond.
Comment 19 stajohns 2006-03-23 07:59:53 UTC
Given the significant growth of the smaller languages, the length of the interlanguage listings is fast becoming less
user-friendly. The intended 'at-a-glance' functionality disappears when I have to browse through a lengthy list
of languages to find one I know I can read and understand.

Best regards
Ståle Johnsen
Comment 20 Arp Kruithof 2006-04-27 00:24:16 UTC
By pure coincidence, a discussion on nl was (re)started just yesterday about the
weird sort orders by language name due to using the language codes for sorting.
It seems to me that it would be a very useful feature to have, especially if
configurable at user level so users can bump their preferred languages up. Of
course, if that would cause significant performance/caching problems, it would
still be a very neat first step to have it configurable per wikipedia.

Cheers
Arp Kruithof
Comment 21 Ulf Lunde 2006-04-27 13:04:51 UTC
Here's a voice in favor of the change
from me, too.

Even a per wikipedia generic customizability of the interwiki
order would be a *very* useful feature, and one which I hope
will be given top priority now that all the more serious
shortcomings (that I know of) have been fixed.

If, in addition, each user could (in some simple way) choose
to hide languages which she is not interested in, this could
be used to make the lists shorter and thus more user friendly.

Both "levels" of change should be weighed against the
performance issue; a change which would degrade the wiki's
speed, becomes less desirable.

Verdlanco
Comment 22 Yonatan Horan 2007-04-07 19:23:59 UTC
"Didn't I wontfix this already? Order should be consistent across all languages to
aid in navigation, *not* "site-specific"."

There's already more than one project that has a different order from the
default one. For example, the Hebrew Wikipedia has English come first and
afterwards it goes by order of the language prefix (Finnish goes after 'en' and
before 'it' (under f) rather than right before 'sv' (under s, for Suomi)). Maybe it
should be unified across all wikis, but if so, the bots should be modified for
this, and if not, the option should probably be included in the software.
Comment 23 Raimond Spekking 2008-10-21 10:16:37 UTC
*** Bug 15990 has been marked as a duplicate of this bug. ***
Comment 24 Amir E. Aharoni 2008-12-09 09:31:26 UTC
The most logical default sorting is not phonetic, but Unicode.

Let me explain.

It doesn't actually make too much sense that Finnish (Suomi) would come after Russian (Русский). It does make a little sense, because Cyrillic is somewhat related to Latin - both have a letter for "R", although it looks different. But what if there was a language which is written in the Cyrillic alphabet, and its name begins with a "Ж"? It is transliterated into Latin as "ZH", but a speaker of that language would find it odd if it appeared at the end of the list, because in Cyrillic this letter is close to the beginning. So sorting Русский near the Latin R's happens to make some sense, but it is a lucky coincidence.

This problem occurs with Yiddish (יידיש): It is sorted near the end. Why? Because Y is near the end of the Latin alphabet? But the Hebrew letter י is near the beginning of the Hebrew alphabet in which Yiddish is written.

It makes even less sense that Hebrew (עברית) would come after Italian. The suggestion is that Hebrew would come after Italian because a simple non-scientific transliteration of עברית is "Ivrit". The reality, however, is that Hebrew speakers don't think that the first letter of their language's name is an "I", but an "ע" (Ayin), which has no analog in the Latin alphabet; hence, there is no clever way to put עברית in a "phonetic order".

These are just a few of the problems with languages with which i am familiar. I don't know, for example, how convenient it is for a Japanese speaker to find his language at N (for Nihongo, i presume).

The only solution to this is to make the default language names adhere to the order of the scripts in Unicode. This means that language names will be grouped by script: Latin (French, Ban-lam-gu, Estonian), Cyrillic (Russian, Mongol, Sakha), Arabic (Arabic, Farsi, Urdu), Hebrew (Hebrew, Yiddish), Chinese (Mandarin, Cantonese, Yue), Devanagari (Hindi, Nepali) etc. These groups will appear in the order in which they appear in the Unicode standard. It is technocratic, but it is the most neutral way i can think of. Certainly better than putting עברית under I, which is not useful for Hebrew speakers.
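
As a rough illustration of the script-grouped order proposed above, here is a minimal Python sketch that sorts native language names by plain Unicode code point. Plain string comparison in Python compares code points, which happens to group names by script block. The `NATIVE_NAMES` table and the function name are assumptions made for the example, and this is not a full collation algorithm. Note that with raw code points the Hebrew block sorts before the Arabic block.

```python
# Illustrative native names keyed by language code (assumed sample data).
NATIVE_NAMES = {
    "ar": "العربية",
    "et": "Eesti",
    "fr": "Français",
    "he": "עברית",
    "hi": "हिन्दी",
    "ja": "日本語",
    "mn": "Монгол",
    "ru": "Русский",
    "yi": "יידיש",
}


def sort_by_code_point(codes):
    """Order language codes by the code points of their native names."""
    # Python compares str values code point by code point, so Latin names
    # come first, then Cyrillic, Hebrew, Arabic, Devanagari and CJK.
    # Codes without a known native name sort by the code itself.
    return sorted(codes, key=lambda code: NATIVE_NAMES.get(code, code))


for code in sort_by_code_point(NATIVE_NAMES):
    print(code, NATIVE_NAMES[code])
```

Within one script the order is still only approximate (for example, accented Latin letters sort after "z"), so a real implementation would probably want a proper collation library on top of the script grouping.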

And for the record - i support the option to have a language project define languages that will appear at the top. De facto, for Norwegian it's Swedish, Danish et al., for Hungarian and Hebrew it is English etc., and nothing is wrong about it. It makes Wikipedia convenient.
Comment 25 Max Semenik 2009-12-31 18:32:44 UTC
Removing need-review, the patch is out of date. Also, it could be beneficial if:
1) Local communities were able to define sort order themselves, via a system message.
2) The sorter attempted to find a reasonable place for unknown langcodes instead of throwing them to the bottom in undefined order.

The problem is pretty important, by the way.
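
One way the two points above could fit together, sketched in Python under stated assumptions: the sort order is read from a comma-separated string (the format is assumed here, loosely following the conventions mentioned in comment 8), and a code that is missing from the list is placed right after its base code when one exists (for example `zh-yue` after `zh`), otherwise at the end in a deterministic alphabetical order. The function names are hypothetical and this is not the code from any patch.

```python
def parse_order(message):
    """Split a comma-separated sort-order string into language codes."""
    return [code.strip() for code in message.split(",") if code.strip()]


def sort_links(codes, message):
    """Sort language codes according to a site-defined order string."""
    rank = {code: index for index, code in enumerate(parse_order(message))}

    def key(code):
        if code in rank:
            # Known code: use its configured position.
            return (rank[code], 0, code)
        base = code.split("-")[0]
        if base in rank:
            # Unknown variant: place it directly after its base language.
            return (rank[base], 1, code)
        # Completely unknown: goes last, but in alphabetical order
        # rather than in an undefined one.
        return (len(rank), 0, code)

    return sorted(set(codes), key=key)


print(sort_links(["zh-yue", "en", "tlh", "nn", "zh", "aa"], "nn, sv, da, en, zh"))
```

Keeping the whole rule inside one sort key means the site-defined string stays the only thing a local community would have to edit.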
Comment 26 Olve 2010-11-08 11:15:32 UTC
2006-03-18 16:33:53 UTC, Rob Church wrote:

> No, we don't ignore patches.
> I'm going to review it right now, in fact.

So -- any result yet? ;-)
Comment 27 Chad H. 2011-03-21 15:59:59 UTC
*** Bug 28156 has been marked as a duplicate of this bug. ***
Comment 28 Sumana Harihareswara 2011-11-10 02:19:38 UTC
I'm adding the "reviewed" keyword since the patch has been reviewed and, sadly, the passage of time has obsoleted it, per comment 25.  Thank you for the bug report and the patch, Yuri.

I'm also marking this for the internationalization/localization team to look at, by adding the "i18n" keyword.
Comment 29 Shaurabh Bharti 2012-02-02 22:11:13 UTC
Interestingly, we have similar suggestions for Indian Wikipedias (Hindi, Bengali, Kannada, Telugu etc.). It would be nice to list Indian languages first on Indian wikis, with two options: 1) in general, 2) user-specific. It increases the usability and popularity of, and access to, smaller Wikipedias.
Comment 30 Siebrand Mazeland 2012-02-03 09:55:33 UTC
Adding Denny on CC, as this may be an issue that would be very prominent for the adoption/acceptance of Wikidata based interwiki links.
Comment 31 Sumana Harihareswara 2012-02-06 00:38:37 UTC
Mailing list discussion on wikimediaindia-l: http://lists.wikimedia.org/pipermail/wikimediaindia-l/2012-February/thread.html#6755 starting with http://lists.wikimedia.org/pipermail/wikimediaindia-l/2012-February/006755.html  

Indian community member's request: "automatically sort all the languages according to the language preferences" since "For Malayalam, a list starting from English, Hindi, Tamil, Kannada, Sanskrit, etc. [would be more useful] to many users than providing a list starting [with obscure] languages."  (original: http://lists.wikimedia.org/pipermail/wikimediaindia-l/2012-February/006768.html )
Comment 32 Tim Landscheidt 2012-09-18 20:25:55 UTC
I have submitted an updated patch as Gerrit change #24211.  This allows the sorting order to be set per wiki by the system message "interwiki config-sorting order".  I didn't implement the whole shebang at [[m:Interwiki sorting order]] because I think that the patch will be more than adequate in 99.9 % of all cases.
Comment 33 taweethaも 2013-05-24 06:05:00 UTC
This issue is also raised on Thai Wikipedia.
http://th.wikipedia.org/wiki/วิกิพีเดีย:ศาลาชุมชน/อภิปราย/เรียงลิงก์ข้ามภาษา
I support site-defined/user-defined order.
Comment 34 taweethaも 2013-05-24 06:41:45 UTC
There are additional comments in Thai at the link above that may benefit this thread.

The sorting order for non-registered users may also be defined by the user's browser settings (e.g. language), cookies, or IP address (=location).
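
For the browser-language idea, a small Python sketch of how a preferred-language list might be derived from an HTTP `Accept-Language` header value. The function name is hypothetical and the parsing is deliberately simplified (it only understands `q=` weights and reduces tags like `en-US` to `en`). The resulting list could then be used as the "listed first" languages.

```python
def preferred_from_accept_language(header):
    """Return base language codes from an Accept-Language value, best first."""
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        params = params.strip()
        quality = 1.0  # A tag without a q= parameter has full weight.
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # Unreadable weight: ignore this entry.
        code = tag.strip().lower().split("-")[0]
        if code and code != "*" and quality > 0:
            # Sort by descending weight, then by position in the header.
            weighted.append((-quality, position, code))
    ordered = []
    for _, _, code in sorted(weighted):
        if code not in ordered:
            ordered.append(code)
    return ordered


print(preferred_from_accept_language("th,en-US;q=0.8,en;q=0.6,fr;q=0.9"))
```

Cookies or IP-based location would need separate handling; this only covers the browser language setting.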
Comment 35 Andre Klapper 2013-07-10 13:15:16 UTC
Wondering what is needed to get that rotting patch on its way again...

The internationalization related bits work as expected, so it seems to be more about using that for a customized sort.
Comment 36 Gerrit Notification Bot 2014-03-28 23:23:03 UTC
Change 24211 abandoned by Siebrand:
(bug 2867) Sort interlanguage links.

Reason:
I'm abandoning this as this change hasn't had any love in a long time. There are open comments and it doesn't merge any more. Can be restored if author wants to work on it again.

There's a ULS compact links beta feature now that may replace this.

https://gerrit.wikimedia.org/r/24211
