Last modified: 2014-11-17 10:35:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T5665, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 3665 - Auto-detect interface language for anonymous users
Status: REOPENED
Product: Wikimedia
Classification: Unclassified
Component: Language setup
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 12 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
UniversalLanguageSelector-fixed
Keywords: i18n, patch, patch-need-review
Duplicates: 7761 26506
Depends on: uls-deployment
Blocks:
Reported: 2005-10-09 17:38 UTC by Daniel Kinzler
Modified: 2014-11-17 10:35 UTC (History)
13 users (show)


Attachments
patch against HEAD for Defines.php, DefaultSettings.php, OutputPage.php, and User.php (5.44 KB, patch)
2005-10-09 17:39 UTC, Daniel Kinzler

Description Daniel Kinzler 2005-10-09 17:38:34 UTC
I propose to include a feature that auto-detects the interface language for
anonymous users. This would be especially helpful for multilingual projects like
Commons. The language can be set in the user interface, but one needs to
understand the default language in order to even create an account or find the
right setting.

The detection has three modes, controlled by $wgDetectLanguage:

* LANG_USE_CONTENT: use the content language for anonymous users, i.e. don't use
auto-detection. This is the default, and shows the same behaviour as without
this patch.
* LANG_PREFER_CONTENT: use the content language if present in the Accept-Language
list. Otherwise, behave like LANG_PREFER_CLIENT.
* LANG_PREFER_CLIENT: use the first language in the Accept-Language list that is
supported by the wiki.
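
As a rough illustration of the three modes (this is not the attached PHP patch), the selection logic could be sketched as follows. The numeric constant values and the function name are made up for the example, and the sketch assumes that LANG_PREFER_CONTENT falls back to the LANG_PREFER_CLIENT behaviour when the content language is not in the list.

```python
# Hypothetical values; only the names come from the description above.
LANG_USE_CONTENT = 0
LANG_PREFER_CONTENT = 1
LANG_PREFER_CLIENT = 2


def pick_interface_language(mode, content_lang, accepted, supported):
    """Return the interface language for an anonymous request.

    accepted  -- language codes from Accept-Language, in header order
    supported -- set of language codes the wiki claims to support
    """
    if mode == LANG_USE_CONTENT:
        # No auto-detection at all: same behaviour as without the patch.
        return content_lang
    if mode == LANG_PREFER_CONTENT and content_lang in accepted:
        return content_lang
    # LANG_PREFER_CLIENT, or LANG_PREFER_CONTENT without a match (assumed):
    # take the first listed language that the wiki supports.
    for code in accepted:
        if code in supported:
            return code
    # Nothing usable in the header: stay with the content language.
    return content_lang
```

With a content language of "en" and a header listing "de" then "fr", LANG_PREFER_CONTENT would pick "de" on a wiki that supports German, while LANG_USE_CONTENT would still return "en".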

Caveats:
* the Accept-Language field is often not configured correctly in the browser. 
* the Accept-Language field would affect caching - the appropriate changes to
the Vary: header are done automatically, but this reduces cache efficiency.
* in order to decide which languages are supported, this relies on
$wgContLang->getLanguageNames(). It does not actually check for the files to
exist, as this would be pretty slow, and the detection is performed for every
page request.
* The languages in Accept-Language are handled as being given in the order of
preference. Any weight modifiers are ignored.
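
To make the last caveat concrete, here is a minimal sketch of reading the header in listed order while discarding the ";q=" weights. The function name is invented for the example and this is not the code from the patch; a parser that honours weights would sort by q-value instead.

```python
def accept_language_order(header):
    """Return language codes in the order listed, dropping any ;q= weight."""
    codes = []
    for item in header.split(","):
        # Keep only the part before ";q=..." and normalise case.
        code = item.split(";", 1)[0].strip().lower()
        # Skip empty items, the "*" wildcard, and repeats.
        if code and code != "*" and code not in codes:
            codes.append(code)
    return codes
```

Note the consequence: a header such as "en;q=0.1, fr;q=0.9" is read as English first, even though the weights say the user prefers French.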

Patch to follow in a minute.
Comment 1 Daniel Kinzler 2005-10-09 17:39:44 UTC
Created attachment 963
patch against HEAD for Defines.php, DefaultSettings.php, OutputPage.php, and User.php
Comment 2 Brion Vibber 2005-10-10 07:31:21 UTC
I worry this is too hostile to caching; without aggressive caching of anon views our 
entire infrastructure will collapse into a little pile of rubble. Adding a Vary on 
accept-language will splinter things a lot -- even where the same language gets selected 
there can be big variance in what's in the header.
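
A simplified model of the splintering concern, assuming a cache that keys each stored page on the URL plus the raw value of every header named in Vary. The header strings below are invented examples, and a real cache such as Squid is more involved than this.

```python
def cache_key(url, vary_headers, request_headers):
    """Build a cache key from the URL plus the raw values of the Vary headers."""
    parts = [url]
    for name in vary_headers:
        parts.append(request_headers.get(name, ""))
    return tuple(parts)


# Three anonymous requests that would all end up with a German interface.
requests = [
    {"Accept-Language": "de"},
    {"Accept-Language": "de-DE,de;q=0.9,en;q=0.8"},
    {"Accept-Language": "de, en-US;q=0.7"},
]
url = "/wiki/Main_Page"

# Without Vary: Accept-Language, all three share one cached copy.
without_vary = {cache_key(url, [], h) for h in requests}
# With it, every distinct header string needs its own copy.
with_vary = {cache_key(url, ["Accept-Language"], h) for h in requests}

print(len(without_vary), len(with_vary))
```

In this model the same page is stored once without the Vary entry and three times with it, although the selected language is identical in all three cases.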
Comment 3 Daniel Kinzler 2005-10-10 11:19:33 UTC
I agree that this might be a problem. It might be quite useful as an option
for smaller, non-Wikimedia projects, though.

The only Wikimedia project I would suggest enabling this for is Commons.
There, the main traffic is image data anyway, which can be cached independently
of user language, even for logged-in users. May be worth a try.

Also, it would be interesting to find out how many different Accept-Language
headers we are actually seeing. Most people never change it by hand, and the
default setup of the popular browsers doesn't vary too much, I guess.

All this being said: one important point would be to provide a localized
interface for creating an account to people who do not speak English. Maybe it
would be enough to have a language selection on the login page that would be
used while creating an account. The default choice could be made based on
Accept-Language. The language selected during account creation should then also
become the language pre-set for the new account. But that would be a separate
feature request, I guess.
Comment 4 Daniel Arnold 2006-03-26 21:36:44 UTC
I also think that the bulk of the data cached for Wikimedia Commons is not the text (web page) data but the images.

First, Wikimedia Commons has never been prominently cited by the media (AFAIK) despite its central role (and its 
potential to be a serious competitor to traditional stock image archives). Luckily, all the media did look at Wikinews 
from day zero on. :p

Second, Commons does not have an intuitive URL like Wikipedia: wikipedia.org vs. commons.wikimedia.org.

Most "outsiders" (people not involved in any Wikimedia wiki) coming to Wikimedia Commons get there via Wikipedia 
image descriptions or Google Images. Neither route is used by the masses.

So I am quite confident that this patch, enabled for Wikimedia Commons only, would help us a lot in several areas 
without harming our caching architecture:

* People coming from local wikis and creating an account would be able to get (the main page and) 
the account creation pages in their native language if you link special:userlogin in the local project (as long as 
single site login is not possible). This would help us reduce prejudices against Wikimedia Commons a lot (sadly 
we currently have to live with these "English only" prejudices against Commons, although we are working hard 
to support many languages in a decent way).

* Outside people who want to reuse Wikimedia Commons content would get information in their local language, which would help us a 
lot, as people often ask about Wikimedia Commons conditions (and are apparently a little confused by the English-only 
interface, quite apart from the problem that the Commons help pages weren't the best ones until recently).

* We could reduce the English bias in Commons. Currently we have the problem that people are softly forced into 
English and thus do not realise that their local language is supported too (you know, changing preferences is not 
done by the masses even after login...). This leads to the problem that these non-English languages get neglected 
somewhat. It is a self-reinforcing effect that mainly supports English only... Many problems in Commons are caused by the 
lack of local language support. I am personally doing some work on supporting help pages in several languages in a 
decent way, but the more people you get from the beginning, the better the result (and I only speak 
German, English and French)...

So I think these patches would be a great thing for Wikimedia Commons.
Comment 5 Rob Church 2006-03-26 21:45:33 UTC
The whole point is that if this great thing kills the site, it's not such a
great thing, is it?
Comment 6 Daniel Arnold 2006-03-28 18:12:10 UTC
Well Rob, could you give us some serious figures? I outlined in some detail, with a rational 
analysis from my perspective, why a negative impact on the servers is not going to happen if this 
patch is applied to the Wikimedia Commons wiki *only*. Of course I could be wrong, so in order to have a 
rational discussion of the issue we need the following figures:

* How much image traffic is caused by anonymous page visits to Wikimedia Commons per month?
* How much (page) text traffic is caused by anonymous page visits to Wikimedia Commons per month?
* How much page visit traffic is caused per month by logged-in users of Wikimedia Commons?

I'd appreciate it if you could find the time to extract these numbers, as I personally invest quite 
some time as well in providing you with decent bug reports in order to reduce the amount of work you need to 
solve those bug reports.
Comment 7 Daniel Kinzler 2006-03-28 18:18:09 UTC
I think the best thing for now would be to add a language selector to the
account creation page, as described above. The selection could be pre-set based
on browser preference, but that's not even necessary. The important thing is
that people joining Commons should not have to know English to create an
account. Whether Commons is internationalized enough yet to be useful to
them without a basic level of English is another question... I at least hope it
will soon be usable without knowing English.

So, shall I open a separate feature request for that?
Comment 8 Rob Church 2006-03-28 18:33:17 UTC
(In reply to comment #7)
> So, shall I open a separate feature request for that?

An optional language selector would be cool.
Comment 9 Daniel Kinzler 2006-04-20 15:23:40 UTC
Side note: this could be combined with bug 5638 to make multilingual projects
like Commons more useful to people not speaking English, even if not logged in.

Perhaps using the browser's language setting is not a good idea - maybe it would
be better to offer a drop-down menu and a "set language" button that would set a
cookie.
Comment 10 Brion Vibber 2006-10-30 14:39:30 UTC
*** Bug 7761 has been marked as a duplicate of this bug. ***
Comment 11 Bawolff (Brian Wolff) 2010-12-30 12:45:22 UTC
*** Bug 26506 has been marked as a duplicate of this bug. ***
Comment 12 dohnp5a1 2010-12-30 19:46:09 UTC
It's very disappointing that nothing has been improved regarding this bug in six years.

Many people around me really hate English and would never contribute to a project that appears entirely in this language without an easy and fast way to switch it.

Maybe the browser language settings are sometimes incorrect. Nevertheless, the
present default setting is almost always incorrect, displaying everything in
English to everyone. Most users also don't know how to change the language
settings in Meta or Commons: it is much easier to get to know one's own browser once
than the particular settings of every website visited.

I suppose the browser setting is by default either English (often
incorrect) or the system or browser localisation language (probably nobody
uses those in a language they cannot understand, so there's no problem). All these possibilities are better than always defaulting to English.
Comment 13 Roan Kattouw 2010-12-31 14:21:20 UTC
(In reply to comment #12)
> It's very disappointing that nothing was improved anent this bug during six
> years.
> 
Although I agree with your argument that English always is not necessarily nice, there's a technical reason we haven't done anything in six years, presented in comment #2: Squid caching would suffer severely. The "pile of rubble" part may not be as accurate today as it was in 2005 (we gained some capacity since then), but please understand this is not an easy change at all. Back in 2005 our servers really did rely on every anonymous user seeing the same thing at the same URL for the servers not to melt down; and I'm not so sure Accept-Language detection for anonymous users would be feasible in 2010/2011 either.

> Many people in my surrounding really hate English and would never contribute to
> a project that appears totally in this language, without an easy and fast way
> to switch it.
> 
"an easy and fast way to switch it" might just be what we *can* do. We could use JavaScript to obtain the user's Accept-Language preferences from the API or something (which wouldn't go through Squid cache, but that's OK: it's just a language list, not an entire wiki page) and use that information to display a link with the native language name (i.e. 'Deutsch' for German, 'Français' for French, etc.) that would then lead to the account creation form in that language or maybe trigger persistent uselang (language selection for anonymous users, basically) if and when we have that.

In fact, I once wrote some proof-of-concept code that obtained the user's Accept-Language settings from the API, stored it in a cookie (to avoid repetitive API requests) and used it to reorder the "In other languages" links in the sidebar. We never ended up using it but it's still lying around somewhere.

tl;dr: Automatically showing wiki pages in the browser language for anonymous users is probably not gonna happen, but a feature offering to switch languages based on the browser language isn't hard to do.
Comment 14 Niklas Laxström 2010-12-31 14:31:59 UTC
If WMF cannot do it, it doesn't mean MediaWiki cannot do it. In fact the LanguageSelector extension does it already. In my opinion it would be nice to pick the automatic language detection code from it to core (disablable for WMF and other cached sites of course).

Let's not mix two issues in this bug.
Comment 15 dohnp5a1 2011-01-02 14:49:06 UTC
Well, should I create a new issue, requesting a language switcher for Commons (ideally accessible on every page, not only on the main one), that would trigger persistent uselang, so that the interface language could stay the same even after clicking links?
Comment 16 Mormegil 2011-01-02 15:46:25 UTC
(In reply to comment #15)
> Well, should I create a new issue, requesting a language switcher for Commons
> (ideally accessible on every page, not only on the main one), that would
> trigger persistent uselang, so that the interface language could stay the same
> even after clicking links?

There already is one, setting “persistent” uselang, used when coming from another Wikimedia project. Try going from e.g. http://cs.wikipedia.org/wiki/File:Example.jpg (not logged in) to the image page on Commons, you should get uselang=cs automatically. See http://commons.wikimedia.org/wiki/MediaWiki:PersistentUselang.js
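
For illustration only, carrying uselang along from link to link amounts to rewriting each URL so that it ends up with exactly one uselang parameter. The sketch below shows that rewriting step with Python's standard urllib.parse; the function name is invented, and this is not the actual PersistentUselang.js code.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def with_uselang(url, lang):
    """Return url with its uselang query parameter set to lang."""
    parts = urlparse(url)
    # Drop any existing uselang so the parameter is not duplicated.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "uselang"
    ]
    query.append(("uselang", lang))
    return urlunparse(parts._replace(query=urlencode(query)))
```

Applied to http://commons.wikimedia.org/wiki/File:Example.jpg with "cs", this yields the same address with ?uselang=cs appended.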
Comment 17 dohnp5a1 2011-01-02 15:59:23 UTC
It's nice, but we need a language switcher for users who are not logged in, setting such a persistent uselang, on every page (or at least on the main one). Where should it be sorted out? Here or directly somewhere on Commons?
Comment 18 dohnp5a1 2011-01-22 11:39:41 UTC
The switcher now exists but is not perfectly persistent: it always disappears after searching for a string in the search field.
Comment 19 Krinkle 2011-01-22 21:24:37 UTC
(In reply to comment #18)
> The switcher now exists but is not perfectly permanent, disappears always after
> searching a string in the search field.

Please report any bugs at:
http://commons.wikimedia.org/wiki/MediaWiki_talk:AnonymousI18N.js

The script can be seen at:
http://commons.wikimedia.org/ (logged out)
The source is at:
http://commons.wikimedia.org/wiki/MediaWiki:AnonymousI18N.js
Comment 20 Krinkle 2011-01-22 21:29:10 UTC
This has been done both in JavaScript on the front end (see previous comment)
and in PHP on the server side, in the following extension:
http://www.mediawiki.org/wiki/Extension:LanguageSelector

Knowing that extension is in use on TranslateWiki and is doing pretty well I'd recommend closing this bug and directing further questions to either that extension or to a new bug (eg. "Fix bug X in Extension:LanguageSelector" or "Merge Extension:LanguageSelector in core (disableable)").
Comment 21 dohnp5a1 2011-01-22 22:21:51 UTC
Well, I announced it as a new bug: https://bugzilla.wikimedia.org/show_bug.cgi?id=26876.
Comment 22 dohnp5a1 2011-05-21 09:18:26 UTC
The switcher exists and its uselang is permanent at Commons, it is nice.

Nonetheless I do not understand why Commons couldn't detect the browser's default language and set the interface accordingly for non-registered users as well, if the switcher hasn't been used. What's the problem?

* Many users have not set a language in their browser preferences – as far as I know, the default value there is English, so they would receive the Commons interface in English, the same as now.

* For users who have set it, the interface would be in the preferred language.

It would be worse for nobody, just better for some of the users. Why not?
Comment 23 Roan Kattouw 2011-05-21 17:14:46 UTC
(In reply to comment #22)
> The switcher exists and its uselang is permanent at Commons, it is nice.
> 
> Nonetheless I do not understand why couldn't Commons detect the browser default
> language and set the interface according to that for non-registered users as
> well, if the switcher hasn't been used. What's the problem?
> 
> * Many users don't have set in their preferences in the browser – as far as I
> know, the default value is English there, so they will receive the Commons
> interface in English, the same way as now.
> 
> * For users having it set, the interface would be in the preferred language.
> 
> For nobody it would be worse, just better for one part of the users. Why not?
This could be done for logged-in users, I guess, but it definitely can't be done for anonymous users due to Squid caching. The browser language headers can't be detected client-side, only server-side.
Comment 24 Bawolff (Brian Wolff) 2011-05-21 18:01:28 UTC
Also, many people have their browser's languages misconfigured. Since those settings are (generally) hard to find, it's often very unclear to the user why they are getting language x vs language y. Any use of browser headers should come with clear ways in the interface to change the auto-detected defaults.

As for detecting language client side - you can always do an AJAX request like http://en.wikipedia.org/w/api.php?action=query&meta=userinfo&uiprop=acceptlang
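
A small sketch of how a client might build that request and read the answer. The query parameters are the ones in the URL above plus format=json; the shape of the JSON reply (a list of entries under query.userinfo.acceptlang, with the language code stored under the "*" key) is an assumption for this example and should be checked against the live API.

```python
import json
from urllib.parse import urlencode


def acceptlang_url(api_base):
    """Build the userinfo query URL that asks for the Accept-Language list."""
    params = {
        "action": "query",
        "meta": "userinfo",
        "uiprop": "acceptlang",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def languages_from_response(body):
    """Extract language codes from a JSON reply (assumed response shape)."""
    data = json.loads(body)
    entries = data.get("query", {}).get("userinfo", {}).get("acceptlang", [])
    codes = []
    for entry in entries:
        code = entry.get("*")  # assumed key holding the language code
        if code:
            codes.append(code)
    return codes
```

Only the small language list comes from the uncached API request; the wiki page itself can still be served from cache.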
Comment 25 dohnp5a1 2011-05-21 20:03:40 UTC
Yes, they may have them misconfigured, but in practice that means "not configured", i.e. the default; in other words they have English in first place there, so nothing would change for them, as they get the interface in English now as well. For users who have configured their browser it would be better: why configure the browser language setting if websites ignore it?

There is another thing I really do not understand now. I am not logged in, have refreshed the cache and am browsing anonymously with Mozilla, with Slovak in first place in the language settings, while located in Portugal – and on Commons there is a Czech notification "Wikimedia Commons is available in Czech". Where does the site take the language information from? I thought it was the language setting, but obviously not: I now prefer Slovak there, yet nothing changed on Commons, it still offers Czech (but unfortunately it only offers it, it doesn't display the interface in that language).
Comment 26 Niklas Laxström 2011-07-11 12:36:35 UTC
It is unclear whether this bug is about having this feature in MediaWiki (exists in an extension) or in the Wikimedia projects (not done). Assuming the first, since this bug is categorized as a MediaWiki bug.
Comment 27 dohnp5a1 2011-07-11 16:29:21 UTC
In my understanding, the bug is about having this feature in Wikimedia Commons.
Comment 28 Mormegil 2011-07-11 20:08:40 UTC
(In reply to comment #25)
> I really do not understand other thing now. Being not logged in, having the
> cache renewed and browsing anonymously with Mozilla, with Slovak in the
> language setting on the first place there, being in Portugal – in Commons there
> is a Czech notification "Wikimedia Commons is available in Czech". From where
> does the site take the language information? I thought it is the language
> setting, but obviously not, as now I prefer there Slovak, nonetheless nothing
> changed in Commons, it still offers Czech (but unfortunately just offers, it
> doesn't display the interface in that language).

Once again, see http://commons.wikimedia.org/wiki/MediaWiki:AnonymousI18N.js and its talk page, and discuss that script _there_. If you read that page, you would learn that the user language is selected using the following priorities:

1. Cookie (previous user preference)
2. According to the previous (referring) page (e.g. when you click on a Commons link on the Czech Wiktionary, you’ll get Commons in Czech)
3. Browser language
4. Fallback to the default language
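
The priority order above can be sketched as a simple first-match lookup. The function and parameter names are invented for the example, and the real AnonymousI18N.js may differ in detail.

```python
def choose_language(cookie_lang, referrer_lang, browser_langs, supported, default="en"):
    """Pick a language by priority: cookie, referring page, browser, default.

    cookie_lang and referrer_lang may be None when not available.
    """
    candidates = [cookie_lang, referrer_lang, *browser_langs]
    for code in candidates:
        # Skip missing sources and languages the site does not offer.
        if code and code in supported:
            return code
    return default
```

This would explain the case in comment #25: if an earlier source in the chain, such as a cookie or a referring Czech page, yields Czech, a Slovak browser setting is never consulted.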
Comment 29 Sumana Harihareswara 2011-12-21 17:12:33 UTC
The Indic language community is interested in this feature.  I do not have time to summarize it and am not sure I would summarize adequately.  The thread, for anyone who wants to read through it:

http://lists.wikimedia.org/pipermail/wikimediaindia-l/2011-December/thread.html#5890

I've asked them to come here and detail what they want.
Comment 30 Bawolff (Brian Wolff) 2011-12-21 22:48:22 UTC
(In reply to comment #29)
> The Indic language community is interested in this feature.  I do not have time
> to summarize it and am not sure I would summarize adequately.  The thread, for
> anyone who wants to read through it:
> 
> http://lists.wikimedia.org/pipermail/wikimediaindia-l/2011-December/thread.html#5890
> 
> I've asked them to come here and detail what they want.

My impression of the thread is that they want a big site banner "View Wikipedia in language X", with X being auto-detected via either geolocation or Accept-Language headers (i.e. your web browser's language preferences). That isn't really this bug; on the other hand, it is more likely to be implemented than this bug, since it can be done in pure JS (so few caching issues) and most of the work is already done: we can already get Accept-Language headers from JS ( http://www.mediawiki.org/w/api.php?action=query&meta=userinfo&uiprop=acceptlang ), and geolocation is also already set up for JS as a side effect of geo-targeted central notices.
