Last modified: 2009-08-25 14:34:34 UTC

Bug 5790 - Having both lang and xml:lang attributes not identical to $wgContLanguageCode
Status: RESOLVED INVALID
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Normal normal with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Duplicates: 20387
Depends on:
Blocks:
Reported: 2006-05-02 13:38 UTC by 百楽兎
Modified: 2009-08-25 14:34 UTC
CC List: 3 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Text rendered in zh Wikipedia in IE (36.40 KB, image/png)
2006-05-04 10:29 UTC, Shinjiman
Text rendered in zh Wikipedia in Firefox (90.63 KB, image/png)
2006-05-04 10:30 UTC, Shinjiman
This is the draft version of how to detect and change the value in the <html> tag. (Please note that some code cleanup is required before committing) (2.15 KB, text/plain)
2006-05-07 15:03 UTC, Shinjiman
A bit of cleanup for the prototype code (2.05 KB, text/plain)
2006-05-07 16:01 UTC, Shinjiman
Further cleanup of the prototype code (2.01 KB, text/plain)
2006-05-08 03:12 UTC, Shinjiman
A fine-tuned function prototype (2.05 KB, text/plain)
2006-05-08 17:34 UTC, Shinjiman
This is the patch which enables setting an assigned language code in the lang tags (4.90 KB, patch)
2006-05-09 12:33 UTC, Shinjiman
A LanguageTags.php file used with this patch. (428 bytes, patch)
2006-05-09 12:34 UTC, Shinjiman
A flow chart explaining how to determine the language code to be displayed (63.91 KB, image/png)
2006-05-10 01:44 UTC, Shinjiman
Modified patch file based on the previous patch. (5.20 KB, patch)
2006-05-10 02:59 UTC, Shinjiman
An updated LanguageTags.php file to make this code work (5.20 KB, patch)
2006-05-10 03:01 UTC, Shinjiman

Description 百楽兎 2006-05-02 13:38:56 UTC
Hi, I am a user from Chinese Wikipedia. I found that there is a declaration "<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="zh" lang="zh" dir="ltr">" at the top of every page.
As you know, Chinese has two character sets, Traditional and Simplified, and their default fonts are also different. Because all pages are specified as "zh", and by default "zh" means Simplified Chinese (zh-cn), users of Traditional Chinese (zh-tw & zh-hk) cannot use their default font to display pages.
Simply put, can the declaration "xml:lang="zh" lang="zh"" change with the user's system default charset? For example, the declaration would be "xml:lang="zh-hk" lang="zh-hk"" if the user is from Hong Kong or their system's charset is zh-hk.
Comment 1 Shinjiman 2006-05-04 10:29:47 UTC
Created attachment 1650
Text rendered in zh Wikipedia in IE

This shows text rendered with different language tags, using IE for Windows.
Comment 2 Shinjiman 2006-05-04 10:30:44 UTC
Created attachment 1651
Text rendered in zh Wikipedia in Firefox

This shows text rendered with different language tags, using Firefox for Windows.
Comment 3 Shinjiman 2006-05-04 10:48:23 UTC
This problem was posted to wikitech-l in March 2006:

http://mail.wikipedia.org/pipermail/wikitech-l/2006-March/034397.html

It seems no one replied to that post, and it was suggested that the issue be resolved here.
Comment 4 Brion Vibber 2006-05-04 18:56:40 UTC
We could probably have it change the code based on the selected 
variant conversion. is this what you mean?
Comment 5 Shinjiman 2006-05-05 01:07:34 UTC
I think it can be done with a similar technique that varies the value xxx in
[xml:lang="xxx" lang="xxx"] instead of using $wgContLanguageCode directly.

This can be done by adding a piece of code to OutputPage.php, or by including another
file which handles the displayed language code. I suggest that the displayed
language code be based on both $wgContLanguageCode and a series of checks, in
these steps:
1. For logged-in users, detect the interface language and return a value
according to the table below;
2.1. For anonymous users, first check the HTTP_ACCEPT_LANGUAGE value and return a value
according to the table below;
2.2. If step 2.1 fails, just return the $wgContLanguageCode value.

Note: the value returned by the function depends on _both_
$wgContLanguageCode and the user's interface language, so that:
*If (($wgContLanguageCode == en) && (user interface language == zh-tw)) => return en
($wgContLanguageCode)
*If (($wgContLanguageCode == zh) && (user interface language == zh-tw)) => return zh-tw

The table (or array) below gives the value that needs to be returned from the check of _both_
$wgContLanguageCode and the interface language (currently listed for zh only):
* $wgContLangCode == zh
  * if interface language == zh returns $wgContLangCode
  * if interface language == zh-cn returns zh-cn
  * if interface language == zh-tw returns zh-tw
  * if interface language == zh-hk returns zh-tw (for browser compatibility issue)
  * if interface language == zh-mo returns zh-tw (for browser compatibility issue)
  * if interface language == zh-sg returns zh-cn

* but while $wgContLangCode != zh
  * if interface language == en returns $wgContLangCode
  * if interface language == de returns $wgContLangCode
  * if interface language == fr returns $wgContLangCode
  * if interface language == ja returns $wgContLangCode
  * if interface language == ko returns $wgContLangCode

For the second group above ($wgContLanguageCode != zh), this technique is _not_ applied when checking those two values:
*If (($wgContLanguageCode == en) && (user interface language == zh-tw)) => return en

This is intended _not_ to affect the displayed language code on other sites
such as the en, de, fr, ja, ko, ... wikis.

This trick also applies when the $wgContLanguageCode is not supported by the browser:
*If (($wgContLanguageCode == zh-min-nan) && (user interface language == zh-min-nan))
=> return en (for compatibility, since browsers including IE6/7 and Firefox do
not support zh-min-nan tags).
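
The lookup described in this comment can be sketched roughly as follows. This is only an illustrative Python sketch, not the attached patch: the names `TAG_TABLE`, `UNSUPPORTED_FALLBACK`, `parse_accept_language` and `resolve_lang_tag` are hypothetical, the table covers zh only (as above), and it assumes the zh-min-nan fallback applies regardless of the interface language.

```python
from typing import Dict, List, Optional

# Hypothetical table transcribed from the comment (zh only):
# content language -> {interface language -> language tag to emit}.
TAG_TABLE: Dict[str, Dict[str, str]] = {
    "zh": {
        "zh": "zh",
        "zh-cn": "zh-cn",
        "zh-tw": "zh-tw",
        "zh-hk": "zh-tw",  # browser compatibility
        "zh-mo": "zh-tw",  # browser compatibility
        "zh-sg": "zh-cn",
    },
}

# Content languages whose own code browsers do not recognise (assumption:
# the fallback is used whatever the interface language is).
UNSUPPORTED_FALLBACK: Dict[str, str] = {"zh-min-nan": "en"}


def parse_accept_language(header: str) -> List[str]:
    """Return lower-cased language codes from the header, highest q first."""
    entries = []
    for item in header.split(","):
        parts = item.strip().split(";")
        code = parts[0].strip().lower()
        if not code or code == "*":
            continue
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((q, code))
    entries.sort(key=lambda entry: -entry[0])  # stable: ties keep header order
    return [code for q, code in entries if q > 0]


def resolve_lang_tag(content_lang: str,
                     user_lang: Optional[str] = None,
                     accept_language: Optional[str] = None) -> str:
    """Pick the tag for lang/xml:lang following steps 1, 2.1 and 2.2."""
    if content_lang in UNSUPPORTED_FALLBACK:
        return UNSUPPORTED_FALLBACK[content_lang]
    table = TAG_TABLE.get(content_lang)
    if table is None:
        # Sites such as en, de, fr, ja, ko are left untouched.
        return content_lang
    if user_lang is not None:
        # Step 1: logged-in user, use the interface language.
        return table.get(user_lang.lower(), content_lang)
    # Step 2.1: anonymous user, try the Accept-Language header.
    for code in parse_accept_language(accept_language or ""):
        if code in table:
            return table[code]
    # Step 2.2: fall back to the content language.
    return content_lang
```

For example, `resolve_lang_tag("zh", user_lang="zh-hk")` yields `zh-tw`, while `resolve_lang_tag("en", user_lang="zh-tw")` stays `en`.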
Comment 6 Shinjiman 2006-05-05 01:16:15 UTC
The term [interface language] above means the language currently used by the
logged-in user (i.e. the [user language]).
Comment 7 Shinjiman 2006-05-05 11:37:00 UTC
(In reply to comment #4)
> We could probably have it change the code based on the selected 
> variant conversion. is this what you mean?

Nope, the interface language and the variant conversion are different things, and
this issue is not related to variant conversion.
This issue can be resolved based on the user interface language.
Comment 8 Shinjiman 2006-05-07 15:03:15 UTC
Created attachment 1688
This is the draft version of how to detect and change the value in the <html> tag. (Please note that some code cleanup is required before committing)

I've provided a very rough draft of how to detect and change the <html> tag based on
various options. Some code cleanup is needed _before_ committing it to
trunk, since the code is not tested yet.
Comment 9 Shinjiman 2006-05-07 16:01:55 UTC
Created attachment 1691
A bit of cleanup for the prototype code
Comment 10 Shinjiman 2006-05-08 03:12:03 UTC
Created attachment 1695
Further cleanup of the prototype code
Comment 11 Shinjiman 2006-05-08 17:34:35 UTC
Created attachment 1696
A fine-tuned function prototype

This is the fine-tuned function prototype; it works when the function is called.
However, it still needs fine-tuning together with the corresponding operations in
OutputPage.php.
Comment 12 Shinjiman 2006-05-09 12:33:18 UTC
Created attachment 1700
This is the patch which enables setting an assigned language code in the lang tags

This is the patch that corrects the association between language tags and the
assigned font. It needs a new file called
"includes/LanguageTags.php" to work. :)
Comment 13 Shinjiman 2006-05-09 12:34:43 UTC
Created attachment 1701
A LanguageTags.php file used with this patch.

This is the file that is required to run with the patch file.
Comment 14 Shinjiman 2006-05-09 12:47:37 UTC
The patch has finally arrived. I hope it serves as a workaround for the
language and font problem on various MediaWiki sites. It is not designed only for
zh sites; other languages such as als, ang, ast, bat-smg, simple, sr, etc. can also
use this solution to address the language and font problem.
Comment 15 Brion Vibber 2006-05-09 20:42:13 UTC
As mentioned above we can't use something that relies on the Accept-Language
header as it would break our caching system. Patch cannot be accepted.
Comment 16 Shinjiman 2006-05-10 01:44:19 UTC
Created attachment 1704
A flow chart explaining how to determine the language code to be displayed

First, I think I need to post a flow chart to check whether my concept is
correct; then, based on the suggestions we get, write code to resolve this
problem. :)
Comment 17 Shinjiman 2006-05-10 02:59:46 UTC
Created attachment 1705
Modified patch file based on the previous patch.

Anyway, I uploaded a modified version of my previous patch to resolve the primary
problem with the state issue in some cases (for example, using a zh-tw interface on
a zh-yue site).
Comment 18 Shinjiman 2006-05-10 03:01:09 UTC
Created attachment 1706
An updated LanguageTags.php file to make this code work

This is the updated LanguageTags.php file needed to make the new patch work.
Comment 19 Shinjiman 2006-05-10 03:05:57 UTC
(In reply to comment #15)
> As mentioned above we can't use something that relies on the Accept-Language
> header as it would break our caching system. Patch cannot be accepted.

The Accept-Language header is used only when the browser supports it and has it
enabled; if this method fails, the code takes $wgContLanguageCode directly.

But I have no idea why this would break the cache system. Or is it that my patched
code is placed in an unsuitable location in those files? :)
Comment 20 Shinjiman 2006-05-10 03:10:10 UTC
The Accept-Language header check only applies to anonymous users; it takes
$wgLanguageCode directly if the above method fails.
For logged-in users, it takes the interface language from the user preferences to
determine the language tag.
Comment 21 Brion Vibber 2006-05-10 10:59:48 UTC
The patch seems to be trying to do something totally different 
from what's described in the summary, and by changing the 
output based on unsafe headers it would break caching. I'm 
marking this INVALID; please replace with a more directed 
issue.
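
The caching concern can be illustrated with a toy model. This is only a sketch under stated assumptions: it assumes a shared cache for anonymous requests that is keyed by URL alone, and `UrlKeyedCache` and `render_page` are hypothetical names, not part of MediaWiki or of any real cache server.

```python
from typing import Callable, Dict


class UrlKeyedCache:
    """Toy shared cache that stores one rendered page per URL."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def fetch(self, url: str, accept_language: str,
              render: Callable[[str], str]) -> str:
        # The header is not part of the cache key, so only the first
        # request for a URL ever reaches the renderer.
        if url not in self._store:
            self._store[url] = render(accept_language)
        return self._store[url]


def render_page(accept_language: str) -> str:
    """Hypothetical renderer that picks the lang tag from the header."""
    lang = accept_language.split(",")[0].strip().lower() or "zh"
    return '<html lang="{}">'.format(lang)


cache = UrlKeyedCache()
first = cache.fetch("/wiki/Example", "zh-tw", render_page)
second = cache.fetch("/wiki/Example", "zh-cn", render_page)
# Both visitors receive the page rendered for the first visitor's header.
```

In this model the second visitor, who asked for zh-cn, is served the copy rendered for zh-tw, which is the kind of breakage that output depending on request headers would cause for a cache keyed this way.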
Comment 22 Shinjiman 2006-05-10 11:08:25 UTC
I've brought this issue to the wikitech-l mailing list for further discussion until
this issue is resolved.

Gmane discussion direct link:
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/23533
Comment 23 Shinjiman 2006-05-11 11:16:16 UTC
As mentioned before, I've changed the summary title to suit the situation we're
having. Also, an unsuitable patch != an invalid bug report. Hence, I've REOPENED
the bug to get this issue resolved.

By the way, I've been conducting a survey of users on
the local wiki (http://zh.wikipedia.org/wiki/User:Shinjiman/LanguageTags),
asking for their user interface language and the language variant that they're using.
Therefore, it seems impossible to solve this issue according to the language
variants.

See also the mails on wikitech-l for more detailed information regarding this
issue:
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/23542 and
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/23573
Comment 24 百楽兎 2006-05-11 15:21:25 UTC
(In reply to comment #21)
> The patch seems to be trying to do something totally different from what's described in the
> summary, and by changing the output based on unsafe headers it would break caching. I'm
> marking this INVALID; please replace with a more directed issue.

Sorry, Brion, I don't understand why they are unsafe headers. Could you describe it more clearly?
And could you advise a safe way to achieve our goal? Thank you.
Comment 25 Brion Vibber 2006-05-11 20:56:38 UTC
Cache. Cache. Cache. And, cache.

Shinjiman, your mail to wikitech-l makes even less sense.
Please see my reply there.
Comment 26 Chad H. 2009-08-10 00:20:41 UTC
As far as I have been able to tell: the lang attribute on <html> is set to the content language. Nothing in my testing indicates this isn't working 100% as intended. xml:lang attributes are unneeded in HTML5 anyway, which is what we're moving towards.

Resolving INVALID.
Comment 27 Shinjiman 2009-08-10 05:33:52 UTC
How about the lang attribute in HTML 5?
I think the lang attribute is still needed in HTML 5, according to
http://dev.w3.org/html5/markup/common-attributes.html#common-attributes .
Comment 28 Chad H. 2009-08-25 14:34:34 UTC
*** Bug 20387 has been marked as a duplicate of this bug. ***
