Last modified: 2014-05-07 07:12:03 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T24985, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 22985 - Forcing the language for {{PLURAL}} rule
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Internationalization
Version: unspecified
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n
Depends on: 9360
Blocks: plural
Reported: 2010-03-28 09:19 UTC by Pavel Selitskas [wizardist]
Modified: 2014-05-07 07:12 UTC (History)

Description Pavel Selitskas [wizardist] 2010-03-28 09:19:02 UTC
Just learnt about MediaWiki using {{PLURAL}} with the base language only. That's enough for general wikis, but a little problem for multilanguage projects like Wikimedia Commons.

So is there a way to allow users to specify which language to use with a PLURAL statement? For example, {{PLURAL:be:{{NUMBER}}|one|four|eight}}. Is this feasible?
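
A minimal sketch of what such a per-call language parameter would mean, written in Python rather than MediaWiki's PHP. The function `plural` and the `PLURAL_RULES` table are hypothetical and only illustrate the idea; the Belarusian rule follows the usual one/few/many split for whole numbers and is not taken from MediaWiki's code.

```python
def plural_index_en(n: int) -> int:
    """English: form 0 for exactly one, form 1 otherwise."""
    return 0 if n == 1 else 1


def plural_index_be(n: int) -> int:
    """Belarusian (whole numbers only): one / few / many."""
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return 1
    return 2


# Hypothetical rule table keyed by language code.
PLURAL_RULES = {"en": plural_index_en, "be": plural_index_be}


def plural(lang: str, n: int, *forms: str) -> str:
    """Pick a plural form for n using the rule of the given language.

    If fewer forms are supplied than the rule expects, the last one is reused.
    """
    if not forms:
        raise ValueError("at least one form is required")
    index = PLURAL_RULES[lang](abs(n))
    return forms[min(index, len(forms) - 1)]


for n in (1, 4, 8, 21):
    print(n, plural("be", n, "one", "few", "many"), plural("en", n, "one", "other"))
```

The point is only that the language code travels with the call, so the same number can select a different form depending on the language of the text rather than the language of the wiki.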
Comment 1 Priyanka Dhanda 2010-03-30 22:18:15 UTC
Chatted with Nikerabbit on #mediawiki-i18n about this. He thinks the best way to do this would be to make the parser aware of language context and not add extra hacks to the individual parser functions.

I think that makes sense. I can look into it soon.
Comment 2 Pavel Selitskas [wizardist] 2010-03-31 08:26:42 UTC
(In reply to comment #1)
> Chatted with Nikerabbit on #mediawiki-i18n about this. He thinks the best way
> to do this would be to make the parser aware of language context and not add
> extra hacks to the individual parser functions.
> 
> I think that makes sense. I can look into it soon.

Yeah, the guys at #mediawiki told me that the parser doesn't know which language the engine is operating in.

Thanks in advance. :)
Comment 3 Pavel Selitskas [wizardist] 2010-03-31 22:59:07 UTC
(In reply to comment #1)
> Chatted with Nikerabbit on #mediawiki-i18n about this. He thinks the best way
> to do this would be to make the parser aware of language context and not add
> extra hacks to the individual parser functions.
> 
> I think that makes sense. I can look into it soon.

Just discussed this issue at my local wiki.

The way you're going to develop this idea is not the same as my proposal. Even if the text was written in Klingon and we switched the interface to Greek, the text would still be in Klingon. So, as I understand it, the parser _has_ to know the language it is operating in if it is to be fully i18n-friendly.

Sincerely, ...
Comment 4 Siebrand Mazeland 2010-05-17 18:40:01 UTC
Priyanka, can you please give some insight into your current plans for this issue, given your comment "I can look into it soon." almost 7 weeks ago?
Comment 5 Pavel Selitskas [wizardist] 2010-06-05 18:54:36 UTC
I've found an extension, I18nTags [1], which is supposed to implement what I proposed, though I don't actually understand how it works.
Comment 6 Niklas Laxström 2010-09-13 17:40:25 UTC
It shouldn't be too hard to develop a magic word for setting the language for a full page, or a tag for marking only sections of a page.
Comment 7 Purodha Blissenbach 2011-03-29 10:55:43 UTC
1) Generally it would be advisable to be able to set an arbitrary or variable language context. (For example, in "Spanish for beginners" on the Spanish Wikiversity, there may be important hints in the users' preferred languages, which they already understand, even though the lesson itself builds on Spanish and images.)

2) Generally, knowing the current language context may be crucial. (For instance, "$1 user(s)" needs to use different PLURAL rules depending on its language, be it the wiki language, a user language, any of their fallback languages, or an arbitrarily chosen one.)

3) For some complex or patchwork messages, e.g. "user X copied/moved Y file(s)/directory(ies)", which combine GENDER, PLURAL and variable insertions of other messages, when not all parts are localized it may even be worth knowing the available languages of the pieces, so as to find one language for the entire message and not mix bits and pieces from various languages into a non-sentence or gibberish. (Remember that snippets of different languages may not fit together and may not be suited to replace each other across languages. A simple example: in English, you may have "Uploaded the $1" where $1 is "image" or "video", while French or German (retranslated) has "Uploaded $1" where $1 is "the image" or "the video", since German has two different articles here. Mixed translations will display no article at all, or duplicate articles, respectively.)
Comment 8 Niklas Laxström 2011-03-29 11:39:29 UTC
Let's not mix translations in this bag. Setting a language for the whole page would be a start.
Comment 9 Pavel Selitskas [wizardist] 2011-03-29 11:42:16 UTC
(In reply to comment #8)
> Let's not mix translations in this bag. Setting a language for the whole page
> would be a start.

Yes, it's much easier to implement a whole-page switch than my proposed per-call switch. I think it would be enough for Commons.
Comment 10 Purodha Blissenbach 2011-03-30 08:18:16 UTC
(In reply to comment #6)
> It shouldn't be too hard to develop a magic word for setting the language for a
> full page, or tag for marking only sections of page.

Agreed.
When assessing this, please consider scripts and the directionalities that come with them. A script whose directionality differs from the wiki's directionality requires an HTML block element with the correct dir="rtl" or dir="ltr" attribute as the container of the text in that language, so that it is rendered correctly.
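
A rough sketch of such a container, in Python for illustration only. The helper `wrap_in_language_block` and the small `RTL_LANGUAGES` set are assumptions made for this example, not MediaWiki code; a real implementation would take the directionality from the language's own data.

```python
from html import escape

# Hypothetical, incomplete set of language codes written right-to-left.
RTL_LANGUAGES = {"ar", "fa", "he", "ur", "yi"}


def wrap_in_language_block(lang: str, text: str) -> str:
    """Wrap text in a block element carrying its language and direction."""
    direction = "rtl" if lang in RTL_LANGUAGES else "ltr"
    return '<div lang="{}" dir="{}">{}</div>'.format(
        escape(lang, quote=True), direction, escape(text)
    )


print(wrap_in_language_block("he", "שלום"))
print(wrap_in_language_block("ru", "Привет"))
```

Because the direction is set on the block itself, the surrounding page can keep the wiki's own direction.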
Comment 11 Pavel Selitskas [wizardist] 2012-01-15 00:28:20 UTC
As MW 1.19 rolls out the page content language approach, it's a pity that PLURAL and/or GRAMMAR are not yet supported (at least the check fails in labs.wikimedia.beta).
Comment 12 Niklas Laxström 2012-01-15 09:25:23 UTC
What check? They definitely should be supported - there is just no way to set the page content language manually yet.
Comment 13 Pavel Selitskas [wizardist] 2012-01-16 21:56:20 UTC
(In reply to comment #12)
> What check? They definitely should be supported - there is just no way to set
> the page content language manually yet.

I missed that this trick works for the MediaWiki namespace only. That's a shame.

P.S. in MediaWiki everything works just fine, at least in MediaWiki Talk namespace :)
Comment 14 Pavel Selitskas [wizardist] 2013-02-03 23:21:27 UTC
Some obvious things for your consideration:

1. A custom magic word (e.g. {{CONTENTLANG:xx}}) that calls $parser->mOptions->setTargetLanguage( $language ) does what is wanted, but it doesn't change the actual content language (HTML attributes and other stuff stay the same).

2. Setting $wgContLang directly in a parser hook call does what is wanted, and perhaps goes too far. For example, on the be-tarask wiki with {{CONTENTLANG:ru}}, [[Катэгорыя:...]] (category) would not work, but the Russian [[Категория:...]] would. I believe this is not desired at all; on the other hand, at some point this "artefact" may be, let's say, user-friendly in terms of the language skills of an editor, but it makes the wiki as a whole unmaintainable (inconsistent magic word calls, categories, etc).

I've thought of adding a field page.page_lang which would be passed to $wgContLang, but that would mean the same as described in #2 plus additional DB overhead and an interface to control the value (special page?).

It is possible that I didn't work hard enough to hook PageContentLanguage, but normally hooking it in a parser hook call doesn't make much sense due to the nature of the parser, rendering and caching.
Comment 15 Nemo 2013-08-10 19:08:40 UTC
(In reply to comment #0)
> Just learnt about MediaWiki using {{PLURAL}} with the base language only.
> That's enough for general wikis, but a little problem for multilanguage
> projects like Wikimedia Commons.

Wizardist, could you please clarify what's the use case for the original request, disregarding all the later additions? I can't find it anywhere on this report.
Comment 16 Pavel Selitskas [wizardist] 2013-08-10 19:29:10 UTC
(In reply to comment #15)
> (In reply to comment #0)
> > Just learnt about MediaWiki using {{PLURAL}} with the base language only.
> > That's enough for general wikis, but a little problem for multilanguage
> > projects like Wikimedia Commons.
> 
> Wizardist, could you please clarify what's the use case for the original
> request, disregarding all the later additions? I can't find it anywhere on
> this
> report.

This use case was obvious for Wikimedia Commons before the Translate extension was widely deployed:

We have a lot of pages in lots of languages in one wiki, and every one of those pages would like to rely on core-driven language functions like plurals or grammar converter, as well as number and dates formatting.

Now, with the Translate extension there is no need to bother with this, as every translated page is delivered with a proper content language via a hook in ContentHandler, based on the page title (title/langcode, as in the MediaWiki namespace).

However, theoretically, Translate may not cover every use case, and there could be a need to return to a generic page written in a language different from the default one. I cannot pick an example, but who knows, there may be some of them.

FYI, an allegedly working PoC (language is switched by a direct change in DB there :P ): Id63573a7f
Comment 17 Nemo 2013-08-10 19:32:15 UTC
(In reply to comment #16)
> We have a lot of pages in lots of languages in one wiki, and every one of
> those
> pages would like to rely on core-driven language functions like plurals or
> grammar converter, as well as number and dates formatting.

Sorry, this is still not obvious to me. If PLURAL is being used in Commons templates and so on, why would you want to force it to a language other than the interface language? Is there an example? (Also on another wiki if needed.) Thanks.
Comment 18 Pavel Selitskas [wizardist] 2013-08-10 19:47:14 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > We have a lot of pages in lots of languages in one wiki, and every one of
> > those
> > pages would like to rely on core-driven language functions like plurals or
> > grammar converter, as well as number and dates formatting.
> 
> Sorry, this is still not obvious to me. If PLURAL is being used in Commons
> templates and so on, why would you want to force it to a language other than
> the interface language? Is there an example? (Also on another wiki if
> needed.)
> Thanks.

Hmmmm.... Okay, let me spell this out.

Forget the Translate extension and Special:MyLanguage (correct me if I misspelled this page). Let's go back to around 2010.

We have a policy page in English (and the wiki default language is English - all language stuff is supplied by LanguageEn (not correct technically, but I hope you get it)). We have a translation of that policy in Russian (the default wiki language is still English - all language stuff in #mw-content is delivered by LanguageEn).

Did you see that? On that page we would have two plural forms instead of three (or four, like CLDR defines/would like to define), no grammar rules (in English, wuut?), English-style formatting (123,567.123 instead of Russian-style 123 567,123), English-written dates etc. etc.
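
To make the formatting difference concrete, here is a small Python sketch. The `SEPARATORS` table and the `format_number` helper are made up for this illustration and only cover the two conventions mentioned above; the non-breaking space for Russian digit groups is an assumption of the example, not something taken from MediaWiki's language classes.

```python
# Hypothetical separator table: (thousands separator, decimal separator).
SEPARATORS = {
    "en": (",", "."),
    "ru": ("\u00a0", ","),  # assumed: non-breaking space between digit groups
}


def format_number(lang: str, value: str) -> str:
    """Format a plain decimal string such as '123567.123' for a language."""
    group_sep, decimal_sep = SEPARATORS[lang]
    sign = "-" if value.startswith("-") else ""
    integer, _, fraction = value.lstrip("-").partition(".")
    # Split the integer part into groups of three digits from the right.
    groups = []
    while len(integer) > 3:
        groups.insert(0, integer[-3:])
        integer = integer[:-3]
    groups.insert(0, integer)
    result = sign + group_sep.join(groups)
    return result + decimal_sep + fraction if fraction else result


print(format_number("en", "123567.123"))  # 123,567.123
print(format_number("ru", "123567.123"))  # 123 567,123 (non-breaking space)
```

The same stored value ends up looking different on the page depending purely on which language's conventions are applied to it.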

So I don't get your misunderstanding of how the interface language influences the content language of the page. It just doesn't. When you surf the Commons with English interface, you will unlikely get English contents on the Russian page, right? :) So why on the Russian page should the content language rely on your interface settings?

----

You may be mixing up translated pages with such things like Autotranslate, which exploits a hack of {{int:lang}}. If it's not fixed, then you will continue to be delivered Autotranslate'd templates in your interface message. That's what I can see.

----

Sorry if I don't get the point. Please clarify then,
thanks.
Comment 19 Nemo 2013-08-10 20:09:44 UTC
(In reply to comment #18)
> So I don't get your misunderstanding of how the interface language influences
> the content language of the page. It just doesn't. When you surf the Commons
> with English interface, you will unlikely get English contents on the Russian
> page, right? :) So why on the Russian page should the content language rely
> on
> your interface settings?

The Russian reader is supposed to have Russian interface. If you speak Russian and have Russian interface, and you read a policy page solely written in English, and the page contains a formatted number or a formatted date, then you'll see numbers and dates formatted as Russian in the middle of English text.
This doesn't look like a big problem for me; plus, I still don't see what page could be using PLURAL, that seems even more unlikely.

> 
> ----
> 
> You may be mixing up translated pages with such things like Autotranslate,
> which exploits a hack of {{int:lang}}. If it's not fixed, then you will
> continue to be delivered Autotranslate'd templates in your interface message.
> That's what I can see.

Indeed, you answer by yourself: most of the mixed-language content is created by LangSwitch, LanguageSelect, Autotranslate; the inconsistencies created in single pages by PLURAL and so on are negligible, or rather consistent with the expected inconsistency, :)
If I have a file page with descriptions in n languages, and a date or number, I prefer to have the date and number formatted according to my language so that the 1 description in my interface language is completely correct (and the n-1 other descriptions be inconsistent), rather than having my 1 language wrong, 1 language I don't care about correct and n-2 languages still inconsistent...
Comment 20 Pavel Selitskas [wizardist] 2013-08-10 20:16:41 UTC
Please don't concentrate on particular cases. PLURAL was an example, and I'm talking about the whole language tools subset. Step aside from Commons, file pages (why in the hell would we change the language of those???).

Let's wait for some other opinions, because both we don't hear each other. (I state hereby that I'm not sure that I'm right, but I'm ready to defend my view on the problem in next iterations.)
Comment 21 Nemo 2013-08-10 20:53:09 UTC
(In reply to comment #20)
> PLURAL was an example, and I'm
> talking about the whole language tools subset. 

The bug summary is about PLURAL...

> Step aside from Commons, file
> pages (why in the hell would we change the language of those???).
> 
> Let's wait for some other opinions, because both we don't hear each other. (I
> state hereby that I'm not sure that I'm right, but I'm ready to defend my
> view
> on the problem in next iterations.)

I do hear you. I can vaguely imagine use cases myself and comment 18 helps with that, but the aim of this request is totally unclear. Then, again, it might be just me not seeing the obvious, but clarifying would help.
