Last modified: 2014-05-07 07:12:03 UTC
I just learnt that MediaWiki evaluates {{PLURAL}} using the wiki's base language only. That's enough for general wikis, but a bit of a problem for multilingual projects like Wikimedia Commons. Is there a way to let users specify which language the PLURAL statement should use? For example, {{PLURAL:be:{{NUMBER}}|one|four|eight}}. Is this feasible?
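To make the request concrete, here is a minimal Python sketch of a per-call language switch for plural selection. It is not MediaWiki code: the function names are hypothetical, the East Slavic rule covers integers only, and clamping to the last supplied form when too few forms are given is an assumption made for this illustration.

```python
def plural_index(lang: str, n: int) -> int:
    """Return the index of the plural form to use for a non-negative integer n."""
    if lang in ("be", "ru", "uk"):
        # East Slavic rule for integers: one / few / many.
        if n % 10 == 1 and n % 100 != 11:
            return 0
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return 1
        return 2
    # English-style fallback: singular for 1, plural otherwise.
    return 0 if n == 1 else 1


def plural(lang: str, n: int, *forms: str) -> str:
    """Hypothetical counterpart of {{PLURAL:lang:n|form1|form2|...}}."""
    if not forms:
        return ""
    # Assumption: if fewer forms are supplied than the rule needs, reuse the last one.
    index = min(plural_index(lang, n), len(forms) - 1)
    return forms[index]
```

With the forms from the example, `plural("be", 8, "one", "four", "eight")` selects the third form, while the same call with `"en"` can only ever reach the first two forms, which is the limitation described above.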
Chatted with Nikerabbit on #mediawiki-i18n about this. He thinks the best way to do this would be to make the parser aware of language context and not add extra hacks to the individual parser functions. I think that makes sense. I can look into it soon.
(In reply to comment #1) > Chatted with Nikerabbit on #mediawiki-i18n about this. He thinks the best way > to do this would be to make the parser aware of language context and not add > extra hacks to the individual parser functions. > > I think that makes sense. I can look into it soon. Yeah, the guys at #mediawiki told me that the parser doesn't know which language the engine is operating in. Thanks in advance. :)
(In reply to comment #1) > Chatted with Nikerabbit on #mediawiki-i18n about this. He thinks the best way > to do this would be to make the parser aware of language context and not add > extra hacks to the individual parser functions. > > I think that makes sense. I can look into it soon. I just discussed this issue on my local wiki. The way you're going to develop this idea is not the same as my proposal. Even if the text was written in Klingon and we switched the interface to Greek, the text would still be in Klingon. So, as I understand it, the parser _has_ to know the language it is operating in if it is to be fully i18n-friendly. Sincerely, ...
Priyanka, can you please give some insight into your current planning for this issue, given your comment "I can look into it soon." almost 7 weeks ago?
I've found an extension, I18nTags [1], which is supposed to implement what I proposed, though I don't really understand how it works.
It shouldn't be too hard to develop a magic word for setting the language for a full page, or a tag for marking only sections of a page.
1) Generally, it would be advisable to be able to set an arbitrary or variable language context. (For example, in "Spanish for beginners" on the Spanish Wikiversity, there may be important hints in user preference languages that users already understand, even though the lesson itself builds on Spanish and images.) 2) Generally, knowing the current language context may be crucial. (For instance, "$1 user(s)" needs to use different PLURAL rules depending on its language, be it the wiki language, a user language, any of their fallback languages, or an arbitrarily chosen one.) 3) For some complex or patchwork messages, e.g. "user X copied/moved Y file(s)/directory(ies)", which have GENDER, PLURAL and variable insertions of other messages, when the parts are not all localized, it may even be worth knowing the available languages of the pieces, so as to find one language for the entire message and not mix bits and pieces from various languages into a non-sentence or gibberish. (Remember that snippets of different languages may not fit together and may not be suited to replace each other across languages. A simple example: in English, you may have "Uploaded the $1" where $1 is "image" or "video", while French or German (retranslated) has "Uploaded $1" where $1 is "the image" or "the video", since German has two different articles here. Mixed translations will display no article at all, or duplicate articles, respectively.)
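Point 3 can be sketched as follows: before rendering a composite message, pick the first preferred language in which every piece exists, instead of falling back piece by piece. This is a minimal Python illustration, not MediaWiki code; the message store, keys and function names are hypothetical, and the "de" text is the English retranslation used in the comment above, not real German.

```python
from typing import Dict, List, Optional

# Hypothetical message store: message key -> {language code: text}.
Messages = Dict[str, Dict[str, str]]

MESSAGES: Messages = {
    # "de" holds an English retranslation as a stand-in, as in the comment above.
    "uploaded": {"en": "Uploaded the $1", "de": "Uploaded $1"},
    # The inner message is only available in English.
    "image": {"en": "image"},
}


def pick_common_language(messages: Messages, keys: List[str],
                         preferred: List[str]) -> Optional[str]:
    """Return the first preferred language in which every key is translated."""
    for lang in preferred:
        if all(lang in messages.get(key, {}) for key in keys):
            return lang
    return None


def render(messages: Messages, outer: str, inner: str, preferred: List[str]) -> str:
    """Render outer with inner substituted for $1, using one language throughout."""
    lang = pick_common_language(messages, [outer, inner], preferred)
    if lang is None:
        raise LookupError("no single language covers all message parts")
    return messages[outer][lang].replace("$1", messages[inner][lang])
```

Here `render(MESSAGES, "uploaded", "image", ["de", "en"])` falls back to English for the whole sentence, whereas a per-piece fallback would have combined the "de" frame with the English noun and lost the article.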
Let's not mix translations in this bag. Setting a language for the whole page would be a start.
(In reply to comment #8) > Let's not mix translations in this bag. Setting a language for the whole page > would be a start. Yes, it's much easier to implement a whole-page switch than my proposed per-call switch. I think it would be enough for Commons.
(In reply to comment #6) > It shouldn't be too hard to develop a magic word for setting the language for a > full page, or tag for marking only sections of page. Agreed. When assessing this, please consider scripts and the directionalities that come with them. A script whose directionality differs from the wiki's directionality requires an HTML block element with the correct dir="rtl/ltr" attribute as the container of the text in that language, so that it is rendered correctly.
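As a rough illustration of the directionality point, the following Python sketch wraps a text section in a block element carrying `lang` and `dir` attributes. The function name is hypothetical and the set of right-to-left language codes is a small illustrative subset, not a complete list.

```python
import html

# Small illustrative subset of right-to-left language codes; not exhaustive.
RTL_LANGUAGES = {"ar", "fa", "he", "ur"}


def wrap_in_language_block(text: str, lang: str) -> str:
    """Wrap text in a <div> that declares its language and directionality."""
    direction = "rtl" if lang in RTL_LANGUAGES else "ltr"
    return '<div lang="{}" dir="{}">{}</div>'.format(
        html.escape(lang, quote=True), direction, html.escape(text)
    )
```

A block element is used rather than an inline one because, as noted above, the container itself has to establish the direction for the enclosed text.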
As MW 1.19 rolls out the page content language approach, it's a pity that PLURAL and/or GRAMMAR are not yet supported (at least the check fails on labs.wikimedia.beta).
What check? They definitely should be supported - there is just no way to set the page content language manually yet.
(In reply to comment #12) > What check? They definitely should be supported - there is just no way to set > the page content language manually yet. I missed that this trick works for the MediaWiki namespace only. That's a shame. P.S. In MediaWiki everything works just fine, at least in the MediaWiki Talk namespace :)
Some obvious things for your consideration: 1. A custom magic word (e.g. {{CONTENTLANG:xx}}) that calls $parser->mOptions->setTargetLanguage( $language ) does what is wanted, but it doesn't change the actual content language (HTML attributes and other things stay the same). 2. Setting $wgContLang directly in a parser hook call also does what is wanted, but perhaps too aggressively. For example, on the be-tarask wiki with {{CONTENTLANG:ru}}, [[Катэгорыя:...]] (category) would not work, but the Russian [[Категория:...]] would. I believe this is not desired at all; on the other hand, at some point this "artefact" may be, let's say, user-friendly in terms of an editor's language skills, but it makes the wiki as a whole unmaintainable (inconsistent magic word calls, categories, etc.). I've thought of adding a field page.page_lang whose value would be passed to $wgContLang, but that would mean the same as described in point 2, plus additional DB overhead and an interface to control the value (a special page?). It is possible that I didn't work hard enough on hooking PageContentLanguage, but normally hooking it in a parser hook call doesn't make much sense due to the nature of the parser, rendering and caching.
(In reply to comment #0) > Just learnt about MediaWiki using {{PLURAL}} with the base language only. > That's enough for general wikis, but a little problem for multilanguage > projects like Wikimedia Commons. Wizardist, could you please clarify what the use case for the original request is, disregarding all the later additions? I can't find it anywhere in this report.
(In reply to comment #15) > (In reply to comment #0) > > Just learnt about MediaWiki using {{PLURAL}} with the base language only. > > That's enough for general wikis, but a little problem for multilanguage > > projects like Wikimedia Commons. > > Wizardist, could you please clarify what's the use case for the original > request, disregarding all the later additions? I can't find it anywhere on > this report. This use case was obvious for Wikimedia Commons before the Translate extension was widely deployed: we have a lot of pages in lots of languages in one wiki, and every one of those pages would like to rely on core-driven language functions like plurals or the grammar converter, as well as number and date formatting. Now, with the Translate extension, there is no need to bother with this, as every translated page is delivered with a proper content language via a hook in ContentHandler, based on the page title (title/langcode, as in the MediaWiki namespace). However, theoretically, Translate may not cover every use case, and there could be a need to return to a generic page written in a language different from the default one. I cannot pick an example, but who knows, there may be some. FYI, an allegedly working PoC (the language is switched by a direct change in the DB there :P ): Id63573a7f
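The title-based convention mentioned here (title/langcode) can be sketched in Python as below. This is only an illustration of the idea, not the Translate extension's actual hook: the function name, the example titles and the small set of known language codes are all assumptions.

```python
# Illustrative subset of language codes; a real wiki would know many more.
KNOWN_LANGUAGES = {"be", "be-tarask", "de", "en", "fr", "ru"}


def content_language_for_title(title: str, default: str = "en") -> str:
    """Derive the page content language from a trailing /langcode subpage."""
    base, separator, suffix = title.rpartition("/")
    # Only treat the suffix as a language if there is a base page before it.
    if separator and base and suffix in KNOWN_LANGUAGES:
        return suffix
    return default
```

For a hypothetical page `Commons:Policy/ru` this yields `ru`, while `Commons:Policy` and subpages whose last segment is not a language code keep the wiki default.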
(In reply to comment #16) > We have a lot of pages in lots of languages in one wiki, and every one of > those > pages would like to rely on core-driven language functions like plurals or > grammar converter, as well as number and dates formatting. Sorry, this is still not obvious to me. If PLURAL is being used in Commons templates and so on, why would you want to force it to a language other than the interface language? Is there an example? (Also on another wiki if needed.) Thanks.
(In reply to comment #17) > (In reply to comment #16) > > We have a lot of pages in lots of languages in one wiki, and every one of > > those > > pages would like to rely on core-driven language functions like plurals or > > grammar converter, as well as number and dates formatting. > > Sorry, this is still not obvious to me. If PLURAL is being used in Commons > templates and so on, why would you want to force it to a language other than > the interface language? Is there an example? (Also on another wiki if > needed.) > Thanks. Hmmmm... Okay, let me spell this out. Forget the Translate extension and Special:MyLanguage (correct me if I misspelled this page). Let's go back to around 2010. We have a policy page in English (and the wiki default language is English - all language stuff is supplied by LanguageEn (not technically correct, but I hope you get it)). We have a translation of that policy in Russian (the default wiki language is still English - all language stuff in #mw-content is delivered by LanguageEn). Do you see? On that page we would have two plural forms instead of three (or four, as CLDR defines or would like to define), no grammar rules (in English, what?), English-style number formatting (123,567.123 instead of the Russian-style 123 567,123), dates written in English, etc. So I don't follow your understanding of how the interface language influences the content language of the page. It just doesn't. When you surf Commons with the English interface, you are unlikely to get English content on the Russian page, right? :) So why should the content language on the Russian page rely on your interface settings? ---- You may be mixing up translated pages with such things as Autotranslate, which exploits the {{int:lang}} hack. If that's not fixed, you will continue to be served Autotranslate'd templates in your interface language. That's what I can see. ---- Sorry if I'm missing the point. Please clarify then, thanks.
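The number formatting difference quoted above (123,567.123 versus 123 567,123) can be reproduced with a small Python sketch. The separator table and function name are assumptions made for the illustration; a plain space is used as the Russian group separator to match the comment, although real implementations typically use a non-breaking space.

```python
# Hypothetical per-language separators: (group separator, decimal separator).
SEPARATORS = {
    "en": (",", "."),
    "ru": (" ", ","),
}


def format_number(value: float, lang: str, decimals: int = 3) -> str:
    """Format value with the grouping and decimal separators of lang."""
    group, point = SEPARATORS.get(lang, SEPARATORS["en"])
    # Start from the English-style rendering, then swap in the separators.
    integer_part, _, fraction = f"{value:,.{decimals}f}".partition(".")
    formatted = integer_part.replace(",", group)
    return formatted + point + fraction if fraction else formatted
```

The same value therefore renders differently depending on which language the formatter is given, which is why a Russian page served by the English language class ends up with English-style numbers.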
(In reply to comment #18) > So I don't get your misunderstanding of how the interface language influences > the content language of the page. It just doesn't. When you surf the Commons > with English interface, you will unlikely get English contents on the Russian > page, right? :) So why on the Russian page should the content language rely > on > your interface settings? The Russian reader is supposed to have the Russian interface. If you speak Russian and have the Russian interface, and you read a policy page written solely in English, and the page contains a formatted number or a formatted date, then you'll see numbers and dates formatted as in Russian in the middle of English text. This doesn't look like a big problem to me; plus, I still don't see what page could be using PLURAL, which seems even more unlikely. > > ---- > > You may be mixing up translated pages with such things like Autotranslate, > which exploits a hack of {{int:lang}}. If it's not fixed, then you will > continue to be delivered Autotranslate'd templates in your interface message. > That's what I can see. Indeed, you answer that yourself: most of the mixed-language content is created by LangSwitch, LanguageSelect and Autotranslate; the inconsistencies created in single pages by PLURAL and so on are negligible, or rather consistent with the expected inconsistency. :) If I have a file page with descriptions in n languages, and a date or number, I prefer to have the date and number formatted according to my language, so that the 1 description in my interface language is completely correct (and the n-1 other descriptions are inconsistent), rather than having my 1 language wrong, 1 language I don't care about correct, and n-2 languages still inconsistent...
Please don't concentrate on particular cases. PLURAL was an example, and I'm talking about the whole subset of language tools. Step away from Commons and file pages (why in the hell would we change the language of those???). Let's wait for some other opinions, because we are both not hearing each other. (I hereby state that I'm not sure that I'm right, but I'm ready to defend my view of the problem in the next iterations.)
(In reply to comment #20) > PLURAL was an example, and I'm > talking about the whole language tools subset. The bug summary is about PLURAL... > Step aside from Commons, file > pages (why in the hell would we change the language of those???). > > Let's wait for some other opinions, because both we don't hear each other. (I > state hereby that I'm not sure that I'm right, but I'm ready to defend my > view > on the problem in next iterations.) I do hear you. I can vaguely imagine use cases myself and comment 18 helps with that, but the aim of this request is totally unclear. Then, again, it might be just me not seeing the obvious, but clarifying would help.