Last modified: 2014-11-19 10:23:03 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T36193, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 34193 - Add support for non-Arabic number systems
Status: REOPENED
Product: MediaWiki extensions
Classification: Unclassified
Component: ParserFunctions
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 4 votes
Target Milestone: MW 1.20 version
Assigned To: Nobody - You can work on this!
Keywords: patch, patch-reviewed
Duplicates: 32807 34335
Depends on: 14145 19412
Blocks: 34174 40760
 
Reported: 2012-02-03 17:55 UTC by Mark A. Hershberger
Modified: 2014-11-19 10:23 UTC (History)
CC List: 15 users

Attachments
Allow #expr to use non-arabic numerals (1.09 KB, patch)
2012-02-06 04:51 UTC, Mark A. Hershberger
Allow #expr and #time to use non-latin numerals (977 bytes, patch)
2012-02-07 15:24 UTC, Mark A. Hershberger
Month name parsing for core (1.96 KB, patch)
2012-02-14 21:32 UTC, Mark A. Hershberger

Description Mark A. Hershberger 2012-02-03 17:55:37 UTC
We support display of non-Arabic number systems.  We should add support to parser functions so that "{{#expr: {{CURRENTYEAR}} + 10}}" (see example in Bug 31371) will work.

Ideally, wikitext like "{{#expr: ২ + ৩}}" would work as well. 

https://bn.wikipedia.org/wiki/User:MarkAHershberger/sandbox
Comment 1 Prabhakar Sarma Neog 2012-02-03 18:44:59 UTC

*** This bug has been marked as a duplicate of bug 31371 ***
Comment 2 Prabhakar Sarma Neog 2012-02-03 18:49:29 UTC
Let's live with this bug for some time. I'll work on it eventually.
Comment 3 Mark A. Hershberger 2012-02-03 18:57:47 UTC
This is not a duplicate of Bug 31371 -- it is a request to add functionality to MW that MW does not currently have.  Bug 31371 was a request to disable numeric conversion on aswiki.
Comment 4 Prabhakar Sarma Neog 2012-02-03 19:22:03 UTC
Let's live with this bug for some time. I'll work on it eventually.

*** This bug has been marked as a duplicate of bug 31371 ***
Comment 5 Shiju Alex 2012-02-04 02:14:08 UTC
As already mentioned by Mark A. Hershberger in Comment 3, this bug is not a duplicate of bug 31371. 

And adding this functionality to MediaWiki is not only for the Assamese or Bengali languages. There are many other languages in the world that use non-Arabic numerals. In India alone, at least the Kannada, Gujarati, Oriya, Punjabi, and Marathi Wikipedias use numerals from their own native scripts. I am sure there are many other languages outside India which require similar support.

Prabhakar, please do not close bugs for the sake of pushing your POV. Mark has created this bug to add a major functionality to MediaWiki. And that functionality is very much required.
Comment 6 Jayanta Nath 2012-02-04 04:50:32 UTC
(In reply to comment #0)
> We support display of non-Arabic number systems.  We should add support to
> parser functions so that "{{#expr: {{CURRENTYEAR}} + 10}}" (see example in
> Bug 31371) will work.
> 
> Ideally, wikitext like "{{#expr: ২ + ৩}}" would work as well. 
> 
> https://bn.wikipedia.org/wiki/User:MarkAHershberger/sandbox


For {{#expr: {{CURRENTYEAR}} + 10}}
we use {{#expr: {{#time:xnY}} + 10}} on the Bengali Wikipedia.

But {{#expr: ২ + ৩}} will not work. No non-Latin Unicode digits work like that.
Comment 7 wikichaipau 2012-02-05 10:17:50 UTC
*** Bug 34174 has been marked as a duplicate of this bug. ***
Comment 8 Prabhakar Sarma Neog 2012-02-05 12:10:23 UTC
Okay, Shiju; I am sorry. However, this was not a POV comment. I speak and read both Assamese and Bengali well, I am at home in both cultures, and I feel as if both are my mother tongues. That is the only reason my post read like a POV comment.
Comment 9 Mark A. Hershberger 2012-02-06 04:46:11 UTC
(In reply to comment #6)
> For {{#expr: {{CURRENTYEAR}} + 10}}
> we use {{#expr: {{#time:xnY}} + 10}} on the Bengali Wikipedia.

This is helpful, thanks!

> But {{#expr: ২ + ৩}} will not work. No non-Latin Unicode digits work
> like that.

Since we only deal with base10 systems (AFAICT), this is just a small matter of programming.  We have the en->bn mapping for numbers stored in a file:

https://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/messages/MessagesBn.php?revision=110165&view=markup

So changing "২ + ৩" into something that parserfunctions can understand ("2 + 3" in this case) is fairly simple.  Here is an example of some code I just came up with:

  $revert = array_flip($digitTransformTable);  # $revert now contains the
                                               # flipped numeral translation
                                               # from MessagesBn.php
  $bn = "২ + ৩";
  $bnTR = strtr($bn, $revert);
  echo "$bnTR\n";
  echo eval("return $bnTR;"), "\n";

This code prints the following:

  2 + 3
  5

In fact, poking around a bit, most of this functionality is built into MW and just not used in ParserFunctions.

I'll attach a patch to fix this bug for ParserFunctions {{#expr}}, but, after applying it on my local wiki where $wgLanguageCode = "bn", I can create a page with the following:

  {{#expr: ২ + ৩}} <br>
  {{#expr: {{CURRENTYEAR}} + 10 + + ৩}}<br>
  {{CURRENTYEAR}}

And it will display:

  ৫
  ২০২৫
  ২০১২

I *think* this would solve a great deal of the problem.
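
The reverse mapping described in this comment can also be sketched outside MediaWiki. The Python below is only an illustration and is not the attached patch: DIGIT_TRANSFORM_TABLE is a hand-written stand-in for the Bengali table in MessagesBn.php, and to_latin_digits, to_native_digits and add_native are hypothetical helper names. It sums the terms directly rather than calling eval, which would be unsafe on user-supplied wikitext.

```python
# Hypothetical stand-in for the Bengali $digitTransformTable in
# MessagesBn.php (Latin digit -> Bengali digit). Written by hand for
# illustration; it is not read from MediaWiki.
DIGIT_TRANSFORM_TABLE = {
    "0": "০", "1": "১", "2": "২", "3": "৩", "4": "৪",
    "5": "৫", "6": "৬", "7": "৭", "8": "৮", "9": "৯",
}

# Flip the table (the analogue of PHP's array_flip) and precompute
# translation maps for both directions.
TO_LATIN = str.maketrans(
    {native: latin for latin, native in DIGIT_TRANSFORM_TABLE.items()}
)
TO_NATIVE = str.maketrans(DIGIT_TRANSFORM_TABLE)


def to_latin_digits(text: str) -> str:
    """Replace native digits with 0-9 and leave everything else alone."""
    return text.translate(TO_LATIN)


def to_native_digits(text: str) -> str:
    """Replace 0-9 with native digits and leave everything else alone."""
    return text.translate(TO_NATIVE)


def add_native(expression: str) -> str:
    """Evaluate a simple 'a + b + ...' sum of native integers without eval."""
    terms = to_latin_digits(expression).split("+")
    total = sum(int(term) for term in terms)
    return to_native_digits(str(total))
```

Under these assumptions, add_native("২ + ৩") gives ৫, matching the first line of the output shown above.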
Comment 10 Mark A. Hershberger 2012-02-06 04:51:02 UTC
Created attachment 9958
Allow #expr to use non-arabic numerals

We're in the middle of a code slush and this needs review, but it is a start.
Comment 11 Mark A. Hershberger 2012-02-06 05:23:21 UTC
I've put this on http://winkyfrown.com/wiki/ so you can try it out and let me know what you think.
Comment 12 wikichaipau 2012-02-06 11:43:11 UTC
(In reply to comment #11)
> I've put this on http://winkyfrown.com/wiki/ so you can try it out and let me
> know what you think.

That seems to work!  But I want to be sure---you are seeking a solution for all languages and not only for Bengali, right?
Comment 13 Mark A. Hershberger 2012-02-06 23:10:34 UTC
(In reply to comment #12)
> That seems to work!  But I want to be sure---you are seeking a solution for all
> languages and not only for Bengali, right?

The code there will work for any language that MW has a numeral mapping for.  This includes Arabic, several Indic languages and more.
Comment 14 Mark A. Hershberger 2012-02-07 15:23:26 UTC
Updated patch and testwiki with code for #time
Comment 15 Mark A. Hershberger 2012-02-07 15:24:59 UTC
Created attachment 9964
Allow #expr and #time to use non-latin numerals

update based on test wiki use
Comment 16 Mark A. Hershberger 2012-02-07 15:25:53 UTC
Note: some uses of the test wiki depend on templates that aren't there yet.  I'll add those later today.
Comment 17 Mark A. Hershberger 2012-02-07 23:31:20 UTC
Added the templates.  All that is needed now is a reverse conversion of month names.
Comment 18 Jayanta Nath 2012-02-08 04:42:16 UTC
Yes, non-Arabic numbers work fine, so now Bug 19412 needs to be fixed.
Comment 19 Mark A. Hershberger 2012-02-13 17:19:14 UTC
*** Bug 34335 has been marked as a duplicate of this bug. ***
Comment 20 Jayanta Nath 2012-02-14 10:32:00 UTC
At http://winkyfrown.com/wiki/:

{{#time:Y F j|{{{1|{{CURRENTYEAR}}}}}-{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}}}} outputs ২০১২ ফেব্রুয়ারি ১৪ (today, 2012 February 14) (OK)

But {{#time:F j|{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}} -2 days}}=Error:Invalid time (not OK).

If the first one works, why doesn't the second one?

Other errors are shown on the main page.
Comment 21 Mark A. Hershberger 2012-02-14 16:20:04 UTC
(In reply to comment #20)

> But {{#time:F j|{{{2|{{CURRENTMONTH}}}}}-{{{3|{{CURRENTDAY}}}}} -2
> days}}=Error:Invalid time (not OK).
> 
> If the first one works, why doesn't the second one?

If you look at the docs <https://www.mediawiki.org/wiki/Help:Extension:ParserFunctions#.23time> you'll see that this is expected behavior, and why.
Comment 22 Jayanta Nath 2012-02-14 16:57:06 UTC
Understood, but http://en.wikipedia.org/wiki/Wikipedia:Selected_anniversaries/doc shows the same error, while its output at http://en.wikipedia.org/wiki/Wikipedia:Selected_anniversaries/February_14 is OK.
Comment 23 Jayanta Nath 2012-02-14 17:00:46 UTC
Here are the test wiki results:
http://winkyfrown.com/wiki/index.php/Test_Wiki:Selected_anniversaries/February_14
Comment 24 Mark A. Hershberger 2012-02-14 21:18:52 UTC
I've updated the code on my testwiki to parse month names .... w000! Now, safesubst
Comment 25 Mark A. Hershberger 2012-02-14 21:32:30 UTC
Created attachment 10008
Month name parsing for core
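
As a rough picture of what reverse conversion of month names involves, here is a minimal Python sketch. It is not the attached core patch. MONTHS_BN is a hypothetical, deliberately partial table written for illustration, and parse_localized_date is an invented helper that only handles the "year month day" layout seen in the #time output earlier in this thread. Both the table keys and the input are NFC-normalized, since the same Bengali letter can be encoded in more than one way.

```python
import unicodedata
from datetime import date

# Bengali digits -> Latin digits (ten code points on each side).
BN_DIGITS = str.maketrans("০১২৩৪৫৬৭৮৯", "0123456789")


def _nfc(text: str) -> str:
    """Normalize to NFC so visually identical strings compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical, partial table: localized month name -> month number.
# A real implementation would take the names from the language files.
MONTHS_BN = {
    _nfc("ফেব্রুয়ারি"): 2,
    _nfc("মার্চ"): 3,
}


def parse_localized_date(text: str) -> date:
    """Parse 'YEAR MONTHNAME DAY' written with Bengali digits and names."""
    year, month_name, day = _nfc(text).translate(BN_DIGITS).split()
    return date(int(year), MONTHS_BN[month_name], int(day))
```

With this table, the string ২০১২ ফেব্রুয়ারি ১৪ parses to 14 February 2012; a month name missing from the table raises KeyError instead of being guessed.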
Comment 26 reza1615 2012-03-01 12:27:50 UTC
After updating to 1.19 it doesn't have any effect on fa.wiki! In particular, {{#expr: ۱+۱}} shows an error.
Comment 27 Mark A. Hershberger 2012-03-01 20:38:54 UTC
(In reply to comment #26)
> After updating to 1.19 it doesn't have any effect on fa.wiki! In particular,
> {{#expr: ۱+۱}} shows an error.

Yes, sorry.  It was too late to make it into 1.19.
Comment 28 Amir E. Aharoni 2012-03-24 08:34:37 UTC
The patch looks good to me, in the sense that it doesn't
Comment 29 Amir E. Aharoni 2012-03-24 09:17:28 UTC
[The previous comment was submitted by mistake in a very funny way.]

The patch looks good to me, in the sense that it doesn't seem to break anything major. I also tested in Devanagari on my local wiki and it worked. A test page in the live Hindi Wikipedia: https://hi.wikipedia.org/wiki/User:Amire80/native_numbers .

However:
1. (Probably) Small problem: It always runs parseFormattedNumber, even when that is not needed. Maybe it should run it only on wikis that use such numbers.
2. (Somewhat) Larger problem: This is correct if the requirement is to always present the result in the native numbers and if it is considered OK to mix native and non-native numbers. This may be fine, but is there a specification somewhere or is it just a random decision?
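
One way to picture point 1 is a cheap guard that skips the conversion whenever the input contains no non-ASCII digits. The Python sketch below is an assumption about how such a guard could look, not a description of parseFormattedNumber. needs_digit_conversion and normalize_digits are hypothetical names, and the check relies on Unicode's decimal-digit property rather than on MediaWiki's per-language tables.

```python
import unicodedata


def needs_digit_conversion(text: str) -> bool:
    """True if the text contains any decimal digit outside ASCII 0-9."""
    return any(ch.isdecimal() and not ch.isascii() for ch in text)


def normalize_digits(text: str) -> str:
    """Map every Unicode decimal digit to its ASCII equivalent."""
    if not needs_digit_conversion(text):
        return text  # fast path: nothing to convert
    return "".join(
        str(unicodedata.decimal(ch)) if ch.isdecimal() else ch
        for ch in text
    )
```

On a wiki that only ever sees 0-9, the guard returns the input unchanged after a single pass over the string.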
Comment 30 Bawolff (Brian Wolff) 2012-03-24 23:06:12 UTC
Note this is a dupe of bug 30318 which is marked wontfixed. I personally think this will break stuff (The expr part. I have no opinion on the date stuff). It will cause expressions to be interpreted totally differently depending on language. Do we really want {{#expr: 10.1+1}} to be either 101 or 11.1 depending on wiki language? It would prevent people from copying templates from different languages.




(Note there is a workaround of doing {{#expr: {{FORMATNUM:{{CURRENTYEAR}}|R}} + 10}} for the example use case presented in comment 0)
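
The ambiguity can be made concrete with a small Python sketch. parse_formatted_number here is a hypothetical helper, not MediaWiki's parseFormattedNumber, and the two separator conventions are assumptions chosen for illustration. The same operand 10.1 is read as 10.1 when "." is the decimal mark and as 101 when "." is the grouping mark.

```python
def parse_formatted_number(text: str, group_sep: str, decimal_sep: str) -> float:
    """Parse a formatted number under one separator convention.

    Grouping separators are dropped first, then the decimal separator
    is rewritten to "." so that float() can read the result.
    """
    cleaned = text.replace(group_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# Same input string, two conventions, two different operands.
as_decimal_point = parse_formatted_number("10.1", group_sep=",", decimal_sep=".")
as_grouping_point = parse_formatted_number("10.1", group_sep=".", decimal_sep=",")
```

A template constant written with one convention therefore changes value when the template is copied to a wiki that uses the other.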
Comment 31 Bawolff (Brian Wolff) 2012-03-24 23:09:20 UTC
*** Bug 32807 has been marked as a duplicate of this bug. ***
Comment 32 Mark A. Hershberger 2012-03-25 00:03:15 UTC
(In reply to comment #30)
> Do we really want {{#expr: 10.1+1}} to be either 101 or 11.1
> depending on wiki language? It would prevent people from copying templates from
> different languages.

I don't understand how it would prevent copying templates, unless the templates deal with formatting numbers -- and this change would make formatting numbers clearly dependent on the language of the wiki.

I think this is most clearly useful for wikis that target people in India (http://en.wikipedia.org/wiki/Indian_numbering_system).  Since that is an area the WMF is targeting, I think that should be considered.

We target language users by providing wikis in their language so that they're comfortable using them, but, then, when it comes to a basic part of their interaction with the wiki -- numbers and dates -- we require that they adapt themselves to Western conventions.

Sophisticated users are probably fine with the current situation, but from the brief look I've had at discussions on hiwiki and bnwiki, there are a significant number of people there who would like something they feel more comfortable with.

(In reply to comment #29)
> 2. (Somewhat) Larger problem: This is correct if the requirement is to always
> present the result in the native numbers and if it is considered OK to mix
> native and non-native numbers. This may be fine, but is there a specification
> somewhere or is it just a random decision?

Agreed, there should be broader discussion about this.
Comment 33 Bawolff (Brian Wolff) 2012-03-25 01:36:56 UTC
>I don't understand how it would prevent copying templates, unless the templates
>deal with formatting numbers -- and this change would make formatting numbers
>clearly dependent on the language of the wiki.

Well all decimal numbers are "formatted". Some templates need constants in them.

Example: Put this patch on your test wiki. Set wiki language to nl, import [[template:Precision]] from en.wikipedia, watch it break (It gives wrong answers for non-formatted, and gives errors for formatted).

>We target language users by providing wikis in their language so that they're
>comfortable using them, but, then, when it comes to a basic part of their
>interaction with the wiki -- numbers and dates -- we require that they adapt
>themselves to Western conventions.

I wouldn't call using {{#expr}} a basic part of wiki editing. It's very easy to create templates using #expr that read formatted numbers so that the average user doesn't have to deal with it. Yes, it would be nice if it all magically worked, but I'm worried this introduces further problems. (I suppose one could call {{Formatnum:...}} on every constant in a template, and hence this would just shift responsibility for who has to call formatnum.)
Comment 34 bennylin 2012-03-25 07:56:50 UTC
Would the scope of this bug also include [[Chinese numerals]]?
Comment 35 Bawolff (Brian Wolff) 2012-03-25 16:59:36 UTC
(In reply to comment #34)
> Would the scope of this bug also include [[Chinese numerals]]?

At the moment the Chinese language files are set to use plain old 0123456789 type numerals. I believe this bug is more about supporting just whatever the default formatted number output for a language is, which does not include [[Chinese numerals]].
Comment 36 Jayanta Nath 2012-03-25 17:58:27 UTC
Ping: Bug 19412 must be fixed along with this bug.
Comment 37 Mark A. Hershberger 2012-03-25 19:06:33 UTC
(In reply to comment #34)
> Would the scope of this bug also include [[Chinese numerals]]?

This would be a start to supporting [[Chinese numerals]], but AFAICT,
[[Indian numerals]] are more straight-forward.

That is, I don't think the scope of this bug would cover [[Chinese numerals]], but a bug to support [[Chinese numerals]] would imply that the more straight-forward [[Indian numerals]] are already supported.

I say this because the representation for 12,345,678,902,345 from
[[Chinese numerals]] is not a one-to-one mapping.  Instead it is
十二兆三千四百五十六億七千八百九十萬兩千三百四十五 where in Devanagari it would be 
१२३४५६७८९०२३४५
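
The difference can be sketched in Python. Devanagari digits are positional decimal digits, so Python's built-in int() reads them directly. The Chinese form needs an actual parser. parse_chinese_numeral below is a simplified, hypothetical one that assumes 萬 = 10^4, 億 = 10^8 and 兆 = 10^12 and ignores many real-world variants, so it only illustrates why a per-character lookup table is not enough.

```python
# Digit characters and the two kinds of multiplier used in the example.
DIGITS = {"零": 0, "一": 1, "二": 2, "兩": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"萬": 10**4, "億": 10**8, "兆": 10**12}  # assumed values


def parse_chinese_numeral(text: str) -> int:
    """Parse a numeral written with digit, small-unit and big-unit characters."""
    total = 0    # sum of sections already closed by a big unit
    section = 0  # value accumulated since the last big unit
    digit = 0    # digit waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit such as a leading 十 counts as one of that unit.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            total += (section + digit) * BIG_UNITS[ch]
            section = digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit


# Devanagari needs no parser at all: int() accepts Unicode decimal digits.
devanagari_value = int("१२३४५६७८९०२३४५")
chinese_value = parse_chinese_numeral("十二兆三千四百五十六億七千八百九十萬兩千三百四十五")
```

Both values come out as 12345678902345, but only the Devanagari string gets there by a one-to-one digit substitution.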
Comment 38 Siddhartha Ghai 2012-03-25 19:47:45 UTC
(In reply to comment #29)
> [The previous comment was submitted by mistake in a very funny way.]
> 
> The patch looks good to me, in the sense that it doesn't seem to break anything
> major. I also tested in Devanagari on my local wiki and it worked. A test page
> in the live Hindi Wikipedia:
> https://hi.wikipedia.org/wiki/User:Amire80/native_numbers .
> 
> However:
> 1. (Probably) Small problem: It always runs parseFormattedNumber, even when
> that is not needed. Maybe it should run it only on wikis that use such numbers.
> 2. (Somewhat) Larger problem: This is correct if the requirement is to always
> present the result in the native numbers and if it is considered OK to mix
> native and non-native numbers. This may be fine, but is there a specification
> somewhere or is it just a random decision?

Note that the hi-wp subpage was set up on the assumption that hi-wp uses Devanagari numerals by default. However, hi-wp's default numerals are Arabic numerals, so the test is only half correct. A subpage on hi-wikt for the same test is [[:w:hi:wikt:User:Siddhartha Ghai/native numbers]]. There you'll find that using Arabic numerals gives Arabic numerals, while using Devanagari numerals or mixed Arabic-Devanagari input gives errors.
Comment 39 Siddhartha Ghai 2012-03-25 19:51:25 UTC
Note: hi-wikt uses Devanagari numerals as default.
Comment 40 Mark A. Hershberger 2012-03-25 20:43:44 UTC
(In reply to comment #39)
> Note: hi-wikt uses Devanagari numerals as default.

But Devanagari numerals don't work with parser functions, which is (part of) what this bug is about.

That is {{#expr: १ + १}} returns an error, though the output of
{{#expr: 1 + 1}} would be 2.

I've set up a test on
http://hi.wiktionary.org/wiki/User:MarkAHershberger/bug34193
Comment 41 Siddhartha Ghai 2012-03-27 03:44:13 UTC
(In reply to comment #40)
> But Devanagari numerals don't work with parser functions, which is (part of)
> what this bug is about.

I know. Just wanted to clarify that a test on hi-wp does not provide the correct picture (which can be seen at hi-wikt).

Just to add, it's really important from a user perspective to have the ability to use parser functions with non-Arabic numerals.
Comment 42 Robin Pepermans (SPQRobin) 2012-12-19 20:03:06 UTC
This doesn't seem like a tracking bug, so removing bug 2007 as depending on this one (and updating title and removing "tracking" keyword).
Comment 43 Mark A. Hershberger 2013-08-28 13:21:35 UTC
See https://en.wikipedia.org/wiki/User_talk:MarkAHershberger#bn:Template:Convert where [[User:Johnuniq]] indicates that he is doing Lua work on this bug.  Maybe making this superfluous?

Certainly, it seems on-wiki control is better (in some sense) than this patch.
