Last modified: 2009-12-28 16:52:01 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T23946, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 21946 - Sorted wikitables do not properly handle minus signs
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Normal minor
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:

Reported: 2009-12-24 20:47 UTC by Ozob
Modified: 2009-12-28 16:52 UTC (History)


Attachments
patch against 60371 (1.54 KB, patch)
2009-12-24 22:23 UTC, Conrad Irwin

Description Ozob 2009-12-24 20:47:44 UTC
Sorted wikitables of type currency do not recognize minus signs. In wikibits.js, ts_currencyToSortKey could be changed from

  return ts_parseFloat(s.replace(/[^0-9.,]/g,''));

to

  return ts_parseFloat(s.replace(/[^-0-9.,]/g,''));

and this would half-fix the problem. But it does not fully fix the problem, because this recognizes the hyphen, -, but not the HTML minus sign, −. Columns of type numeric do not recognize minus signs, either. An example of the latter bug can be viewed at:

http://en.wikipedia.org/w/index.php?title=Wikipedia:Arbitration_Committee_Elections_December_2009&diff=prev&oldid=332579916 (Broken sort using minus signs)
http://en.wikipedia.org/w/index.php?title=Wikipedia:Arbitration_Committee_Elections_December_2009&diff=next&oldid=332579916 (Working sort using hyphens)

Sorting on minus signs in columns of type numeric could be fixed by going to ts_parseFloat and changing

  num = parseFloat(s.replace(/,/g, ""));

to

  num = parseFloat(s.replace(/,/g, "").replace(/−/gi, "-").replace(/&(?:minus|#x0*2212|#0*8722);/gi, "-"));

which would convert HTML minus signs to hyphens before attempting to parse the number; but this would not handle minus signs in currency values, because they would be removed by ts_currencyToSortKey before ts_parseFloat is called.

A more comprehensive solution to this is to substitute characters for entity references in ts_resortTable before the preprocessor is called (or maybe even before the preprocessor is chosen). To fix the bugs with minus signs it would suffice to convert minus sign references as above, but it may be desirable to convert all entity references.
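
As a rough illustration of the normalization described above, the following sketch converts the minus-sign forms to an ASCII hyphen before any parsing takes place. The names normalizeMinus and parseNumberCell are hypothetical and do not exist in wikibits.js; parseNumberCell is only a simplified stand-in for ts_parseFloat, assumed here to return 0 for non-numeric input.

```javascript
// Hypothetical helper (not part of wikibits.js): map the minus-sign forms
// discussed above to an ASCII hyphen-minus.
function normalizeMinus(s) {
  return s
    .replace(/\u2212/g, "-")                          // literal U+2212 MINUS SIGN
    .replace(/&(?:minus|#x0*2212|#0*8722);/gi, "-");  // &minus;, &#x2212;, &#8722;
}

// Simplified stand-in for ts_parseFloat: normalize the sign, strip
// thousands separators, then parse. Non-numeric input yields 0 (assumed).
function parseNumberCell(s) {
  var num = parseFloat(normalizeMinus(s).replace(/,/g, ""));
  return isNaN(num) ? 0 : num;
}
```

Because the sign is normalized first, a currency preprocessor that keeps "-" (as in the first regex change above) would no longer discard the minus sign.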
Comment 1 Conrad Irwin 2009-12-24 22:23:38 UTC
Created attachment 6901 [details]
patch against 60371

* allows U+2212 (MINUS SIGN) in place of - in numbers (but not dates).
* allows a space between the minus sign and the number.
* allows a minus sign in a currency (before or after the initial currency marker)
* sorts non numerics in number columns as -Infinity instead of 0 (I assume that all at one end was the intention)
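
The attached patch itself is not reproduced on this page. The sketch below is only one way the four listed behaviours could be implemented; numberSortKey and currencySortKey are hypothetical names, and the regular expressions are assumptions rather than the committed code.

```javascript
// Hypothetical sort key for numeric columns.
function numberSortKey(s) {
  // Optional sign (ASCII + or -, or U+2212), optional spaces, then digits
  // with optional thousands separators and an optional decimal part.
  var m = /^\s*([\+\-\u2212])?\s*(\d[\d,]*(?:\.\d+)?|\.\d+)/.exec(s);
  if (!m) {
    return -Infinity; // non-numeric cells all sort together at one end
  }
  var value = parseFloat(m[2].replace(/,/g, ""));
  return (m[1] === "-" || m[1] === "\u2212") ? -value : value;
}

// Hypothetical sort key for currency columns.
function currencySortKey(s) {
  // Keep only digits, separators and sign characters, so a sign on either
  // side of the currency marker survives ("-$5" and "$-5" both give "-5").
  return numberSortKey(s.replace(/[^\d.,+\-\u2212]/g, ""));
}
```

With these definitions, a cell such as "− 12" is read as negative twelve, and a cell such as "n/a" sorts below every number.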
Comment 2 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-12-24 23:03:29 UTC
Looks good.  Committed as r60376, thanks.
Comment 3 Thana 2009-12-27 06:47:46 UTC
(In reply to comment #1)
> Created an attachment (id=6901) [details]
> patch against 60371
> 
> * allows U+2212 (MINUS SIGN) in place of - in numbers (but not dates).
> * allows a space between the minus sign and the number.
> * allows a minus sign in a currency (before or after the initial currency
> marker)
> * sorts non numerics in number columns as -Infinity instead of 0 (I assume that
> all at one end was the intention)
> 

This patch uses [+-\u2212] which will match anything from U+002B PLUS SIGN to
U+2212 MINUS SIGN. Need to escape the hyphen as it is no longer adjacent to the
brackets. Best practice would be to include the backslash regardless. Escaping
the plus sign to avoid confusion would not hurt either, thus [\+\-\u2212].
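
The difference between the two character classes can be checked directly. This is a small standalone sketch; the variable names are illustrative only.

```javascript
// Unescaped hyphen between two characters defines a range:
// every code point from U+002B (+) through U+2212 (MINUS SIGN).
var unescaped = /[+-\u2212]/;

// Escaped hyphen is a literal, so the class holds exactly three characters.
var escaped = /[\+\-\u2212]/;

var digitMatchesUnescaped = unescaped.test("5"); // "5" (U+0035) lies inside the range
var digitMatchesEscaped = escaped.test("5");     // "5" is not one of the three characters
var minusMatchesEscaped = escaped.test("\u2212");
```

Here digitMatchesUnescaped is true while digitMatchesEscaped is false, which is the over-matching described in this comment.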
Comment 4 Tony Souter 2009-12-27 09:00:48 UTC
I tried adding U+2212 to a two-digit numeral in a table: doesn't work.

I must say I strongly support Ozob's filing of this bug; I do hope we can find a way to use the proper symbols in tables. 
Comment 5 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-12-27 15:48:08 UTC
(In reply to comment #3)
> This patch uses [+-\u2212] which will match anything from U+002B PLUS SIGN to
> U+2212 MINUS SIGN. Need to escape the hyphen as it is no longer adjacent to the
> brackets. Best practice would be to include the backslash regardless. Escaping
> the plus sign to avoid confusion would not hurt either, thus [\+\-\u2212].

Good catch.  Regex is fun.  Fixed in r60430.

(In reply to comment #4)
> I tried adding U+2212 to a two-digit numeral in a table: doesn't work.

The fix has been committed to trunk.  It isn't live on Wikipedia yet, that will happen who knows when.  Note that there's currently no reliable way of telling what revision Wikipedia is at without poking through SVN logs.  It's currently at r57447, I think, and has been since early October.
Comment 6 Tony Souter 2009-12-27 15:51:06 UTC
What, three thousand "revisions" behind? To a tech-moron like me, it sounds strange. But I believe you. Let's hope they do a thousand in a stroke. Thanks.
Comment 7 Happy-melon 2009-12-27 21:51:42 UTC
Nope, that's standard practice, especially with the tech team being so short-staffed at this time.  Scaps are usually several thousand revisions at a time.
Comment 8 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-12-27 22:02:10 UTC
They didn't use to be.  A year or two ago we had scaps every week or so.  Hopefully we'll return to those halcyon days in the imminent future, but until then we are where we are.
Comment 9 Thana 2009-12-28 16:46:24 UTC
> Note that there's currently no reliable way of telling what revision
> Wikipedia is at without poking through SVN logs.  It's currently at
> r57447, I think, and has been since early October.

[[Special:Version]] says r59858. Is that not reliable?
Comment 10 Happy-melon 2009-12-28 16:48:53 UTC
Nope. :-D
Comment 11 Tony Souter 2009-12-28 16:52:01 UTC
It should be tagged as such, then. I'm all for common WPs knowing just a little of the big picture, the basics, of the techie side.
