Last modified: 2012-01-08 20:16:05 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T35571, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 33571 - PHP floor( log10() ) gives wrong results and need a replacement


Summary:	PHP floor( log10() ) gives wrong results and need a replacement

Status:	RESOLVED FIXED

Product:	MediaWiki
Classification:	Unclassified
Component:	General/Unknown (Other open bugs)
Version:	1.20.x
Hardware:	All All

Importance:	Unprioritized normal (vote)
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:	29169
Blocks:
	Show dependency tree / graph

Reported:	2012-01-06 22:44 UTC by Antoine "hashar" Musso (WMF)
Modified:	2012-01-08 20:16 UTC (History)
CC List:	2 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Attachments
test of the C functions (435 bytes, text/plain) 2012-01-07 14:22 UTC, Platonides	Details
Add an attachment (proposed patch, testcase, etc.)

Description Antoine "hashar" Musso (WMF) 2012-01-06 22:44:19 UTC

Language::formatBitrate( 1000000000000000 ) gives '1,000Tbps' when it should be '1Pbps'.

The end result is 'languages/LanguageTest.php' test suite gives a failure:

1) LanguageTest::testFormatBitrate with data set #5 (1000000000000000, '1Pbps', '1 petabit per second')
formatBitrate('1000000000000000'): 1 petabit per second
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'1Pbps'
+'1,000Tbps'


The root cause is using floor() on a float returned by log(). That might be rounded down to the wrong integer :-/

As an example on http://codepad.org/VBkuWR8m

print floor(log10(1000000000000000)) . "\n";
// outputs 15, correct
print floor(log(1000000000000000,10)) . "\n";
// outputs 14 (WRONG)

print floor(log10(1000)) . "\n";
// outputs 3, correct
print floor(log( 1000, 10 )) . "\n";
// outputs 2 (WRONG)


On a Mac with PHP:
PHP 5.3.6 with Suhosin-Patch (cli) (built: Sep  8 2011 19:34:00) 
Copyright (c) 1997-2011 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2011 Zend Technologies
    with Xdebug v2.1.2, Copyright (c) 2002-2011, by Derick Rethans

print floor(log10(1000000000000000)) . "\n";
// outputs 14 (WRONG)
print floor(log(1000000000000000,10)) . "\n";
// outputs 15, correct

print floor(log10(1000)) . "\n";
// outputs 3, correct
print floor(log( 1000, 10 )) . "\n";
// outputs 3, correct


r108284 used log( foo, 10) instead of log10() but that broke the continuous integration system which use Ubuntu :D

Comment 1 Platonides 2012-01-06 23:27:43 UTC

That's not specific to Mac OS.

Comment 2 Sam Reed (reedy) 2012-01-06 23:33:00 UTC

(In reply to comment #1)
> That's not specific to Mac OS.

It seems to only do it on Mac OS... 5.3.8 on windows locally and 5.3.6 on ubuntu don't do it...

Comment 3 Platonides 2012-01-06 23:38:16 UTC

I see it in 5.3.8, 5.2.13, 4.4.9... (x86_64) probably depends of the arithmetic processor.
Is that windows/ubunto Intel or AMD?

Maybe we could always add an appropiate epsilon.

print floor(log(1000000000000000,10)+0.000000000000001);
15
print floor(log(1000000000000000,10)+0.0000000000000001);
14

Comment 4 Sam Reed (reedy) 2012-01-06 23:40:14 UTC

(In reply to comment #3)
> I see it in 5.3.8, 5.2.13, 4.4.9... (x86_64) probably depends of the arithmetic
> processor.
> Is that windows/ubunto Intel or AMD?
> 
> Maybe we could always add an appropiate epsilon.
> 
> print floor(log(1000000000000000,10)+0.000000000000001);
> 15
> print floor(log(1000000000000000,10)+0.0000000000000001);
> 14

Mines Windows 7 x64 Intel



reedy@gallium:~$ uname -a
Linux gallium 2.6.32-35-server #78-Ubuntu SMP Tue Oct 11 16:26:12 UTC 2011 x86_64 GNU/Linux


reedy@gallium:~$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Xeon(R) CPU           X3450  @ 2.67GHz

Comment 5 Antoine "hashar" Musso (WMF) 2012-01-07 09:39:49 UTC

PHP log() and log10() are not reliable:

php -r 'printf("%.32f\n", log(pow(10,15),10) );'
15.00000000000000355271367880050093

php -r 'printf("%.32f\n", log10(pow(10,15)) );'
14.99999999999999822364316059974954

Both should give exactly 15.


php -r 'printf("%.32f\n", log(pow(10,15)-1,10) );'
15.00000000000000177635683940025046

That last one should give 14.99<something>


So I think it is safer to reimplements floor(log10()) by avoiding float type entirely.

Comment 6 Platonides 2012-01-07 14:22:56 UTC

Created attachment 9821 [details]
test of the C functions

php -r 'printf("%.32f\n", log(pow(10,15)-1,10) );'
14.99999999999999822364316059974954

php -r 'printf("%.32f\n", log10(pow(10,15)) );'
15.00000000000000000000000000000000

php -r 'printf("%.32f\n", log(pow(10,15)-1,10) );'
14.99999999999999822364316059974954



It's more or less normal to have a few differences between machines in floating point behavior http://c-faq.com/fp/strangefp.html

log(x), log10(x) and floor(x) are simple wrappers to the C functions. However, log(x,y) is log(x) / log(10), which is where I am getting the difference.

I think we could instead of floor use round(x, 0, PHP_ROUND_HALF_DOWN);
i.e. round(log(1000000000000000,10), 0, PHP_ROUND_HALF_DOWN) it has a number of checks to get the appropiate value.

Comment 7 Brion Vibber 2012-01-08 20:05:45 UTC

Isn't it pretty normal for floating point numbers to not exactly match decimal representations?

Sounds like a comparison somewhere is assuming that numbers will be exact decimal matches when they actually aren't, and needs to be more liberal in its comparisons.

Comment 8 Brion Vibber 2012-01-08 20:16:05 UTC

Fixed in r108363.

Worked around log10 by simply duplicating/tweaking the very similar code in the function immediately below, which does the same thing with 1024 instead of 1000 unit sizes, uses only simple division, and passes the test cases.

Might want to merge those two into a common function and pass it its common parameters.

Wikimedia Bugzilla is closed!

Search

Personal tools

Navigation

Links