Last modified: 2014-05-06 15:40:19 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T45972, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 43972 - [upstream] Jenkins: MediaWiki unit tests segfault on gallium
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Continuous integration
Version: unspecified
Hardware: All All
Importance: Low major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: upstream
Duplicates: 43390 44306 47069
Depends on:
Blocks: 45594
 
Reported: 2013-01-14 21:54 UTC by Antoine "hashar" Musso (WMF)
Modified: 2014-05-06 15:40 UTC (History)


Attachments
backtrace by Tim of a unit test segfault (3.03 KB, text/plain)
2013-01-14 21:56 UTC, Antoine "hashar" Musso (WMF)
backtrace with Zend functions shown (2.65 KB, text/plain)
2013-01-16 13:03 UTC, Antoine "hashar" Musso (WMF)
backtrace with PHP 5.3.10-1ubuntu3.7+wmf1 provided by Alexandros (1.62 KB, text/plain)
2013-09-03 15:38 UTC, Antoine "hashar" Musso (WMF)
2nd backtrace with PHP 5.3.10-1ubuntu3.7+wmf1 (1.90 KB, text/plain)
2013-09-03 15:53 UTC, Antoine "hashar" Musso (WMF)
3rd backtrace with suhosin canary mm disabled (1.35 KB, text/plain)
2013-09-11 23:12 UTC, Alexandros Kosiaris
backtrace of Wikibase tests on travis (4.14 KB, text/plain)
2014-01-02 16:13 UTC, Antoine "hashar" Musso (WMF)

Description Antoine "hashar" Musso (WMF) 2013-01-14 21:54:04 UTC
Change https://gerrit.wikimedia.org/r/#/c/43775/, made against mediawiki/core.git on branch 1.21wmf7, caused our PHPUnit tests to segfault (exit code 139).

Under the misc tests https://integration.mediawiki.org/ci/job/mediawiki-core-phpunit-misc/1244/console :


phpunit-misc:
     [echo] Builddir: /var/lib/jenkins/jobs/mediawiki-core-phpunit-misc/workspace
     [echo] Logdir..: /var/lib/jenkins/jobs/mediawiki-core-phpunit-misc/workspace/logs/
     [echo] Indir...: /var/lib/jenkins/jobs/mediawiki-core-phpunit-misc/workspace/tests/phpunit
     [echo] Opts....: --group Database --exclude-group API,Dump,Parser,Broken,ParserFuzz,Stub -- 
     [exec] PHPUnit 3.7.10 by Sebastian Bergmann.
     [exec]  
     [exec] Configuration read from /var/lib/jenkins/jobs/mediawiki-core-phpunit-misc/workspace/tests/phpunit/suite.xml
     [exec]  
     [exec] .........................................
     [exec] ....................   61 / 5298 (  1%)

BUILD FAILED
/var/lib/jenkins/jobs/_shared/build.xml:452: The following error occurred while executing this line:
/var/lib/jenkins/jobs/_shared/build.xml:473: exec returned: 139
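
A note on the 139 in the log above: shells conventionally report a process killed by signal N as exit status 128 + N, and signal 11 is SIGSEGV on Linux. The following is a minimal sketch of that decoding; the function name describe_exit_status is hypothetical and is not part of the Jenkins job or build.xml.

```shell
#!/bin/sh
# Decode a shell exit status. By convention, a status above 128 means the
# process was terminated by signal (status - 128).
# describe_exit_status is a hypothetical helper written for illustration.
describe_exit_status() {
    des_status=$1
    if [ "$des_status" -gt 128 ]; then
        echo "killed by signal $((des_status - 128))"
    else
        echo "exited with status $des_status"
    fi
}

describe_exit_status 139   # prints: killed by signal 11
```

On most systems `kill -l 11` can be used to translate the signal number into its name.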



Tim ran the test under gdb and it showed a segfault in preg_match_all() in
PHPUnit_Util_Test::getRequirements(), when running
self::REGEX_REQUIRES. Since we don't seem to use @requires, I just
replaced getRequirements() with "return array()", and then my
changeset passed all tests.

Here's the full backtrace:

Program received signal SIGSEGV, Segmentation fault.
zval_mark_grey (pz=0xa7f82a0) at
/root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:368
368    /root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c: No such file
or directory.
(gdb) bt
#0  zval_mark_grey (pz=0xa7f82a0) at
/root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:368
#1  0x00000000006b73ac in zval_mark_grey (pz=<optimized out>)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:379
#2  0x00000000006b7e75 in gc_mark_roots () at
/root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:435
#3  gc_collect_cycles () at
/root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:664
#4  0x00000000006b8174 in gc_zval_possible_root (zv=<optimized out>)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:166
#5  0x00000000006a7e30 in zend_hash_destroy (ht=0xa7f80f0)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_hash.c:729
#6  0x00000000006994df in _zval_dtor_func (zvalue=0xa7e7598)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_variables.c:46
#7  0x0000000000473c08 in _zval_dtor (zvalue=0xa7e7598)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_variables.h:35
#8  php_pcre_match_impl (pce=0x8fbfcb0,
    subject=0xa7aba48 "/**\n * These tests should work regardless of
$wgCapitalLinks\n * @group Database\n */\n/**\n\t * Make sure
MediaWikiTestCase extending classes have called their\n\t * parent
setUp method\n\t */", subject_len=184, return_value=0xa7ec5e0,
subpats=0xa7e7598, global=1, use_flags=0,
    flags=0, start_offset=0) at
/root/wikimedia/php5/php5-5.3.10/ext/pcre/php_pcre.c:549
#9  0x0000000000473e6b in php_do_pcre_match (ht=3,
return_value=0xa7ec5e0, global=1,
    return_value_ptr=<optimized out>, this_ptr=<optimized out>,
return_value_used=<optimized out>)
    at /root/wikimedia/php5/php5-5.3.10/ext/pcre/php_pcre.c:519
#10 0x000000000070f80d in zend_do_fcall_common_helper_SPEC
(execute_data=0x7ffff7ee1f00)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_vm_execute.h:320
#11 0x00000000006c037b in execute (op_array=0x1d5f6c0)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_vm_execute.h:107
#12 0x000000000068d8bc in zend_call_function (fci=0x7fffffffba60,
fci_cache=<optimized out>)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_execute_API.c:969
#13 0x00000000005d0178 in zif_call_user_func_array (ht=<optimized
out>, return_value=0xa722870,
    return_value_ptr=<optimized out>, this_ptr=<optimized out>,
return_value_used=<optimized out>)
    at
/root/wikimedia/php5/php5-5.3.10/ext/standard/basic_functions.c:4803
#14 0x000000000070f80d in zend_do_fcall_common_helper_SPEC
(execute_data=0x7ffff7edf5c0)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_vm_execute.h:320
#15 0x00000000006c037b in execute (op_array=0x901c008)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend_vm_execute.h:107
#16 0x000000000069b8e0 in zend_execute_scripts (type=8, retval=0x0,
file_count=3)
    at /root/wikimedia/php5/php5-5.3.10/Zend/zend.c:1308
#17 0x0000000000647f53 in php_execute_script (primary_file=0x7fffffffe1d0)
    at /root/wikimedia/php5/php5-5.3.10/main/main.c:2323
#18 0x000000000042c797 in main (argc=10, argv=0x7fffffffe3d8)
    at /root/wikimedia/php5/php5-5.3.10/sapi/cli/php_cli.c:1188
Comment 1 Antoine "hashar" Musso (WMF) 2013-01-14 21:56:28 UTC
Created attachment 11627 [details]
backtrace by Tim of a unit test segfault

Above backtrace attached in a text file for convenience.
Comment 2 Antoine "hashar" Musso (WMF) 2013-01-14 21:58:36 UTC
We probably want to further isolate the unit test that causes this issue, report it upstream (PHP and PHPUnit), and make it easier to reproduce. If it still crashes, we might want to try another PHP version or a nightly build.
Comment 3 Tim Starling 2013-01-14 23:48:21 UTC
I think recompiling PHP with the bundled PCRE source rather than the system library would be the first thing to try. Faidon may be able to help with that. If that doesn't fix it, then you could try PHP 5.3.x git head.

It's probably best to try a different PHP version before you isolate and report the issue, since the folks at bugs.php.net are unlikely to be interested in a segfault in a package they don't maintain.

If the bug isn't present in the latest 5.3.x, then it will probably be our responsibility to fix or work around it.
Comment 4 Antoine "hashar" Musso (WMF) 2013-01-15 09:12:53 UTC
High priority, since this is blocking merges into the wmf branches and several people have complained about it since yesterday.
Comment 5 Antoine "hashar" Musso (WMF) 2013-01-15 10:46:48 UTC
I have removed the hack in PHPUnit and upgraded it to 3.7.13. Running from a local copy works for me, as does using the workspace of change 44039, which did segfault :/
Comment 6 Antoine "hashar" Musso (WMF) 2013-01-16 11:54:36 UTC
Ah, I managed to reproduce the segfault from time to time using Gerrit change #44221 patchset 1.

Command used:
 WORKSPACE=/home/hashar/core JOB_NAME=testing_segfault_job_name ant -file /var/lib/jenkins/jobs/_shared/build.xml phpunit-databaseless
Comment 7 Antoine "hashar" Musso (WMF) 2013-01-16 12:42:30 UTC
SELF NOTE:

On gallium I did:

# My private clone of mediawiki
cd ~/core
# Apply change 44221 patchset 1:
git fetch https://gerrit.wikimedia.org/r/mediawiki/core refs/changes/21/44221/1 && git checkout -b 44221/1 FETCH_HEAD
# Change to the subdir; apparently running from the root directory
# of the working copy does not trigger the segfault (or I haven't tried enough)
cd tests/phpunit
# run gdb:
gdb --args php phpunit.php --conf /home/hashar/core/LocalSettings.php --exclude-group Database,Broken,ParserFuzz,Stub --log-junit /home/hashar/core/logs/junit.xml  --; echo $?
(gdb) run
# wait for segfault

Program received signal SIGSEGV, Segmentation fault.
zval_mark_grey (pz=0xa196d18) at /root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c:368
368	/root/wikimedia/php5/php5-5.3.10/Zend/zend_gc.c: No such file or directory.

# Ask for a backtrace:
(gdb) bt
# snip backtrace, which is above already.
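
Since the segfault only shows up from time to time, a retry loop can save some manual reruns before attaching gdb. This is only a sketch under assumptions: run_until_segfault is a hypothetical helper, it treats exit status 139 as the segfault marker seen in the Jenkins log, and it discards the command's output.

```shell
#!/bin/sh
# Re-run a command until it exits with status 139 (SIGSEGV) or the attempt
# budget is exhausted. run_until_segfault is a hypothetical helper.
# Usage: run_until_segfault MAX_ATTEMPTS COMMAND [ARGS...]
run_until_segfault() {
    rus_max=$1
    shift
    rus_attempt=1
    while [ "$rus_attempt" -le "$rus_max" ]; do
        # Output is discarded here; only the exit status matters.
        "$@" >/dev/null 2>&1
        rus_status=$?
        if [ "$rus_status" -eq 139 ]; then
            echo "segfault on attempt $rus_attempt"
            return 0
        fi
        rus_attempt=$((rus_attempt + 1))
    done
    echo "no segfault in $rus_max attempts"
    return 1
}
```

For example, from tests/phpunit: `run_until_segfault 20 php phpunit.php --conf /home/hashar/core/LocalSettings.php --exclude-group Database,Broken,ParserFuzz,Stub`.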
Comment 8 Antoine "hashar" Musso (WMF) 2013-01-16 12:45:15 UTC
Frame #2 of the backtrace references gc_mark_roots. Googling for "PHP segfault gc_mark_roots" turns up https://bugs.php.net/bug.php?id=63055, which has the same backtrace when running the test suites for Drupal and/or Symfony2.

Laruence __ php.net says:
 any usage of zval_dtor with recursive array may trigger this segfault.

We indeed see a call to _zval_dtor in our backtrace (line #7).
Comment 9 Antoine "hashar" Musso (WMF) 2013-01-16 13:03:36 UTC
Created attachment 11638 [details]
backtrace with Zend functions shown

Using the .gdbinit from PHP, I found what Tim found ages ago, namely that the segfault is triggered by a preg_match_all() call.


(gdb) source /home/hashar/gdbinit
(gdb) zbacktrace

[0x7ffff7ee37c8] preg_match_all("/@requires\s+(?P<name>function|extension)\s(?P<value>([^\40]+))\r?$/m", "\12/**\12\11\40*\40@dataProvider\40provideWfMatchesDomainList\12\11\40*/", array(7)[0xa196c78]) /usr/share/php/PHPUnit/Util/Test.php:125 
[0x7ffff7ee32d8] PHPUnit_Util_Test::getRequirements("GlobalTest", "testWfMatchesDomainList") /usr/share/php/PHPUnit/Framework/TestCase.php:557 
[0x7ffff7ee2a00] PHPUnit_Framework_TestCase->setRequirementsFromAnnotation() /usr/share/php/PHPUnit/Framework/TestCase.php:585 
[0x7ffff7ee12c0] PHPUnit_Framework_TestCase->checkRequirements() /usr/share/php/PHPUnit/Framework/TestCase.php:822 
[0x7fffffffbab0] PHPUnit_Framework_TestCase->runBare() 
[0x7ffff7ee0e88] call_user_func_array(array(2)[0x9ad9a28], array(0)[0xa195ff8]) /usr/share/php/PHP/Invoker.php:93 
[0x7ffff7edf4c0] PHP_Invoker->invoke(array(2)[0x9ad9a28], array(0)[0xa195ae0], 2) /usr/share/php/PHPUnit/Framework/TestResult.php:646 
[0x7ffff7ede140] PHPUnit_Framework_TestResult->run(object[0x2334d50]) /usr/share/php/PHPUnit/Framework/TestCase.php:769 
[0x7ffff7edd438] PHPUnit_Framework_TestCase->run(object[0x9225990]) /home/hashar/core/tests/phpunit/MediaWikiTestCase.php:116 
[0x7ffff7edd320] MediaWikiTestCase->run(object[0x9225990]) /usr/share/php/PHPUnit/Framework/TestSuite.php:775 
[0x7ffff7edbb10] PHPUnit_Framework_TestSuite->runTest(object[0x2334d50], object[0x9225990]) /usr/share/php/PHPUnit/Framework/TestSuite.php:745 
[0x7ffff7eda2e8] PHPUnit_Framework_TestSuite->run(object[0x9225990], false, array(0)[0x9225d70], array(4)[0x9225d20], false) /usr/share/php/PHPUnit/Framework/TestSuite.php:705 
[0x7ffff7ed8ac0] PHPUnit_Framework_TestSuite->run(object[0x9225990], false, array(0)[0x9225d70], array(4)[0x9225d20], false) /usr/share/php/PHPUnit/Framework/TestSuite.php:705 
[0x7ffff7ed7298] PHPUnit_Framework_TestSuite->run(object[0x9225990], false, array(0)[0x9533340], array(4)[0x95334f0], false) /usr/share/php/PHPUnit/Framework/TestSuite.php:705 
[0x7ffff7ed45b0] PHPUnit_Framework_TestSuite->run(object[0x9225990], false, array(0)[0x95343a0], array(4)[0x9534550], false) /usr/share/php/PHPUnit/TextUI/TestRunner.php:346 
[0x7ffff7ed39f8] PHPUnit_TextUI_TestRunner->doRun(object[0x1b4adf8], array(7)[0x9535338]) /usr/share/php/PHPUnit/TextUI/Command.php:176 
[0x7ffff7ed3800] PHPUnit_TextUI_Command->run(array(10)[0x3638e90], false) /home/hashar/core/tests/phpunit/MediaWikiPHPUnitCommand.php:61 
[0x7ffff7ed34b0] MediaWikiPHPUnitCommand->run(array(10)[0x3639e68], true) /home/hashar/core/tests/phpunit/MediaWikiPHPUnitCommand.php:47 
[0x7ffff7ed3068] MediaWikiPHPUnitCommand::main() /home/hashar/core/tests/phpunit/phpunit.php:107
Comment 10 Antoine "hashar" Musso (WMF) 2013-01-16 13:13:20 UTC
Upstream bug PHP #63055 https://bugs.php.net/bug.php?id=63055
Comment 11 Antoine "hashar" Musso (WMF) 2013-01-16 13:22:38 UTC
Tim proposed using a different PHP version and/or PCRE library. According to upstream bug 63055, the bug is present in PHP 5.4.x as well, so I have reinstated Tim's live hack to PHPUnit:

vim /usr/share/php/PHPUnit/Util/Test.php

    public static function getRequirements($className, $methodName)
    {
        // HASHAR TIM hack bug https://bugzilla.wikimedia.org/43972
        return array();
    ...
    }


That is a workaround for the bug.
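
The live hack makes PHPUnit ignore every @requires annotation, so it is only safe as long as no test relies on one. A rough way to check that is sketched below; list_requires_users is a hypothetical helper, and the pattern is a simplified stand-in for PHPUnit's own regex, not a copy of it.

```shell
#!/bin/sh
# List files under a directory that contain an "@requires <something>"
# annotation. list_requires_users is a hypothetical helper; the pattern is a
# simplified approximation of what PHPUnit looks for.
# Prints nothing (and returns non-zero) when no file matches.
list_requires_users() {
    grep -rlE '@requires[[:space:]]+[^[:space:]]' -- "$1"
}
```

For example, `list_requires_users tests/phpunit` run from the root of a MediaWiki checkout should print nothing if the workaround is harmless for that tree.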
Comment 12 Tim Landscheidt 2013-01-16 16:28:41 UTC
*** Bug 43390 has been marked as a duplicate of this bug. ***
Comment 13 Antoine "hashar" Musso (WMF) 2013-01-17 19:04:30 UTC
Lowering priority since we have applied a workaround
Comment 14 Antoine "hashar" Musso (WMF) 2013-01-24 17:35:36 UTC
*** Bug 44306 has been marked as a duplicate of this bug. ***
Comment 15 Antoine "hashar" Musso (WMF) 2013-01-24 17:38:05 UTC
Upstream bug apparently got solved http://git.php.net/?p=php-src.git;a=commit;h=ccc519b7a92bfe4b191c0e2e3869516171247ac2 

That commit is in:


$ git branch -r --contains ccc519b7a92bfe4b191c0e2e3869516171247ac2
  origin/HEAD -> origin/master
  origin/PHP-5.4
  origin/PHP-5.4.10
  origin/PHP-5.4.11
  origin/PHP-5.4.9
  origin/PHP-5.5
  origin/immutable-date
  origin/master

So I guess PHP >= 5.4.9 is fine :-)
Comment 16 Antoine "hashar" Musso (WMF) 2013-01-24 17:40:42 UTC
and PHP >= 5.3.19
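
The version guess from the two comments above can be written down as a small check. This is a sketch only: php_has_upstream_fix is a hypothetical helper, the thresholds (5.3.19 and 5.4.9) are the guesses stated in this thread rather than something verified against PHP release notes, and it only concerns upstream bug 63055, not any other segfault.

```shell
#!/bin/sh
# Return 0 if a PHP version string is at or above the thresholds guessed in
# this thread for the upstream fix (5.3.19 in the 5.3 series, 5.4.9 in the
# 5.4 series, anything newer), 1 otherwise.
# php_has_upstream_fix is a hypothetical helper written for illustration.
php_has_upstream_fix() {
    phf_version=${1%%-*}            # drop a distro suffix such as "-1ubuntu3.7+wmf1"
    phf_major=${phf_version%%.*}
    phf_rest=${phf_version#*.}
    phf_minor=${phf_rest%%.*}
    phf_patch=${phf_rest#*.}
    phf_patch=${phf_patch%%.*}

    if [ "$phf_major" -gt 5 ]; then return 0; fi
    if [ "$phf_major" -lt 5 ]; then return 1; fi
    if [ "$phf_minor" -ge 5 ]; then return 0; fi
    if [ "$phf_minor" -eq 4 ] && [ "$phf_patch" -ge 9 ]; then return 0; fi
    if [ "$phf_minor" -eq 3 ] && [ "$phf_patch" -ge 19 ]; then return 0; fi
    return 1
}
```

With the package version installed on gallium at the time, `php_has_upstream_fix 5.3.10-1ubuntu3.7+wmf1` returns non-zero, which matches the conclusion that an upgrade or a backported patch is needed.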
Comment 17 Antoine "hashar" Musso (WMF) 2013-02-11 10:45:39 UTC
Moving bug back in pool. This will be fixed whenever we upgrade to PHP 5.3.19+
Comment 18 Antoine "hashar" Musso (WMF) 2013-04-09 08:58:03 UTC
Got another occurrence when running the full test suite:

https://integration.wikimedia.org/ci/job/mediawiki-core-master-phpunit-all/1454/consoleFull

/var/lib/jenkins/jobs/_shared/build.xml:437: The following error occurred while executing this line:
/var/lib/jenkins/jobs/_shared/build.xml:482: exec returned: 139
Comment 19 Antoine "hashar" Musso (WMF) 2013-04-09 08:59:27 UTC
I can confirm the workaround described in Comment #11 is still present. So we must have yet another segfault issue :(
Comment 20 Antoine "hashar" Musso (WMF) 2013-04-10 08:46:50 UTC
*** Bug 47069 has been marked as a duplicate of this bug. ***
Comment 21 Antoine "hashar" Musso (WMF) 2013-05-21 09:04:46 UTC
Pinged ops-l list about it.  Seems to me we want to cherry-pick the upstream change in our PHP package.
Comment 22 Antoine "hashar" Musso (WMF) 2013-05-25 14:19:54 UTC
RT https://rt.wikimedia.org/Ticket/Display.html?id=5209
Comment 23 Antoine "hashar" Musso (WMF) 2013-08-01 12:53:10 UTC
No activity on RT, I have pinged it.
Comment 24 Antoine "hashar" Musso (WMF) 2013-09-03 14:30:15 UTC
Alexandros provided some new packages. I have manually installed them on gallium:


dpkg -i \
libapache2-mod-php5_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-cli_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-common_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-curl_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-dbg_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-dev_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-gd_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-intl_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-mysql_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-pgsql_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-sqlite_5.3.10-1ubuntu3.7+wmf1_amd64.deb \
php5-tidy_5.3.10-1ubuntu3.7+wmf1_amd64.deb
Comment 25 Antoine "hashar" Musso (WMF) 2013-09-03 14:32:33 UTC
I have retriggered the code coverage job which was segfaulting: https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/
Comment 26 Antoine "hashar" Musso (WMF) 2013-09-03 15:38:22 UTC
Created attachment 13224 [details]
backtrace with PHP 5.3.10-1ubuntu3.7+wmf1 provided by Alexandros
Comment 27 Antoine "hashar" Musso (WMF) 2013-09-03 15:53:20 UTC
Created attachment 13225 [details]
2nd backtrace with PHP 5.3.10-1ubuntu3.7+wmf1

Another backtrace. zbacktrace has no clue; phpbt yields:

No symbol "execute_data" in current context.
Comment 28 Antoine "hashar" Musso (WMF) 2013-09-04 09:18:41 UTC
PHPUnit 3.7.22 includes a workaround for https://bugs.php.net/bug.php?id=63055
Comment 29 Antoine "hashar" Musso (WMF) 2013-09-09 21:20:46 UTC
phpunit 3.7.24 has been deployed last week on gallium.

I am upgrading the PHP packages to keep them in sync with production. That gets rid of Alexandros' PHP patches, but since PHPUnit has a workaround, that should be fine.

Retriggering the coverage job at https://integration.wikimedia.org/ci/job/mediawiki-core-code-coverage/
Comment 30 Gerrit Notification Bot 2013-09-11 21:28:58 UTC
Change 83940 had a related patch set uploaded by Hashar:
disable suoshin mem handler for code coverage

https://gerrit.wikimedia.org/r/83940
Comment 31 Alexandros Kosiaris 2013-09-11 23:12:38 UTC
Created attachment 13273 [details]
3rd backtrace with suhosin canary mm disabled

After running the job with SUHOSIN_MM_USE_CANARY_PROTECTION=0 (disabling suhosin's mm canary protection), there was a different backtrace. Attaching it here.
Comment 32 Antoine "hashar" Musso (WMF) 2013-10-18 08:56:31 UTC
PHP still segfaults but it happens very late in PHP execution (during shutdown), so the HTML is actually generated and published at https://integration.wikimedia.org/cover/mediawiki-core/master/php/
Comment 33 Gerrit Notification Bot 2013-10-23 13:52:19 UTC
Change 83940 abandoned by Hashar:
disable suoshin mem handler for code coverage

Reason:
does not prevent PHP from segfaulting ..

https://gerrit.wikimedia.org/r/83940
Comment 34 Marius Hoch 2014-01-02 14:56:01 UTC
Just want to note that Wikibase also has troubles with phpunit on PHP 5.3.27 (on travis-ci).

Backtrace:
http://pastebin.com/Me7zsvmk
Comment 35 Antoine "hashar" Musso (WMF) 2014-01-02 16:13:01 UTC
Created attachment 14213 [details]
backtrace of Wikibase tests on travis

Attaching to bug the backtrace pasted at http://pastebin.com/Me7zsvmk
Comment 36 Gerrit Notification Bot 2014-02-28 13:19:22 UTC
Change 116093 had a related patch set uploaded by Hashar:
Coverage now ignore phpunit ignores

https://gerrit.wikimedia.org/r/116093
Comment 37 Gerrit Notification Bot 2014-02-28 13:24:28 UTC
Change 116093 merged by jenkins-bot:
Coverage now ignore phpunit ignores

https://gerrit.wikimedia.org/r/116093
Comment 38 Andre Klapper 2014-04-25 06:52:48 UTC
All patches merged; resetting ticket status
Comment 39 Antoine "hashar" Musso (WMF) 2014-05-06 15:40:19 UTC
There are no more segfaults happening.
