Last modified: 2014-10-06 19:42:06 UTC
It seems that when all the browser test builds that use headless Firefox launch at the same time, they overload the Jenkins host that runs all of those Xvfb sessions. The symptoms are most visible in a build like this one:

https://integration.wikimedia.org/ci/view/BrowserTests/job/browsertests-Flow-en.wikipedia.beta.wmflabs.org-linux-firefox/

Note that the builds at the 3:45 and 6:43 marks are the ones kicked off automatically along with all the other browser test builds.

The failures are due to errors like these (Firefox fails to start):

unable to bind to locking port 7054 within 45 seconds
undefined method `close' for nil:NilClass (NoMethodError)

and also:

unable to obtain stable firefox connection in 60 seconds (127.0.0.1:7055)
too many connection resets (due to Timeout::Error - Timeout::Error) after 286 requests on 23958180, last used 60.017253158 seconds ago

Builds started manually don't seem to have these problems launching the browser, connecting to it, or getting responses.
Chris: Is this still occurring for headless Fx after the throttling Antoine imposed?
From Chris on IRC: "we tried headless firefox and brought the Jenkins host to its knees. we are 100% SauceLabs at this point" My question is moot, then.
For an unrelated change, I eventually found some time to look at Xvfb and the headless Ruby gem. In short: we have a race condition in the mediawiki_selenium gem which causes the Xvfb on display 99 to be killed by another one running in parallel. The fix is to allocate a different display number or stop killing the Xvfb :-D That is filed as Bug 71602 - mediawiki_selenium always use the same default xvfb display 99
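
To illustrate the "allocate a different display" option, here is a minimal sketch in Python. It is not the mediawiki_selenium or headless gem code. It assumes the usual X server convention of a lock file named /tmp/.X<N>-lock for display N, and the function name and parameters are hypothetical.

```python
import os


def first_free_display(start=99, limit=200, lock_dir="/tmp"):
    """Return the lowest display number in [start, limit) with no X lock file.

    Assumption: a running X server (Xvfb included) holds <lock_dir>/.X<N>-lock
    for display N. Hypothetical helper, for illustration only.
    """
    for n in range(start, limit):
        lock_file = os.path.join(lock_dir, ".X{}-lock".format(n))
        if not os.path.exists(lock_file):
            return n
    raise RuntimeError("no free X display number found")
```

A job would then start Xvfb on ":<n>" and export DISPLAY to match. There is still a window between the check and the Xvfb launch, so two jobs starting at the same instant could pick the same number. Deriving the display from something unique per job, such as the Jenkins EXECUTOR_NUMBER variable, would avoid that.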