Last modified: 2014-11-19 21:11:37 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in and, except for displaying bug reports and their history, links might be broken. See T47499, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 45499 - Jenkins: Run jobs in disposable VMs
Status: NEW
Product: Wikimedia
Classification: Unclassified
Component: Continuous integration
Version: unspecified
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 53978 51556 53594
Blocks: 43266
Reported: 2013-02-27 17:04 UTC by Antoine "hashar" Musso (WMF)
Modified: 2014-11-19 21:11 UTC (History)

Description Antoine "hashar" Musso (WMF) 2013-02-27 17:04:21 UTC
We need to run unit tests under Jenkins whenever someone submits a new patchset, regardless of the submitter's status (i.e. a potential attacker).  To achieve that, the Jenkins jobs need to be properly isolated.

Following an internal WMF meeting, we could set up virtual machines using Vagrant. They would be booted on a second box and controlled by the Jenkins Vagrant plugin.
Comment 1 Antoine "hashar" Musso (WMF) 2013-03-11 05:01:53 UTC
This still needs to be planned. We had a quick meeting involving Chad, Chris, Ori and Željko.


First steps:
 - get a new box and have it isolated from production
 - write Jenkins slave manifest

Later on:
 - migrate CloudBee to our system
 - get Jenkins isolated on the new box
 - have Jenkins slave and tests in vagrant machines

Objective: get tests to run for anyone submitting a change in Gerrit.


AppArmor:
 - restricts a process, e.g. which files it can access
 - might be hard to scale / maintain since we will have to update it constantly
 - applied on a global basis. Can't really be used as a wrapper for unit tests.

Vagrant:
 - you can reproduce the issue
 - the VM can access a specific directory of its host, ideal for sharing
 - base image used to boot
 - then apply a chef/puppet manifest


The steps would thus be:
- set up the Vagrant Jenkins plugin in labs
- prepare a Vagrant box that will be able to receive code, run tests and submit back to Jenkins
- figure out whether the Vagrant virtual instance should run on a second (and network isolated) server.
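
A minimal sketch of what one such disposable run could look like when driven from a script rather than the Jenkins plugin. It only uses the standard Vagrant CLI calls (`vagrant up`, `vagrant ssh -c`, `vagrant destroy -f`); the function names and the test command are hypothetical, and the Jenkins integration itself is not shown.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def disposable_vm_plan(test_command: str) -> List[Command]:
    """Return the Vagrant CLI calls for one throwaway test run."""
    return [
        ["vagrant", "up"],                       # boot the VM from its base box
        ["vagrant", "ssh", "-c", test_command],  # run the tests inside the guest
        ["vagrant", "destroy", "-f"],            # discard the VM without prompting
    ]


def run_plan(plan: Sequence[Command],
             runner: Callable[[Command], int] = subprocess.call) -> int:
    """Run the steps in order and always attempt the final destroy step."""
    *steps, destroy = plan
    status = 0
    try:
        for step in steps:
            status = runner(step)
            if status != 0:
                break  # stop at the first failing step
    finally:
        runner(destroy)  # the VM is thrown away whether the tests passed or not
    return status
```

Calling `run_plan(disposable_vm_plan("cd /vagrant && ./run-tests.sh"))` would boot the VM, run a (hypothetical) test script from the shared directory, and destroy the VM afterwards. Passing a different `runner` makes it possible to try the flow without Vagrant installed.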
Comment 2 Krinkle 2013-04-09 05:27:00 UTC
I'm not sure I follow comment 1. What is the plan? There seem to be some contradicting parts between here and past announcements and even within the above comment.

From what I recall the plan is to:

Preparation:
* Set up a VM that is provisioned with everything it needs to start executing tests (basic LAMP stack)
* Pause the VM and store it as a clone-able image. This image now takes nothing but disk space.

Flow for future jobs:
* Be able to quickly make a copy of the base image, which is then unpaused through Vagrant (no need for the VM to actually boot, as it was written to disk in an already hot state)
* Have Jenkins access this VM to set up the workspace there instead and run the tests as usual.

Of course this comes at a slight cost of overhead, so we'll have to optimise our Zuul configuration to not be as insane as it is now where practically every command is a separate build step in a single-build-step job. Concurrency is nice, but this is doing more harm than good.

And of course, separately from that, if we only use a single VM and do everything there, it can easily be compromised and we still wouldn't get anywhere. The whole point of executing these tests in a VM is so that we can *finally* get rid of the obnoxious system we have now, where tests are run after code review.
Comment 3 Antoine "hashar" Musso (WMF) 2013-05-06 20:43:51 UTC
Diederik from analytics mentioned they could use that for Limn. It is a node.js application that relies on a lot of npm modules.  The Parsoid team has a similar request. They currently work around that by having the tests run in a labs instance; Jenkins merely hits a web service in labs and waits for the result.
Comment 4 Faidon Liambotis 2013-05-07 04:13:16 UTC
I don't think AppArmor is a secure way to run sandboxes, or that it was designed for this purpose. Using it as an extra layer is a good idea, though.

I don't know Vagrant at all but my impression is that it spawns VMs. If so, that sounds reasonable from a security point of view, depending on the technology (KVM?) and access granted to that VM (network in particular).
Comment 5 Antoine "hashar" Musso (WMF) 2013-09-10 05:50:38 UTC
This needs jobs to be runnable on any Jenkins slave, which is bug 53594.
Comment 6 Greg Grossmeier 2013-11-04 18:01:45 UTC
(In reply to comment #0)
> We need to run unit tests under Jenkins whenever someone submits a new
> patchset, regardless of the submitter's status (i.e. a potential attacker).
> To achieve that, the Jenkins jobs need to be properly isolated.

To be explicit, another useful benefit of doing this:

This would allow us to turn on Jenkins for all repos (we currently don't because we're afraid of an evil test owning Jenkins). This would provide our developer community a more responsive experience when working on new extensions, for example.
Comment 7 Antoine "hashar" Musso (WMF) 2013-11-13 16:35:09 UTC
Most of the infrastructure and jobs are now roaming between the two production slaves.  We should thus be able to get slaves in a dedicated labs project.
Comment 8 Diederik van Liere 2013-11-13 17:10:46 UTC
Is docker.io maybe a solution to consider to quickly spin up a virtual image for a specific app? Docker is an open-source project to easily create lightweight, portable, self-sufficient containers from any application, but much faster than, for example, VirtualBox.
Comment 9 Greg Grossmeier 2013-11-13 17:53:45 UTC
(In reply to comment #8)
> Is docker.io maybe a solution to consider to quickly spin up a virtual image
> for a specific app? Docker is an open-source project to easily create
> lightweight, portable, self-sufficient containers from any application but
> much faster than, for example, VirtualBox.

Not sure why we'd use Docker instead of our already in-place and maintained WMF Labs infrastructure (which is OpenStack). Care to elaborate?
Comment 10 Matthew Flaschen 2013-11-13 22:51:58 UTC
Also, Docker does not create a full guest machine the way virtualization software does.  We would need to evaluate whether it provides sufficient isolation for the security requirements.
Comment 11 Krinkle 2014-01-03 14:50:02 UTC
Speed is not a concern since we'll have OpenStack maintain a pool of fresh and hot standby nodes. Each node is destroyed after use, and new ones are spun up ahead of time.

The pool itself will be maintained using copies of written-to-disk images of already provisioned virtual machines in their paused state. So it only takes copying of the image, and unpausing it, and maybe a few commands to register it.
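
A minimal sketch of the hot-standby pool idea, assuming hypothetical `provision` and `destroy` callables that stand in for whatever copies/unpauses an image and deletes an instance. No real OpenStack API is used here, and the refill is synchronous for simplicity, whereas a real pool would replenish in the background.

```python
from collections import deque
from itertools import count
from typing import Callable, Deque, List


class StandbyPool:
    """Keep `size` pre-provisioned nodes ready; each node serves one job."""

    def __init__(self, size: int,
                 provision: Callable[[], str],
                 destroy: Callable[[str], None]) -> None:
        self._size = size
        self._provision = provision  # hypothetical: copy image, unpause, register
        self._destroy = destroy      # hypothetical: delete the instance
        self._ready: Deque[str] = deque()
        self.refill()

    def refill(self) -> None:
        """Provision nodes until the pool is back at its target size."""
        while len(self._ready) < self._size:
            self._ready.append(self._provision())

    def acquire(self) -> str:
        """Hand out a ready node, then top the pool up again."""
        node = self._ready.popleft()
        self.refill()
        return node

    def release(self, node: str) -> None:
        """Nodes are never reused: destroy them once the job is done."""
        self._destroy(node)


# Usage with fake provisioning, to show the node lifecycle.
ids = count(1)
destroyed: List[str] = []
pool = StandbyPool(2, provision=lambda: f"node-{next(ids)}",
                   destroy=destroyed.append)
node = pool.acquire()  # takes "node-1"; "node-3" is provisioned to replace it
pool.release(node)     # "node-1" is destroyed, never returned to the pool
```

Because the replacement is provisioned ahead of the next request, a job only ever waits for `acquire`, not for a VM to be created.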
Comment 12 Krinkle 2014-01-03 16:35:49 UTC
For the record, Antoine and I have the following in mind for now:

The plan described earlier is still on, except that, instead of Vagrant, we'll use the existing OpenStack infrastructure in place at Wikimedia Labs. In addition, the idea is to have a pool of fresh hot standby instances, so it'll be even faster in practice than merely copying/unpausing/connecting, which can still be a bit of a slowdown.
Comment 13 Faidon Liambotis 2014-01-03 16:40:04 UTC
I experimented in the past with minijail0, from the Chromium/Chrome project. It might be worthwhile to experiment some more; it will certainly be much, much faster than anything VM-related.
Comment 14 Antoine "hashar" Musso (WMF) 2014-02-10 09:43:26 UTC
(In reply to comment #13)
> I experimented in the past with minijail0, from the Chromium/Chrome project.
> It might be worthwhile to experiment some more, it will certainly be much much
> faster than anything VM-related.

The documentation is sparse :(  I am not sure how it manages to isolate a process. Apparently it works using a chroot, which I am not sure how we would set up :(
Comment 15 Krinkle 2014-11-19 21:11:37 UTC
There have been discussions about this on the engineering and ops mailing lists, as well as some write-ups and comparisons on Google Docs.

What was the outcome of this? Are we going with an OpenStack-based solution where we'll have a VM image and an automatically maintained pool of hot stand-by machines ready for use, which will be accessed by Jenkins over ssh to perform any commands (like we currently do for the persistent slaves) and destroyed afterwards?

The image can probably be very basic (perhaps even the same as for main labs, just plain Ubuntu Precise or Trusty), and instead use puppet to provision it. Since they'll be spun up ahead of time, it's okay for it to take a while to be provisioned. That way we won't have to deal with creating new images.
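
A minimal sketch of the per-job lifecycle asked about above: take a machine, run commands on it over ssh, and destroy it afterwards no matter how the job ends. The `acquire` and `destroy` callables, the `jenkins` user and the host name are all hypothetical placeholders; only the plain `ssh user@host command` form is assumed.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


def ssh_command(host: str, remote_command: str,
                user: str = "jenkins") -> List[str]:
    """Build the argv for running one command on a node over ssh."""
    return ["ssh", f"{user}@{host}", remote_command]


@contextmanager
def disposable_node(acquire: Callable[[], str],
                    destroy: Callable[[str], None]) -> Iterator[str]:
    """Yield a fresh node and destroy it afterwards, even if the job fails."""
    host = acquire()
    try:
        yield host
    finally:
        destroy(host)  # runs on success, failure, or an exception in the job


# Usage with stand-ins for the pool: the node is gone once the block exits.
destroyed: List[str] = []
with disposable_node(lambda: "ci-node-1.example", destroyed.append) as host:
    argv = ssh_command(host, "make test")  # would be passed to subprocess.call
```

Tying destruction to the end of the `with` block means a crashing or malicious test cannot leave a used machine behind for the next job.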
