Last modified: 2014-09-02 04:42:20 UTC
BetaLabs is awesome. It is catching a lot of breakages that would otherwise hit users. We're grateful for it. What are the specific limitations of BetaLabs that are preventing us from whole-heartedly trusting a breakage on BetaLabs as a blocker for wider deployment? Either mark those as blockers of this bug, or report them and mark them as blockers :-)
I understand it generally runs master rather than the current deployment branch, so it's not really useful for testing changes which happen outside of the normal CI cycle. It uses Varnish for text instead of Squid, exposing known bugs that do not occur in production. It apparently uses a different set of extensions to production. It uses a different deployment system to production, which makes it difficult to reproduce bugs related to non-atomic code tree update.
(In reply to comment #1)
> I understand it generally runs master rather than the current deployment
> branch, so it's not really useful for testing changes which happen outside of
> the normal CI cycle.

Right (if I'm understanding you correctly) this (beta cluster) won't catch things that aren't first merged to master for some amount of time before being on the production cluster.

> It uses Varnish for text instead of Squid, exposing known bugs that do not
> occur in production.

Unfortunate, but hopefully the switch of production text to Varnish will happen soon enough. Do you think it is worth it to switch Beta Cluster (back?) to Squid for the time being? I guess that depends on how long the Varnish text transition will end up taking...

> It apparently uses a different set of extensions to production.

Some of this is by design (eg: Flow), but I'm curious now which extensions differ and why...

> It uses a different deployment system to production, which makes it difficult
> to reproduce bugs related to non-atomic code tree update.

Right, maybe the wording of "true canary for code deployments" wasn't the best. Maybe, "true canary for production"? Deploying will be different on Beta Cluster until/when/if production moves to a Continuous Deployment system, no way to get around that. Luckily, experience with Beta Cluster should help inform that transition.
(In reply to comment #1)
> I understand it generally runs master rather than the current deployment
> branch, so it's not really useful for testing changes which happen outside of
> the normal CI cycle.

When we created beta, the aim was to catch bugs before they land in wmf branches. We used test.wikipedia.org to test out the wmf branch before syncing. Maybe we could set up some more wikis that would use the wmf branches as well.

> It uses Varnish for text instead of Squid, exposing known bugs that do not
> occur in production.

That followed a discussion I had with Mark over IRC. Since text Varnish was (and is) going to land in production, it seemed like a good idea to try it out on beta first. We did discover a few bugs and I think it helped move Varnish text forward. I would prefer we do not revert to Squid: its configuration is not handled via puppet and I don't think it is worth the effort.

> It apparently uses a different set of extensions to production.

There might be some differences. IIRC CheckUser has been explicitly disabled. But if an extension is missing we should add it and configure it for beta.

> It uses a different deployment system to production, which makes it difficult
> to reproduce bugs related to non-atomic code tree update.

We use a shared NFS export (/data/project), which is where deployment-bastion (aka tin) and the apaches/jobrunner read files from. So we just git pull and have instant deploy, just like we used to do a while ago with Zwinger.

Back in January 2013, we had git-deploy on beta to stage it before deploying in production. The project is apparently stalled and had some issues with labs, so we reverted to the NFS share. With Sartoris apparently getting some attention, the people working on it could well migrate beta to Sartoris.

Additionally, the reason we are not using scap is that it depends on a Debian package and a myriad of puppet changes. I don't have merge rights on operations/puppet.git and eventually got fed up trying to get changes merged in, so I just abandoned the idea of using scap.