Last modified: 2013-11-22 22:00:36 UTC
gitblit is hosted on antimony.wikimedia.org, which only has the default checks: puppet freshness, NTP and SSH. https://icinga.wikimedia.org/cgi-bin/icinga/status.cgi?search_string=antimony It needs a check that monitors whether gitblit is running. templates/icinga/nrpe_local.cfg.erb has a bunch of examples if you look for 'java'. An example for Jenkins: command[check_jenkins]=/usr/lib/nagios/plugins/check_procs -w 1:1 -c 1:1 --ereg-argument-array '^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war' This makes sure there is one and only one java process running jenkins.war.
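To illustrate what the Jenkins check above verifies, here is a minimal Python sketch of the same idea: count the command lines that match a regex and report critical unless exactly one matches. It is not the check_procs implementation; the function name, its parameters and the message text are made up for illustration, and the process list is passed in rather than read from the system.

```python
import re

# Pattern copied from the Jenkins example in this report.
JENKINS_PATTERN = r"^/usr/bin/java .*-jar /usr/share/jenkins/jenkins.war"


def check_process_count(cmdlines, pattern, minimum=1, maximum=1):
    """Hypothetical helper mimicking '-w 1:1 -c 1:1'.

    Returns (exit_code, message) using the Nagios plugin exit code
    convention: 0 = OK, 2 = CRITICAL.
    """
    regex = re.compile(pattern)
    # Count the command lines whose arguments match the pattern.
    count = sum(1 for cmdline in cmdlines if regex.search(cmdline))
    if minimum <= count <= maximum:
        return 0, "OK: %d matching process(es)" % count
    # Zero matches (service down) or more than one (duplicate instance).
    return 2, "CRITICAL: %d matching process(es)" % count
```

A Gitblit or Gerrit check would presumably use the same logic with a pattern matching that service's own java command line, which this report does not specify.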
Widening summary, want this for Gerrit too.
Change 75777 had a related patch set uploaded by Demon: Add icinga monitoring for Gerrit and Gitblit https://gerrit.wikimedia.org/r/75777
Setting importance to High because, well, Gerrit and Gitblit have been needing a bit of hand holding lately.
Pinged Chad / Leslie by email to move this forward.
Unassigning from Chad. We need someone with Puppet and firewall rule writing expertise to finish this off.
Filed https://rt.wikimedia.org/Ticket/Display.html?id=6342 to apply the ferm system on the servers hosting Gerrit/Gitblit (antimony and manganese) and enable monitoring ( https://gerrit.wikimedia.org/r/#/c/75777/ ).
Change 75777 merged by Akosiaris: Add icinga monitoring for Gerrit and Gitblit https://gerrit.wikimedia.org/r/75777
This is now in place, all checks look green. Should get appropriate alerts now when things go down badly :)
Thank you everyone! Alexandros, Ariel and David Zahn have been very helpful adding the ferm firewall configuration. Hurrah! Ideally we would validate that the monitoring is working properly by shutting down Gitblit and Gerrit and confirming that alerts are issued. But I might be too meticulous.