Last modified: 2011-03-13 18:06:24 UTC
It would be incredibly useful to add a second contact to nagios that reports to #wikimedia-toolserver on freenode whenever the status of any service in the toolserver groups changes. The groups to monitor:
- toolserver_database
- ts_ext_store
It would really help us to see that there is a server problem and know that it's not an issue in our scripts. I wasn't sure where to ask, but since this concerns nagios, which is part of the main Wikimedia setup rather than the toolserver setup, this seemed like the most appropriate place to make the request. Thanks, Matt
Any comments, any idea when/if this will be implemented?
Ask River (flyingparchment). Contact ts-admins [at] wikimedia.org.
I was under the impression that nagios was a part of the main cluster, and the toolserver staff wouldn't be able to do it without asking a main dev anyway.
(In reply to comment #3)
> I was under the impression that nagios was a part of the main cluster, and the
> toolserver staff wouldn't be able to do it without asking a main dev anyway.
River is both a toolserver and Wikimedia server root admin. Therefore he would be the best person to contact about this, being able to configure it properly on both ends and explain whether or not it is needed.
I have talked to River, and they said to talk to Jeluf. Also, it's only the nagios end that needs configuring.
nagios is not talking to irc itself. It's some strange script that does this. I don't see any well-maintainable way to tell nagios-wm which events it should report where. I'd recommend installing nagios on one of the toolservers.
I've connected the nagios-wm-echo bot to #wikimedia-tech and #wikimedia-toolserver. It echoes all toolserver-related nagios-wm messages from #wikimedia-tech to #wikimedia-toolserver.
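For anyone wanting to set up something similar, the filtering step of such an echo bot might look roughly like the Python sketch below. This is only an illustration under assumptions: the thread does not say how nagios-wm-echo decides what counts as toolserver-related, so the keyword list (taken from the group names in the original request) and the function names are hypothetical, and the IRC connection itself is left out.

```python
# Hypothetical keywords, based on the group names in the original request.
# The real bot's matching rules are not described in this thread.
TOOLSERVER_KEYWORDS = ("toolserver", "ts_ext_store")


def is_toolserver_related(message, keywords=TOOLSERVER_KEYWORDS):
    """Return True if the message mentions any keyword (case-insensitive)."""
    lowered = message.lower()
    return any(keyword in lowered for keyword in keywords)


def select_for_echo(messages, sender="nagios-wm"):
    """Pick the texts worth echoing from (nick, text) pairs.

    Only lines sent by `sender` are considered, so ordinary channel
    chatter that happens to mention the toolserver is not repeated.
    """
    return [
        text
        for nick, text in messages
        if nick == sender and is_toolserver_related(text)
    ]
```

A bot built this way would pass each line seen in #wikimedia-tech through select_for_echo and send whatever comes back to #wikimedia-toolserver. Matching on plain substrings keeps the rules easy to maintain, at the cost of occasionally echoing a line that only mentions a keyword in passing.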