Last modified: 2014-03-18 09:23:47 UTC
Right now Icinga spews out a huge blob of JSON when there is an Elasticsearch problem. That is difficult to read.
Also, we should warn if there are ever fewer than 3 Lucene indexes active per shard.
It'd be nice if this could detect a split brain as well. It'd be really nice if this warned on the Elasticsearch cluster as a whole rather than on individual hosts. It should still complain if it can't read a host, but issues that affect the whole cluster should alert once, not once per host.
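A rough sketch of what the cluster-level check could compute, in Python. It assumes the check has already fetched and parsed the `_cluster/health` response, and has asked each host which node it believes is master (for example from the `master_node` field of `_cluster/state`). The function names and the shape of `masters_by_host` are made up for illustration and are not taken from any existing plugin.

```python
# Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def summarize_health(health):
    """Reduce a parsed _cluster/health response to (exit_code, one-line message)."""
    status = health.get("status", "unknown")
    code = {"green": OK, "yellow": WARNING, "red": CRITICAL}.get(status, UNKNOWN)
    message = "%s - cluster %s is %s: %s nodes, %s unassigned shards" % (
        LABELS[code],
        health.get("cluster_name", "?"),
        status,
        health.get("number_of_nodes", "?"),
        health.get("unassigned_shards", "?"),
    )
    return code, message


def detect_split_brain(masters_by_host):
    """Return True if reachable hosts disagree about which node is master.

    masters_by_host (hypothetical shape) maps each polled host to the master
    node id it reports, or None if the host could not be read.
    """
    masters = set(m for m in masters_by_host.values() if m is not None)
    return len(masters) > 1


def unreachable_hosts(masters_by_host):
    """List the hosts that could not be read, so they are reported separately."""
    return sorted(h for h, m in masters_by_host.items() if m is None)
```

With something like this, a yellow cluster would produce a single line such as `WARNING - cluster <name> is yellow: 3 nodes, 2 unassigned shards` instead of the raw JSON, and an unreadable host would show up in `unreachable_hosts` without triggering one alert per host for a cluster-wide problem.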
From Antoine: There is a plugin to monitor clusters. Use case, documentation, and examples are at:
http://docs.icinga.org/latest/en/clusters.html
https://www.nagios-plugins.org/doc/man/check_cluster.html
The idea is to create a service whose state is based on the results of other services.
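To illustrate the "service based on the result of other services" idea, here is a minimal Python sketch of the aggregation step. It only mimics the concept: the function name and the "at least N non-OK members" threshold rule are assumptions for this example and do not reproduce check_cluster's actual option parsing or threshold semantics.

```python
# Nagios/Icinga plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def aggregate_states(member_states, warn_at, crit_at):
    """Derive one cluster-level state from the states of the member services.

    member_states: exit codes of the per-host services (0 means OK).
    warn_at / crit_at: hypothetical thresholds on the number of non-OK members.
    """
    non_ok = sum(1 for state in member_states if state != OK)
    if non_ok >= crit_at:
        return CRITICAL
    if non_ok >= warn_at:
        return WARNING
    return OK
```

For example, with three per-host Elasticsearch checks and thresholds of 1 and 2, a single failing host yields one WARNING on the cluster service, and two failing hosts yield one CRITICAL.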
Removing this from the list of bugs required to re-enable Cirrus, as it was really for ops and ops doesn't seem to be jumping up and down about it. I'm leaving it filed as NORMAL and I've got the process started. We'll get this, but not before next week.
Is this bug (and its friends in see also) a blocker for expanding Cirrus on the wikis which were already indexed? It would be really nice to make it default on, say, all Wiktionaries or all Wikiquotes and see what happens to the load.