Last modified: 2014-08-26 17:48:43 UTC
The replication lags of the database servers should be shown in Ganglia (cf. http://toolserver.org/~bryan/stats/replag/ for the Toolserver counterpart).
As a test, I have set up ~scfc/bin/replagstats to run every minute. The statistics are available at http://ganglia.wmflabs.org/ -> tools -> tools-login -> "Replication Lags metrics".
Isn't this done? http://ganglia.wmflabs.org/latest/graph_all_periods.php?c=tools&h=tools-login&v=29333&m=s5&r=week&z=default&jr=&js=&st=1384294144&z=large Or is there another step?
Looks like it's working fine to me.
(In reply to comment #3) > Looks like it's working fine to me. No, as discussed on IRC, it's still running under my personal account. As it would be useful to show replication lag for every MariaDB slave, I wanted to discuss this as a wider change with Asher. But: a) the chance never came about, and b) it's already there! For db1035, go to http://ganglia.wikimedia.org/latest/?c=MySQL%20eqiad&h=db1035.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2 and search for "mysql_slave_lag". However, this isn't available for labsdb* yet (cf. http://ganglia.wikimedia.org/latest/?c=MySQL%20eqiad&h=labsdb1001.eqiad.wmnet&m=cpu_report&r=hour&s=descending&hc=4&mc=2), and at the moment it can't be enabled anyway, as the monitoring for db1035 et al. assumes that only /one/ MariaDB instance runs on any server, while on labsdb* there are several, so mysql_slave_lag & Co. need to be prefixed by, for example, "s1_". So to resolve this bug, we need to: a) refactor the monitoring bits and pieces so that they handle multiple instances on one server, b) enable such monitoring for labsdb*, and c) create a ganglia::view where *_mysql_slave_lag for labsdb* is combined in one report so that the information isn't scattered over three pages and literally hundreds of graphs.
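To illustrate the prefixing described in the comment above, here is a minimal Python sketch. It is not taken from the existing monitoring code: the function name `prefixed_metric_names` and the input mapping of instance label to lag in seconds are assumptions made for illustration only. The metric base name "mysql_slave_lag" and the "s1_" style prefix come from the comment.

```python
def prefixed_metric_names(lags_by_instance, base_name="mysql_slave_lag"):
    """Return {metric_name: lag} with one prefixed metric per instance.

    lags_by_instance is a hypothetical mapping such as {"s1": 0, "s5": 12},
    keyed by the instance label and holding the lag in seconds.
    """
    return {
        f"{instance}_{base_name}": lag
        for instance, lag in sorted(lags_by_instance.items())
    }


if __name__ == "__main__":
    # Example values are made up; they are not measurements.
    for name, lag in prefixed_metric_names({"s1": 0, "s5": 12}).items():
        print(name, lag)
```

With names built this way, several instances on one server no longer report under the same metric name.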
(I moved ~scfc/bin/replagstats to ~tools.admin/bin/, converted it from a cron job to a continuous job and started it with jstart.)
(I needed to group the statistics at Ganglia under the virtual host "tools-replags".)
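One way to report metrics under a virtual host in Ganglia is gmetric's spoofing option. The sketch below only builds such a command line and does not run it. Whether replagstats does it this way is an assumption, as are the helper name `gmetric_command`, the metric type, the units and the placeholder address 192.0.2.1. Only the host name "tools-replags" comes from the comment above.

```python
def gmetric_command(name, value, spoof_host="tools-replags",
                    spoof_ip="192.0.2.1", units="seconds"):
    """Build a gmetric argument list that reports under a spoofed host.

    spoof_ip is a placeholder from the documentation address range;
    gmetric expects the spoof argument as "IP:hostname".
    """
    return [
        "gmetric",
        "--name", name,
        "--value", str(value),
        "--type", "uint32",
        "--units", units,
        "--spoof", f"{spoof_ip}:{spoof_host}",
    ]


if __name__ == "__main__":
    # Print the command instead of executing it.
    print(" ".join(gmetric_command("s1", 0)))
```

The resulting list can be passed to subprocess.run on a host where gmetric is installed.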