Last modified: 2014-09-22 22:01:39 UTC
"Server admin log"[1] is updated by morebots (an IRC + MediaWiki bridge bot run from the wikitech Linode instance). Via the deployment scripts[2] on fenari, logmsgbot outputs to IRC prefixed with "!log " for morebots to pick up and save to the wiki.

If freenode is not responding well, or if a netsplit separates these two bots from each other (happened again today...), deployments are not logged during that time.

A couple of options:
* Have these bots run on the same server (e.g. both from fenari.wmnet or both from wikitech.linode)
* Merge them into one bot (which would have to run from fenari in order to hook reliably into the deployment scripts)
* Have them communicate (also or only) directly with each other instead of via freenode (e.g. via some socket between the two servers).

[1] http://wikitech.wikimedia.org/history/Server_admin_log
[2] http://wikitech.wikimedia.org/view/bin
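
For illustration, the third option (a direct channel, with freenode kept as a fallback) could look roughly like the sketch below. This is not how either bot currently works: the function name `deliver_log`, the `irc_send` callback, and the idea of morebots listening on a TCP port are all assumptions made for the example.

```python
import socket


def deliver_log(message, direct_addr, irc_send, timeout=2.0):
    """Send a "!log" line over a direct TCP socket, falling back to IRC.

    direct_addr is a (host, port) tuple for a hypothetical listener on the
    receiving bot; irc_send is whatever callable already posts to the channel.
    Returns "direct" or "irc" depending on which path was used.
    """
    line = "!log " + message
    try:
        # Preferred path: bypass the IRC network entirely.
        with socket.create_connection(direct_addr, timeout=timeout) as conn:
            conn.sendall((line + "\n").encode("utf-8"))
        return "direct"
    except OSError:
        # Direct channel unreachable: use the existing IRC route instead.
        irc_send(line)
        return "irc"
```

With something like this, a netsplit only costs log entries when the direct connection between the two servers is down at the same time.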
There are two separate use cases here:
1) Someone in -operations !log'ing something to the SAL, like "!log I'm going to shutdown tampa, watch out!"
2) Various scripts/tools on the cluster that log things to the SAL, e.g. scap or git-deploy.

Can we decouple these two use cases? Passing #2 over an IRC server seems like bad design for all of the reasons stated.

Proposal:
* Use case #1 (logmsgbot) should log to Logstash directly in addition to the SAL (initially).
* Use case #2 (scap, trebuchet, etc.) should log to Logstash directly.
* Add support to logmsgbot to announce log entries from Logstash that originated there.
* We get rid of the wikitech SAL and create a nice looking public Logstash view.
* Use case #1 no longer logs to SAL. No more stupidly large edit histories!
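
As a rough sketch of what "log to Logstash directly" could mean for a deployment script: the code below assumes a Logstash `tcp` input configured with the `json_lines` codec. The host, port, the `type` value "sal", and the helper names `build_event` and `send_event` are placeholders for the example, not an existing setup.

```python
import json
import socket
from datetime import datetime, timezone


def build_event(message, source, user):
    """Build one log event as a dict; field names other than
    "@timestamp" and "message" are hypothetical."""
    return {
        "@timestamp": datetime.now(timezone.utc).isoformat(),
        "message": message,
        "type": "sal",       # assumed tag for admin-log entries
        "source": source,    # e.g. "scap" or "git-deploy"
        "user": user,
    }


def send_event(event, host, port, timeout=2.0):
    """Send one event as a newline-terminated JSON document over TCP."""
    payload = (json.dumps(event) + "\n").encode("utf-8")
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
```

A script would then call something like `send_event(build_event("Started scap", "scap", "someuser"), host, port)` instead of printing a "!log " line to IRC.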
(In reply to Greg Grossmeier from comment #1)
> * We get rid of the wikitech SAL and create a nice looking public Logstash
> view.

This is probably hard given the lack of ACLs in Logstash.