Last modified: 2011-08-08 09:32:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T30492, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 28492 - Write wmf replag ircbot
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: IRC
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Krinkle
Depends on:
Blocks:
Reported: 2011-04-11 17:19 UTC by Krinkle
Modified: 2011-08-08 09:32 UTC (History)
CC List: 5 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Krinkle 2011-04-11 17:19:58 UTC
Like toolserver's replag bot. It would get its data from the API:

  action=query&meta=siteinfo&siprop=dbrepllag
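
As a rough sketch of how a bot might use that query (not the bot's actual code): the snippet below builds the request URL and parses a JSON response. The `format=json` and `sishowalldb` parameters, the `api_base` value and the assumed response shape (`query.dbrepllag` as a list of `host`/`lag` entries) are assumptions added for illustration and are not taken from this report.

```python
import json
from urllib.parse import urlencode

# Query parameters from the report, plus two assumed extras:
# "sishowalldb" (list every database server) and "format=json".
API_PARAMS = {
    "action": "query",
    "meta": "siteinfo",
    "siprop": "dbrepllag",
    "sishowalldb": "1",
    "format": "json",
}


def build_replag_url(api_base):
    """Return the full API URL for a replag query against api_base."""
    return api_base + "?" + urlencode(API_PARAMS)


def parse_replag(body):
    """Turn a JSON response body into a {host: lag_in_seconds} dict.

    Assumes the shape {"query": {"dbrepllag": [{"host": ..., "lag": ...}]}}.
    """
    data = json.loads(body)
    return {entry["host"]: entry["lag"] for entry in data["query"]["dbrepllag"]}


if __name__ == "__main__":
    # Hypothetical API endpoint and a hand-written sample response.
    print(build_replag_url("https://example.org/w/api.php"))
    sample = '{"query": {"dbrepllag": [{"host": "db26", "lag": 6}]}}'
    print(parse_replag(sample))
```

Keeping URL construction and parsing apart from the HTTP call itself means the parsing can be checked without touching the live servers.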

Commands somewhat like:

[#wikimedia-tech] <Krinkle>: @replag
[#wikimedia-tech] <wmfreplag>: [s1] db26: 6; [s5] db14: 1, db35: 1

[#wikimedia-tech] <Krinkle>: @replag all
[#wikimedia-tech] <wmfreplag>: [s1] db36: 0, db32: 0, db12: 0, db26: 0, db38: 0; [s2] db13: 0, db30: 0, db24: 0; [s4] db31: 0, db22: 0, db33: 0;
[#wikimedia-tech] <wmfreplag>: [s5] db23: 0, db14: 0, db35: 0; [s6] db29: 0, db21: 0, db7: 0; [s7] db37: 0, db18: 0, db16: 0;

[#wikimedia-dev] <Krinkle>: @replag s4
[#wikimedia-dev] <wmfreplag>: [s4] db31: 0, db22: 0, db33: 0

[#wikimedia-dev] <Krinkle>: @replag db36
[#wikimedia-dev] <wmfreplag>: db36: 0 (s1)

[#wikimedia-dev] <Krinkle>: @replag commonswiki
[#wikimedia-dev] <wmfreplag>: [commonswiki: s4] db31: 0, db22: 0, db33: 0
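
The reply format in the examples above could be produced along these lines. This is only an illustrative sketch; the function name `format_replag` and the nested-dict input are hypothetical, not part of the proposal.

```python
def format_replag(clusters):
    """Format {cluster: {host: lag}} as '[s1] db26: 6; [s5] db14: 1, db35: 1'.

    Clusters and hosts are emitted in the dict's insertion order.
    """
    parts = []
    for cluster, hosts in clusters.items():
        host_part = ", ".join(f"{host}: {lag}" for host, lag in hosts.items())
        parts.append(f"[{cluster}] {host_part}")
    return "; ".join(parts)


if __name__ == "__main__":
    # Lag values copied from the first example exchange above.
    print(format_replag({"s1": {"db26": 6}, "s5": {"db14": 1, "db35": 1}}))
```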


Info such as dbserver numbers, server cluster numbers and wikidb names will be fetched periodically from Wikimedia's conf/db.php [1].

This is basically a reminder for myself right now. I haven't started on this yet, so if anyone feels like it, go ahead and assign it to yourself :-)


--
Krinkle



[1]
 http://noc.wikimedia.org/conf/highlight.php?file=db.php
 http://noc.wikimedia.org/conf/db.php.txt
Comment 1 Sam Reed (reedy) 2011-04-11 17:23:51 UTC
Can we do this in a saner way for say all, rather than just hitting an API page on each cluster...?
Comment 2 Krinkle 2011-04-11 17:32:06 UTC
(In reply to comment #1)
> Can we do this in a saner way for say all, rather than just hitting an API page
> on each cluster...?

Based on the info from db.php it would only have to make 1, 2 or 7 HTTP requests depending on the IRC command. Note that I do not intend to create a bot that warns when replag is too high (in other words, it would not make any requests while idling), since that is probably something that should be caught server-side and would indicate a larger issue.

Although it could of course check 'all' silently once every 15 minutes and report anything out of the ordinary, not that big a deal.
Comment 3 Bawolff (Brian Wolff) 2011-04-11 22:04:17 UTC
Stupid question (I'm just curious) - if it's not going to check repetitively in case things go wrong, what's the use case for knowing the replag? If it's big enough to make a difference, I'd imagine that'd fall in the category of something gone wrong.
Comment 4 Krinkle 2011-04-11 22:07:26 UTC
(In reply to Bawolff comment #3)
> Stupid question (I'm just curious) - if it's not going to check repetitively in
> case things go wrong, what's the use case for knowing the replag?

(In reply to Krinkle comment #2)
> Although it could of course check 'all' silently once every 15 minutes and
> report anything out of the ordinary, not that big a deal.

Okay, it *will* check periodically!
Comment 5 p858snake 2011-04-11 23:11:30 UTC
(In reply to comment #4)
> (In reply to Bawolff comment #3)
> > Stupid question (I'm just curious) - if it's not going to check repetitively in
> > case things go wrong, what's the use case for knowing the replag?
> 
> (In reply to Krinkle comment #2)
> > Although it could of course check 'all' silently once every 15 minutes and
> > report anything out of the ordinary, not that big a deal.
> 
> Okay, it *will* check periodically!

Um, doesn't the nagios bot already report this in channel if it goes too high?
Comment 6 Krinkle 2011-04-18 00:28:10 UTC
A basic start has been made.

Booted it for a test run in #wikimedia-dev, #wikimedia-tech, #wmfDbBot.

Account: wmfDbBot

Right now it doesn't do the periodic checks and nagging yet. Just on-demand to see if it is wanted or not.

Current supported commands:

@info <id>
@replag <id>

id:
- cluster (s1-s7; @info also supports 'DEFAULT')
- dbhost (e.g. db18)
- dbname (e.g. enwiki, dewiktionary; @info also supports 'centralauth')
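
To illustrate how the three id types might be told apart, here is a minimal sketch. The lookup tables are a hypothetical hand-copied snapshot (hosts and the commonswiki mapping taken from the example output in the description); the real bot reads this information from db.php, and `resolve_id` is a made-up name.

```python
# Hypothetical snapshot of the topology; the real data would come from db.php.
CLUSTER_HOSTS = {
    "s1": ["db36", "db32", "db12", "db26", "db38"],
    "s4": ["db31", "db22", "db33"],
}
CLUSTER_BY_DBNAME = {"commonswiki": "s4"}


def resolve_id(ident):
    """Classify ident and return (kind, cluster, hosts), or None if unknown.

    kind is "cluster", "dbhost" or "dbname"; hosts are the dbhosts to query.
    """
    if ident in CLUSTER_HOSTS:
        return ("cluster", ident, list(CLUSTER_HOSTS[ident]))
    for cluster, hosts in CLUSTER_HOSTS.items():
        if ident in hosts:
            return ("dbhost", cluster, [ident])
    cluster = CLUSTER_BY_DBNAME.get(ident)
    if cluster is not None:
        return ("dbname", cluster, list(CLUSTER_HOSTS[cluster]))
    return None


if __name__ == "__main__":
    for ident in ("s4", "db36", "commonswiki", "nosuchid"):
        print(ident, "->", resolve_id(ident))
```

Checking cluster names first, then hosts, then database names keeps the lookup unambiguous as long as the three namespaces do not collide.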


"@replag" without arguments will check all hosts and only return those that have a replag higher than 1 second (or alternatively, "No replag").

"@replag all" will check all clusters and return all their dbhosts+lag counts.
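
The difference between the two forms comes down to a filter on the lag values. A minimal sketch, assuming lag data is held as a nested dict (the function name `select_lagged` and the data layout are hypothetical):

```python
def select_lagged(clusters, threshold=1):
    """Keep only hosts whose lag is higher than threshold seconds.

    clusters maps cluster name -> {host: lag}. Clusters left with no
    hosts are dropped, so an empty result corresponds to "No replag".
    """
    selected = {}
    for cluster, hosts in clusters.items():
        kept = {host: lag for host, lag in hosts.items() if lag > threshold}
        if kept:
            selected[cluster] = kept
    return selected


if __name__ == "__main__":
    # Made-up lag values for illustration only.
    data = {"s1": {"db36": 0, "db26": 6}, "s4": {"db31": 0, "db22": 1}}
    lagged = select_lagged(data)
    print(lagged if lagged else "No replag")
```

"@replag all" would skip this filter and report every host unchanged.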
Comment 7 Krinkle 2011-04-23 23:49:34 UTC
(In reply to comment #5)
> (In reply to Bawolff comment #3)
> > Stupid question (I'm just curious) - if it's not going to check repetitively in
> > case things go wrong, what's the use case for knowing the replag?
> 
> Um, doesn't the nagios bot already report this in channel if it goes too high?

I have never seen it do that. Can someone verify this?
Comment 8 Sam Reed (reedy) 2011-04-24 09:43:30 UTC
AFAIK I'm sure it doesn't...
Comment 9 Krinkle 2011-08-08 09:32:48 UTC
Marking as fixed.

It's been running for a while and works nicely.

Source code for bot: https://svn.toolserver.org/svnroot/krinkle/trunk/Kribo/
wmf-replag backend + bridge to Kribo-bot: https://svn.toolserver.org/svnroot/krinkle/trunk/Kribo%20(plugins)/wmfDbBot_KriboBridge/
