Last modified: 2011-11-25 07:19:36 UTC
Sometimes a slave server stops replicating, for instance due to a transient error:

  Slave_IO_Running: Yes
  Slave_SQL_Running: No
  Replicate_do_db:
  Replicate_ignore_db:
  Last_errno: 1205
  Last_error: Error 'Lock wait timeout exceeded; Try restarting transaction' on query. Default database: 'enwiki'. Query: 'UPDATE /* HTMLCacheUpdate::invalidateIDs This flag once ... */ `page` SET page_touched = '20090127180707' WHERE (page_id IN ('14890591'))'

In this case there is no end-user-visible report of lag, but odd things happen, such as Special:Contributions failing to show updated information.

After restarting the slave thread, we get a nice big warning like this:

  Due to high database server lag, changes newer than 2146 seconds might not be shown in this list.

which is neat. It would be nice to have a similar warning when we're pulling from a server that is not replicating at all. It may be difficult to tell how far behind it is in that case, but even a "we're broken" warning would be nice.

Note that the lag report in the API shows "" instead of, say, "0" for this case:

  http://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=dbrepllag&sishowalldb

whereas the 'lagtop' script reports a 0. Lagtop should perhaps be updated to show a visible warning as well, if this is detectable.
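
To illustrate the requested behaviour, here is a minimal sketch in Python of how a stopped slave could be told apart from a slave with zero lag. It is not MediaWiki or lagtop code. The function name describe_replication, the max_lag threshold and the wording of the "broken" message are all hypothetical. The input is assumed to be one row of SHOW SLAVE STATUS output, already read into a dict of strings with NULL mapped to None.

```python
from typing import Mapping, Optional


def describe_replication(status: Mapping[str, Optional[str]],
                         max_lag: int = 30) -> str:
    """Turn one SHOW SLAVE STATUS row into a user-facing message.

    Returns an empty string when no warning is needed.
    max_lag is a hypothetical threshold in seconds.
    """
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"

    if not (io_ok and sql_ok):
        # A stopped thread means the lag cannot be measured, so report
        # "broken" instead of silently treating the lag as 0.
        stopped = [name for name, ok in (("IO", io_ok), ("SQL", sql_ok))
                   if not ok]
        reason = status.get("Last_error") or "no error recorded"
        return ("Replication is broken (%s thread stopped): %s"
                % ("/".join(stopped), reason))

    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        # MySQL reports NULL here when it cannot compute the lag.
        return "Replication lag is unknown"

    seconds = int(lag)
    if seconds > max_lag:
        return ("Due to high database server lag, changes newer than "
                "%d seconds might not be shown in this list." % seconds)
    return ""


if __name__ == "__main__":
    # The situation from this report: IO thread running, SQL thread stopped.
    broken = {
        "Slave_IO_Running": "Yes",
        "Slave_SQL_Running": "No",
        "Last_errno": "1205",
        "Last_error": "Error 'Lock wait timeout exceeded; "
                      "Try restarting transaction' on query.",
        "Seconds_Behind_Master": None,
    }
    print(describe_replication(broken))
```

The point of the sketch is the order of the checks: the thread state is examined before the lag value, so a server that is not replicating never falls through to the "0 seconds behind" case.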