Last modified: 2013-06-18 15:22:31 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are now handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T29851, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 27851 - Wikimedia wikis' job queues need better monitoring
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2011-03-04 20:08 UTC by MZMcBride
Modified: 2013-06-18 15:22 UTC (History)

Description MZMcBride 2011-03-04 20:08:33 UTC
Currently the job queues for Wikimedia wikis can become heavily backlogged without anyone noticing. This is bad. Sometimes it's because not enough job runners are assigned; other times it's due to software problems, etc. The job queue is quite important to MediaWiki, so it needs to keep running, and someone needs to be notified when it has become too backlogged or is broken.

A better monitoring and notification system (using mailing lists, IRC, Nagios, whatever) needs to be implemented for the job queue. This may relate to bug 27724, though adding a timestamp column is only one way you might implement better monitoring.
Comment 1 Antoine "hashar" Musso (WMF) 2011-03-12 13:22:59 UTC
Raising this bug's priority. This is a real issue.
Comment 2 Roan Kattouw 2012-01-03 17:03:33 UTC
This is fixed now. There is a Nagios check which checks job queue length on all wikis (and starting today, this check actually works), see http://nagios.wikimedia.org/nagios/cgi-bin/extinfo.cgi?type=2&host=spence&service=check_job_queue . Ganglia also measures the enwiki job queue length: http://ganglia.wikimedia.org/?m=cpu_report&r=hour&s=descending&c=Miscellaneous+pmtpa&h=spence.wikimedia.org&sh=1
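
For illustration only, a queue-length check of the kind described above could look roughly like the sketch below. It is not the actual check_job_queue script: the function name, the thresholds and the status text are assumptions made for this example. Only the exit codes (0 OK, 1 WARNING, 2 CRITICAL) follow the standard Nagios plugin convention. The sketch takes already-collected queue lengths as input, since how they are fetched is not described in this report.

```python
# Exit codes from the standard Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2


def check_job_queues(queue_lengths, warn=10000, crit=100000):
    """Map per-wiki job queue lengths to a Nagios-style (exit_code, status_line).

    queue_lengths: dict of wiki name -> number of queued jobs.
    warn, crit: hypothetical thresholds, not the values used in production.
    """
    # Wikis at or above the critical threshold.
    over_crit = sorted(w for w, n in queue_lengths.items() if n >= crit)
    # Wikis at or above the warning threshold but below critical.
    over_warn = sorted(w for w, n in queue_lengths.items() if warn <= n < crit)

    if over_crit:
        return CRITICAL, "JOBQUEUE CRITICAL - backlogged: " + ", ".join(over_crit)
    if over_warn:
        return WARNING, "JOBQUEUE WARNING - backlogged: " + ", ".join(over_warn)
    return OK, "JOBQUEUE OK - all job queues below %d" % warn
```

A wrapper script would print the status line and exit with the returned code, which is how Nagios decides whether to raise a notification.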
