Last modified: 2014-11-07 19:23:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T45936, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 43936 - High priority jobs like enotifs are executed very slowly
Status: UNCONFIRMED
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: wmf-deployment
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: code-update-regression, performance, testme
Depends on:
Blocks: 1932
Reported: 2013-01-13 20:15 UTC by Nemo
Modified: 2014-11-07 19:23 UTC (History)
CC List: 10 users

Description Nemo 2013-01-13 20:15:26 UTC
Email notifications used to be instant; now they take about 20 minutes on en.wiki (with only 60 thousand jobs in the queue).
Not only does this point to a problem with the job queue system and a significant regression, it is also very confusing, because I receive notifications when they are already obsolete (for instance because I already replied).

----

Dear Nemo bis,

The Wikipedia page User talk:Nemo bis has been changed on 13 January 2013
by anonymous user 76.126.142.118, see
http://en.wikipedia.org/wiki/User_talk:Nemo_bis for the current
revision. 

See
http://en.wikipedia.org/w/index.php?title=User_talk:Nemo_bis&diff=next&oldid=532890436
to view this change.

----

Received: from imp-3.mail.tiscali.it (10.39.115.235) by mx-3-it.mail.tiscali.it (8.5.148)
        id 50BF36D0094B0EEF for <redacted>@tiscali.it; Sun, 13 Jan 2013 21:01:44 +0100
Received: from wiki-mail.wikimedia.org ([208.80.152.133])
	by imp-3.mail.tiscali.it with 
	id nY1j1k02z2swdko01Y1kqf; Sun, 13 Jan 2013 21:01:44 +0100
x-cnfs-analysis: v=2.0 cv=RYES+iRv c=1 sm=2 a=P51sRyCuLXUxWMHwWK9oAA==:17
 a=eIhxMilvRf8A:10 a=z82XInz0jxkA:10 a=RyZ8rIAjjLkA:10 a=eztASiHJGFwA:10
 a=IkcTkHD0fZMA:10 a=3GbmggnxAAAA:8 a=8pif782wAAAA:8 a=d2uY_mg3cpUA:10
 a=nk0ike9KCJb9eP9e8BIA:9 a=QEXdDO2ut3YA:10 a=c7XZu54lUV4A:10
 a=9vCFg7g2Nj6V2bzh:21 a=HUl_rzNbRn9v3Gf1:21 a=P51sRyCuLXUxWMHwWK9oAA==:117
Received: from mw8.pmtpa.wmnet ([10.0.11.8]:57845)
	by mchenry.wikimedia.org with esmtp (Exim 4.69)
	(envelope-from <wiki@wikimedia.org>)
	id 1TuTkG-0003E4-Fs
	for <redacted>@tiscali.it; Sun, 13 Jan 2013 20:01:28 +0000
Received: from apache by mw8.pmtpa.wmnet with local (Exim 4.76)
	id 1TuTkG-0008Ux-Bg
	for <redacted>@tiscali.it; Sun, 13 Jan 2013 20:01:28 +0000
To: Nemo bis
Subject: Wikipedia page User talk:Nemo bis has been changed by anonymous user 76.126.142.118
From: MediaWiki Mail <wiki@wikimedia.org>
Reply-To: reply@not.possible
Date: Sun, 13 Jan 2013 20:01:28 +0000
MIME-Version: 1.0
Content-type: text/plain; charset=UTF-8
Content-transfer-encoding: 8bit
Message-ID: <enwiki.50f3129856d1c5.83285442@en.wikipedia.org>
X-Mailer: MediaWiki mailer
Comment 1 Andre Klapper 2013-01-21 14:24:57 UTC
One day, when https://ganglia.wikimedia.org is accessible again, I could even look at the JobQueue graph...

Nemo, is the lag of ~20min still a problem?

/me looking at https://gerrit.wikimedia.org/r/#/q/project:mediawiki/core+-owner:L10n-bot+message:jobqueue,n,z
Comment 2 Nemo 2013-01-21 18:09:04 UTC
Job queue is now under 2000 or so on en.wiki, so this looks like the wrong time to try to reproduce this bug. https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo&siprop=statistics
Anyway next time you can ask on my user talk and I'll compare timestamps of edit and enotif. :-)
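
For anyone repeating this check, here is a minimal sketch of reading the job queue length from the siteinfo statistics query linked above. It assumes the JSON response exposes the count as query.statistics.jobs; the function names and the sample response are made up for illustration, and the value 2000 only mirrors the rough figure mentioned in this comment.

```python
from urllib.parse import urlencode


def statistics_url(api_base):
    """Build the siteinfo statistics query URL, asking for JSON output."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "statistics",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def job_queue_length(response):
    """Extract the job count from an already-decoded API response.

    Assumes the count is reported under query.statistics.jobs.
    """
    return int(response["query"]["statistics"]["jobs"])


# Hypothetical response, trimmed to the one field used here.
sample_response = {"query": {"statistics": {"jobs": 2000}}}

print(statistics_url("https://en.wikipedia.org/w/api.php"))
print(job_queue_length(sample_response))
```

The fetch itself is left out on purpose; the URL can be requested with any HTTP client and the decoded JSON passed to job_queue_length.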
Comment 3 Nemo 2013-03-13 17:35:48 UTC
Should probably raise severity because it now takes hours to receive an enotif from mediawiki.org (job queue 0 now, ~20 at 14 CET): 15:22–17:05 in the example.

Received: from wiki-mail.wikimedia.org ([208.80.152.133])
	by imp-2.mail.tiscali.it with 
	id B55A1l00w2swdko0155BbL; Wed, 13 Mar 2013 18:05:11 +0100
x-cnfs-analysis: v=2.0 cv=KYdQQHkD c=1 sm=2 a=P51sRyCuLXUxWMHwWK9oAA==:17
 a=gbdniXhMvlMA:10 a=RyZ8rIAjjLkA:10 a=cNjpVsleRgUA:10 a=eztASiHJGFwA:10
 a=IkcTkHD0fZMA:10 a=3GbmggnxAAAA:8 a=4P5xif6CAAAA:8 a=KcaC6ams3nQA:10
 a=mdTHgZqYbhYL0A32_hcA:9 a=QEXdDO2ut3YA:10 a=4wRdB16iIHwA:10
 a=P51sRyCuLXUxWMHwWK9oAA==:117
Received: from mw1003.eqiad.wmnet ([10.64.0.33]:38380)
	by mchenry.wikimedia.org with esmtp (Exim 4.69)
	(envelope-from <wiki@wikimedia.org>)
	id 1UFnVc-00068m-87
	for <redacted>; Wed, 13 Mar 2013 15:22:28 +0000
Received: from apache by mw1003.eqiad.wmnet with local (Exim 4.76)
	id 1UFnVc-00075V-19
	for <redacted>; Wed, 13 Mar 2013 15:22:28 +0000
To: Nemo bis <redacted>
Subject: MediaWiki page Help:Extension:Translate/Configuration has been changed by Nikerabbit
From: MediaWiki Mail <wiki@wikimedia.org>
Reply-To: reply@not.possible
Date: Wed, 13 Mar 2013 15:22:28 +0000
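
The per-hop delay can be read off the Received headers mechanically. Below is a rough sketch using only the Python standard library; it assumes each Received header has been unfolded onto a single line and that its timestamp follows the last semicolon. The function names are illustrative, and the sample data is the three headers above with their continuation lines joined.

```python
from email.utils import parsedate_to_datetime


def received_time(header):
    """Parse the timestamp that follows the last ';' of a Received header."""
    return parsedate_to_datetime(header.rsplit(";", 1)[1].strip())


def hop_delays(headers_newest_first):
    """Return the seconds spent between consecutive hops, oldest hop first.

    Mail headers list the newest hop at the top, so the list is reversed
    before neighbouring timestamps are compared.
    """
    times = [received_time(h) for h in reversed(headers_newest_first)]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(times, times[1:])]


# The Received headers quoted above, unfolded, newest first.
headers = [
    "from wiki-mail.wikimedia.org ([208.80.152.133]) by imp-2.mail.tiscali.it "
    "with id B55A1l00w2swdko0155BbL; Wed, 13 Mar 2013 18:05:11 +0100",
    "from mw1003.eqiad.wmnet ([10.64.0.33]:38380) by mchenry.wikimedia.org "
    "with esmtp (Exim 4.69) (envelope-from <wiki@wikimedia.org>) "
    "id 1UFnVc-00068m-87 for <redacted>; Wed, 13 Mar 2013 15:22:28 +0000",
    "from apache by mw1003.eqiad.wmnet with local (Exim 4.76) "
    "id 1UFnVc-00075V-19 for <redacted>; Wed, 13 Mar 2013 15:22:28 +0000",
]

print(hop_delays(headers))  # [0.0, 6163.0]
```

With these headers the first hop (apache to mchenry.wikimedia.org) takes no measurable time, while 6163 seconds (about 1 h 43 min) pass before the message is received from wiki-mail.wikimedia.org, which matches the 15:22–17:05 window given above.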
Comment 4 Nemo 2013-03-27 19:40:34 UTC
If bug 46603 is right, Site requests is the correct component.
If it's just a jobqueue problem and mail relay doesn't factor into it, perhaps we just have too much stuff in "high priority"?
Comment 5 Nemo 2013-05-02 18:00:13 UTC
Currently it's basically instant, no time (1 s? unless Date is wrong) spent on apaches and about 20 s between mchenry.wikimedia.org and wiki-mail.wikimedia.org.
Global jobqueue very low around 100k, will check again when it gets higher.
https://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Miscellaneous%20pmtpa&h=hume.wikimedia.org&v=823574&m=Global_JobQueue_length&r=hour&z=default&jr=&js=&st=1365625056&z=large
Comment 6 Aaron Schulz 2013-06-25 21:25:31 UTC
Closing
Comment 7 Nemo 2013-09-12 17:24:56 UTC
Reopening: we have reports that password reminders on en.wiki take 60 minutes to arrive.
I can't think of any reason other than this bug; global job queue is reportedly around 2 million. <https://ganglia.wikimedia.org/latest/graph_all_periods.php?c=Miscellaneous%20pmtpa&h=hume.wikimedia.org&v=823574&m=Global_JobQueue_length&r=month&z=default&jr=&js=&st=1365625056&z=large>
Comment 8 Aaron Schulz 2013-09-17 17:42:19 UTC
From graphite, none of the job queue push/pop graphs look remarkable over the last 2 months. There are lots of Parsoid jobs though (about 2 million on enwiki).
Comment 9 MZMcBride 2013-09-20 22:54:59 UTC
(In reply to comment #7)
> Reopening: we have reports that password reminders on en.wiki take 60 minutes
> to arrive.

Link(s)?

> I can't think of any reason other than this bug; global job queue is
> reportedly around 2 million.

There are apparently different queues.
Comment 10 Nemo 2013-09-20 23:00:55 UTC
(In reply to comment #9)
> (In reply to comment #7)
> > Reopening: we have reports that password reminders on en.wiki take 60 minutes
> > to arrive.
> 
> Link(s)?

Nope. Reported on #wikimedia-tech, relayed from #wikipedia-en-help I think.

> 
> > I can't think of any reason other than this bug; global job queue is
> > reportedly around 2 million.
> 
> There are apparently different queues.

Yes (and it would be good to raise the concurrency for high priority jobs, they're still at 6 and used to be 8 till April IIRC) but this doesn't mean they don't affect each other; it happened in the past e.g. with bug 42614.
Comment 11 Andre Klapper 2014-03-15 23:24:53 UTC
Nemo / MZ: Are you aware of any recent issues (as I'm not)? 
This might end up as WORKSFORME now...
Comment 12 Andre Klapper 2014-10-16 12:49:19 UTC
Is anybody aware of any recent issues (as I'm not) or is this WORKSFORME now?
Comment 13 Andre Klapper 2014-11-07 14:00:41 UTC
Last call: Is anybody aware of any recent issues (as I'm not) or is this WORKSFORME now?
Comment 14 Nemo 2014-11-07 19:23:52 UTC
This bug can only be tested when the job queue is very high.
