Last modified: 2014-02-24 11:28:30 UTC
When sending out a translation notification from Meta, I received a "Wikimedia Foundation error" with the following error message:
  Request: POST http://meta.wikimedia.org/wiki/Special:NotifyTranslators, from 208.80.154.77 via cp1012.eqiad.wmnet (squid/2.7.STABLE9) to 10.64.0.142 (10.64.0.142)
  Error: ERR_READ_TIMEOUT, errno [No Error] at Sun, 06 Oct 2013 09:51:22 GMT
The notification itself seems to have made it through alright - I have seen talk page and email messages that went out, and the log entry reads as follows:
  09:51, 6 October 2013 Tbayer (WMF) (talk | contribs | block) sent a notification about translating page Wikimedia Highlights, August 2013; languages: all languages; deadline: none; priority: low; sent to 1639 recipients, failed for 0 recipients, skipped for 0 recipients
This is a followup to https://bugzilla.wikimedia.org/show_bug.cgi?id=41131 ("Timeout when sending translation notification"), which was closed in February 2013. Filing it as a new bug per Nikerabbit's advice at https://bugzilla.wikimedia.org/show_bug.cgi?id=53769#c12 .
This just happened again in the same way (error 503, but log entry, talk page and email messages look OK):
  Request: POST http://meta.wikimedia.org/wiki/Special:NotifyTranslators, from 208.80.154.134 via cp1052 frontend ([10.2.2.25]:80), Varnish XID 850123795
  Forwarded for: 192.195.83.38, 208.80.154.134
  Error: 503, Service Unavailable at Fri, 15 Nov 2013 06:09:31 GMT
  06:09, 15 November 2013 Tbayer (WMF) (talk | contribs | block) sent a notification about translating page Wikimedia Highlights, October 2013; languages: all languages; deadline: none; priority: medium; sent to 1632 recipients, failed for 0 recipients, skipped for 24 recipients
Translation Notifications currently submits a job for each recipient during the web request itself. 1632 is a lot of jobs, especially if they're going to multiple wikis. I think if it wrapped them in a submit job like MassMessage does (see https://github.com/wikimedia/mediawiki-extensions-MassMessage/blob/master/MassMessageSubmitJob.php), it should be faster and hopefully get rid of the timeouts.
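To make the suggestion above concrete, here is a minimal sketch of the difference between pushing one job per recipient during the web request and pushing a single wrapper job that fans out later. It is a toy in-memory model in Python, not MediaWiki or MassMessage code: ToyJobQueue, notify_per_recipient, notify_via_submit_job and run_worker are all hypothetical names, and the only thing it measures is how many queue pushes the request itself has to wait for.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ToyJobQueue:
    """Hypothetical in-memory stand-in for a job queue.

    It only counts how many pushes happen while the web request is running.
    """
    jobs: deque = field(default_factory=deque)
    pushes_in_request: int = 0

    def push(self, job, in_request=True):
        self.jobs.append(job)
        if in_request:
            self.pushes_in_request += 1


def notify_per_recipient(queue, recipients):
    """Behaviour described in the comment: one push per recipient, in the request."""
    for user in recipients:
        queue.push(("notify", user))


def notify_via_submit_job(queue, recipients):
    """Suggested behaviour: the request pushes one wrapper job and returns."""
    queue.push(("submit", tuple(recipients)))


def run_worker(queue):
    """Background worker: expands wrapper jobs, then 'delivers' each notification."""
    delivered = []
    while queue.jobs:
        kind, payload = queue.jobs.popleft()
        if kind == "submit":
            # Fan-out happens here, outside the web request.
            for user in payload:
                queue.push(("notify", user), in_request=False)
        else:
            delivered.append(payload)
    return delivered


if __name__ == "__main__":
    recipients = [f"user{i}" for i in range(1632)]

    direct = ToyJobQueue()
    notify_per_recipient(direct, recipients)

    wrapped = ToyJobQueue()
    notify_via_submit_job(wrapped, recipients)

    print("pushes during request (per recipient):", direct.pushes_in_request)
    print("pushes during request (submit job):", wrapped.pushes_in_request)
    print("same recipients delivered:", run_worker(direct) == run_worker(wrapped))
```

In this model both variants deliver to the same recipients; only the amount of work done before the request can return changes. Whether that is the actual cause of the timeouts here is still an assumption.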
This (or something very similar with the same cause) just happened to me too, btw. I only submitted once and I didn't even get a Wikimedia error, just a 504 Gateway Time-out page, but in my case the notification was sent three times... https://meta.wikimedia.org/w/index.php?title=Special:Log&dir=prev&offset=20131115060948&limit=3&type=notifytranslators&user=
  10:06, 24 November 2013 Nemo bis (talk | contribs) sent a notification about translating page User:MediaWiki message delivery; languages: all languages; deadline: 2013-12-31; priority: medium; sent to 1099 recipients, failed for 0 recipients, skipped for 563 recipients
  10:05, 24 November 2013 Nemo bis (talk | contribs) sent a notification about translating page User:MediaWiki message delivery; languages: all languages; deadline: 2013-12-31; priority: medium; sent to 1442 recipients, failed for 0 recipients, skipped for 220 recipients
  10:05, 24 November 2013 Nemo bis (talk | contribs) sent a notification about translating page User:MediaWiki message delivery; languages: all languages; deadline: 2013-12-31; priority: medium; sent to 1446 recipients, failed for 0 recipients, skipped for 216 recipients
Change 97370 had a related patch set uploaded by Legoktm: Use batch submission of jobs https://gerrit.wikimedia.org/r/97370
It would be helpful if we had some profiling data on the special page. My guess is that it's the pushing of jobs into the queue, but it could be something else. There's a db write for every user who gets sent a notification, which could also be expensive.
Change 97370 merged by jenkins-bot: Use batch submission of jobs https://gerrit.wikimedia.org/r/97370
I9f06dcef91a35dd8b7fe75271b26682d94db3d20 will also probably help.
Should be resolved now that Gerrit change #97370 has been merged.
*** Bug 57896 has been marked as a duplicate of this bug. ***