Last modified: 2014-05-07 09:30:55 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T66622, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 64622 - Error generating thumbnail: As an anti-spam measure, you are limited from performing this action too many times
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Media storage
Version: wmf-deployment
Hardware: All All
Importance: Highest major with 2 votes
Assigned To: Nobody - You can work on this!
Duplicates: 64801
Depends on:
Blocks: Wikisource 41371
Reported: 2014-04-29 19:30 UTC by Yann Forget
Modified: 2014-05-07 09:30 UTC (History)

Comment 1 billinghurst 2014-04-29 23:33:06 UTC
Comments:
* Purging files at Commons has no effect
* Clicking the "Other resolutions:" gives the error

Error generating thumbnail
Error creating thumbnail: File missing 

* Full image appears to display okay
https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Revue_des_Deux_Mondes_-_1843_-_tome_3.djvu/page970-2840px-Revue_des_Deux_Mondes_-_1843_-_tome_3.djvu.jpg
Comment 2 Yann Forget 2014-04-30 05:30:03 UTC
Again, many pages on this file: https://fr.wikisource.org/wiki/Page:Harlez_-_Avesta,_livre_sacr%C3%A9_du_Zoroastrisme.djvu/934
Comment 3 Gerrit Notification Bot 2014-04-30 06:53:34 UTC
Change 130563 had a related patch set uploaded by Aaron Schulz:
Removed "GetLocalFileCopy" pool counter entry

https://gerrit.wikimedia.org/r/130563
Comment 4 Gerrit Notification Bot 2014-04-30 06:53:54 UTC
Change 130563 merged by jenkins-bot:
Removed "GetLocalFileCopy" pool counter entry

https://gerrit.wikimedia.org/r/130563
Comment 5 Andre Klapper 2014-04-30 14:54:52 UTC
Aaron: You are fast. Thank you!
Comment 6 Yann Forget 2014-05-03 15:38:36 UTC
Similar issue again now: I get a message "Error generating thumbnail

As an anti-spam measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes."

in at least about one in every 3 pages.
Comment 8 Nemo 2014-05-03 18:13:42 UTC
*** Bug 64801 has been marked as a duplicate of this bug. ***
Comment 9 Nemo 2014-05-03 18:14:46 UTC
Changing summary; the error is widespread across all sorts of users of Commons.
Comment 10 Andre Klapper 2014-05-03 18:19:31 UTC
555: Resetting blocker and immediate; see [[mw:Bugzilla/Fields#Priority]]
Comment 11 Maik Wiege 2014-05-03 18:22:49 UTC
I am also currently getting frequent error 500s after requesting a thumbnail image:
"Error generating thumbnail - As an anti-spam measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes."
This happens quite fast (I requested perhaps around 100 thumbnails in the last few hours). But it also resolves quite fast. Retrying shortly afterwards usually results in a 200 OK.
Comment 12 Yann Forget 2014-05-04 15:54:15 UTC
It seems this is getting more and more frequent. When is a fix expected? Thanks.
Comment 13 Jesús Martínez Novo (Ciencia Al Poder) 2014-05-04 15:59:50 UTC
This seems to be hitting $wgRateLimits['renderfile']. See [1]. Those rate limits are disabled by default, so maybe WMF has set them up recently.

-----

[1] https://www.mediawiki.org/wiki/Manual:$wgRateLimits
Comment 14 Nemo 2014-05-04 17:20:46 UTC
(In reply to Jesús Martínez Novo (Ciencia Al Poder) from comment #13)
> This seems to be hitting $wgRateLimits['renderfile']. See [1]. Those rate
> limits are disabled by default, so maybe WMF has set them up recently.

$ git blame InitialiseSettings.php | grep -A 4 renderfile
c78a54c9 (Aaron Schulz             2013-10-16 16:14:35 -0700  6390)             'renderfile' => array(
02f3863a (Aaron Schulz             2014-01-21 12:40:42 -0800  6391)                     // 1400 new thumbnails per minute
02f3863a (Aaron Schulz             2014-01-21 12:40:42 -0800  6392)                     'ip'   => array( 700, 30 ),
02f3863a (Aaron Schulz             2014-01-21 12:40:42 -0800  6393)                     'user' => array( 700, 30 ),
c78a54c9 (Aaron Schulz             2013-10-16 16:14:35 -0700  6394)             ),
9643d682 (Aaron Schulz             2014-04-21 09:30:53 -0700  6395)             'renderfile-nonstandard' => array(
9643d682 (Aaron Schulz             2014-04-21 09:30:53 -0700  6396)                     // 140 new thumbnails per minute
9643d682 (Aaron Schulz             2014-04-21 09:30:53 -0700  6397)                     'ip'   => array( 70, 30 ),
9643d682 (Aaron Schulz             2014-04-21 09:30:53 -0700  6398)                     'user' => array( 70, 30 ),
9643d682 (Aaron Schulz             2014-04-21 09:30:53 -0700  6399)             ),
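
For readers checking the comments in that config against the numbers: each entry appears to be a (maximum hits, window in seconds) pair, so 700 hits per 30 seconds works out to 1400 per minute. A minimal sketch of that arithmetic follows; the `per_minute` helper is hypothetical and is not MediaWiki code.

```python
# Hypothetical helper, not part of MediaWiki: convert a
# (max_hits, window_seconds) pair into an equivalent per-minute rate.
def per_minute(max_hits: int, window_seconds: int) -> float:
    return max_hits * 60 / window_seconds


# Values copied from the InitialiseSettings.php excerpt above.
limits = {
    "renderfile": (700, 30),
    "renderfile-nonstandard": (70, 30),
}

for action, (hits, window) in limits.items():
    print(f"{action}: {per_minute(hits, window):.0f} per minute")
```

Note that these limits apply per key ('ip' or 'user'), so the effective limit depends on which address the limiter sees for a request.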
Comment 15 555 2014-05-05 11:47:41 UTC
Why would an experienced developer set such a low limit in an environment the size of Wikimedia, where each category view lists tons of media files, and at the exact time that an international upload contest (Wiki Loves Earth) is running??
Comment 16 wieralee 2014-05-05 18:46:21 UTC
It makes our work on wikisource.pl twice as slow. It crashes our work :-(
% of proofread pages load without the scans. Very, very tiring.
Comment 17 wieralee 2014-05-05 18:47:41 UTC
(In reply to wieralee from comment #16)
40 %
Comment 18 555 2014-05-05 21:49:40 UTC
50 hours since the initial report and not a single action directly related to fixing it.

Why isn't the change that is *breaking all Wikisource wikis* (we *really* rely on ProofreadPage, and ProofreadPage relies on image resizing!) simply reverted until a sysadmin finds the desired setup? Is a config intended only to optimize server usage (I'm unable to find any report mentioning that this change is really needed at this moment) really necessary if it breaks features that have been working for years?
Comment 19 Antoine "hashar" Musso (WMF) 2014-05-06 08:46:26 UTC
From a mail sent to MediaWiki core list:

By looking at the udp2log limiter.log file, the renderfile-nonstandard
limit is reached by:


$ fgrep renderfile limiter.log |cut -d\: -f4|sort|uniq -c|sort -n
    378  10.64.0.168 tripped! mediawiki
    405  10.64.0.167 tripped! mediawiki
    476  10.64.32.92 tripped! mediawiki
    498  10.64.16.150 tripped! mediawiki
$

They are the media server frontends ms-fe1001 to ms-fe1004. We probably
want to restrict the end user IP instead.

I suspect the media servers are not properly passing the X-Forwarded-For
header down to the thumbnail renderer. Seems the logic is in
operations/puppet.git file ./files/swift/SwiftMedia/wmf/rewrite.py


Would need someone with more information about Swift/Thumb handling
than me :-(

-----------

I have poked Faidon about it. The X-Forwarded-For header seems to be passed by the Swift proxies; we need their IPs to be trusted by MediaWiki.
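
To illustrate why trust matters here, below is a minimal sketch of how a per-IP limiter might choose its key. It is not MediaWiki's actual implementation: the `client_ip` function and the end-user addresses are hypothetical, and only the frontend address is taken from the limiter.log excerpt above.

```python
from collections import Counter


def client_ip(remote_addr, xff_header, trusted_proxies):
    """Pick the address a per-IP rate limiter should key on (illustrative)."""
    # If the direct peer is not a trusted proxy, its X-Forwarded-For header
    # is ignored and the peer itself counts as the client.
    if remote_addr not in trusted_proxies:
        return remote_addr
    hops = [h.strip() for h in (xff_header or "").split(",") if h.strip()]
    # Walk the header right to left and stop at the first untrusted hop.
    for hop in reversed(hops):
        if hop not in trusted_proxies:
            return hop
    # Header missing, or every hop trusted: fall back to first hop or peer.
    return hops[0] if hops else remote_addr


FRONTEND = "10.64.0.168"  # one of the frontend addresses seen in limiter.log
# Hypothetical end users (documentation-range addresses) behind FRONTEND.
requests = [(FRONTEND, "203.0.113.%d" % n) for n in (1, 2, 3)]

before = Counter(client_ip(peer, xff, set()) for peer, xff in requests)
after = Counter(client_ip(peer, xff, {FRONTEND}) for peer, xff in requests)
print(before)  # every request is counted against the frontend
print(after)   # each request is counted against its own end-user address
```

With an empty trusted set, all three requests share one counter keyed on the frontend, which is consistent with the frontends tripping the limit on behalf of everyone behind them.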
Comment 20 Gerrit Notification Bot 2014-05-06 08:48:45 UTC
Change 131669 had a related patch set uploaded by Hashar:
Trust Swift proxies XFF headers

https://gerrit.wikimedia.org/r/131669
Comment 21 Gerrit Notification Bot 2014-05-06 08:49:19 UTC
Change 131670 had a related patch set uploaded by Faidon Liambotis:
Add Swift frontends to squid.php

https://gerrit.wikimedia.org/r/131670
Comment 22 Gerrit Notification Bot 2014-05-06 08:52:10 UTC
Change 131669 abandoned by Hashar:
Trust Swift proxies XFF headers

Reason:
Abandoned in favor of Faidon change https://gerrit.wikimedia.org/r/#/c/131670/

https://gerrit.wikimedia.org/r/131669
Comment 23 Gerrit Notification Bot 2014-05-06 08:54:39 UTC
Change 131671 had a related patch set uploaded by Hashar:
Mention ms-fe servers need to be XFF trusted by MW

https://gerrit.wikimedia.org/r/131671
Comment 24 Gerrit Notification Bot 2014-05-06 08:56:29 UTC
Change 131670 merged by jenkins-bot:
Add Swift frontends to squid.php

https://gerrit.wikimedia.org/r/131670
Comment 25 Gerrit Notification Bot 2014-05-06 08:56:53 UTC
Change 131671 merged by Faidon Liambotis:
Mention ms-fe servers need to be XFF trusted by MW

https://gerrit.wikimedia.org/r/131671
Comment 26 Faidon Liambotis 2014-05-06 09:07:39 UTC
Hashar was correct in identifying the root cause. This was a long-standing (~2 years) configuration error that in combination with the recent per-IP thumb limits broke generation for many users.

The above changes have been merged and deployed, so this should be working for everyone now. The logs suggest so, but let's give it some time..
Comment 27 Derk-Jan Hartman 2014-05-06 09:39:35 UTC
Can we do anything to make the cause of such incidents more easily visible/debuggable in the future?
Comment 28 Derk-Jan Hartman 2014-05-06 09:40:38 UTC
Perhaps by including the IP being limited in the error?
Comment 29 Andre Klapper 2014-05-06 11:40:54 UTC
Hashar / Faidon: Thanks for your work and investigation!
Comment 30 Yann Forget 2014-05-06 17:40:11 UTC
Works for me now.
However, as 555 said, I wish that such an issue, which breaks all Wikisource work, would be handled better in the future. Thanks for fixing this.
Comment 31 Antoine "hashar" Musso (WMF) 2014-05-06 20:55:20 UTC
Derk-Jan Hartman: to customize the error message, I guess you want to file another bug :-) It will be easier to handle.

Yann Forget: the bug did get escalated to the mw-core weekly meeting (Monday 10pm UTC). It got fixed once we managed to wake up. If the issue is critical, your best bet is to raise it on wikitech-l, which most people with cluster access read even during weekends.

If there are no more suspicious entries in limiter.log, I guess we can finally mark this bug as fixed.
Comment 32 Antoine "hashar" Musso (WMF) 2014-05-07 09:30:55 UTC
I posted a rather long postmortem describing:
- the timeline for the resolution
- the root cause analysis and how we caused the issue
- suggested improvements

http://lists.wikimedia.org/pipermail/mediawiki-core/2014-May/000068.html


The media servers are no longer rate-limited according to limiter.log. Whitelisting them as trusted XFF sources solved the issue.
