Last modified: 2014-05-07 09:30:55 UTC
Some thumbnails from DjVu files are not created. This happens on at least 2 files:
1. https://fr.wikisource.org/wiki/Page%3ARevue_des_Deux_Mondes_-_1843_-_tome_3.djvu/970 from file https://commons.wikimedia.org/wiki/File:Revue_des_Deux_Mondes_-_1843_-_tome_3.djvu
2. Several pages of https://commons.wikimedia.org/wiki/File:Kropotkine_%E2%80%94_Paroles_d%27un_R%C3%A9volt%C3%A9.djvu:
   https://fr.wikisource.org/wiki/Page:Kropotkine_%E2%80%94_Paroles_d%27un_R%C3%A9volt%C3%A9.djvu/158
   https://fr.wikisource.org/wiki/Page:Kropotkine_%E2%80%94_Paroles_d%27un_R%C3%A9volt%C3%A9.djvu/186
   https://fr.wikisource.org/wiki/Page:Kropotkine_%E2%80%94_Paroles_d%27un_R%C3%A9volt%C3%A9.djvu/205
Comments:
* Purging files at Commons has no effect
* Clicking the "Other resolutions:" links gives the error "Error generating thumbnail. Error creating thumbnail: File missing"
* Full image appears to display okay
  https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/Revue_des_Deux_Mondes_-_1843_-_tome_3.djvu/page970-2840px-Revue_des_Deux_Mondes_-_1843_-_tome_3.djvu.jpg
Again, many pages on this file: https://fr.wikisource.org/wiki/Page:Harlez_-_Avesta,_livre_sacr%C3%A9_du_Zoroastrisme.djvu/934
Change 130563 had a related patch set uploaded by Aaron Schulz: Removed "GetLocalFileCopy" pool counter entry https://gerrit.wikimedia.org/r/130563
Change 130563 merged by jenkins-bot: Removed "GetLocalFileCopy" pool counter entry https://gerrit.wikimedia.org/r/130563
Aaron: You are fast. Thank you!
Similar issue again now: on at least one in every 3 pages I get the message "Error generating thumbnail. As an anti-spam measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes."
https://upload.wikimedia.org/wikipedia/commons/thumb/c/cd/Julien_-_Les_Avad%C3%A2nas%2C_contes_et_apologues_indiens%2C_tome_2.djvu/page3-1024px-Julien_-_Les_Avad%C3%A2nas%2C_contes_et_apologues_indiens%2C_tome_2.djvu.jpg
Image loaded after 3 forced reloads.
*** Bug 64801 has been marked as a duplicate of this bug. ***
Changing summary; the error is widespread across all sorts of users of Commons.
555: Resetting blocker and immediate; see [[mw:Bugzilla/Fields#Priority]]
I am also currently getting frequent 500 errors after requesting a thumbnail image: "Error generating thumbnail - As an anti-spam measure, you are limited from performing this action too many times in a short space of time, and you have exceeded this limit. Please try again in a few minutes." This happens quite quickly (I requested perhaps around 100 thumbnails in the last few hours), but it also resolves quite quickly: retrying shortly afterwards usually results in a 200 OK.
It seems this is getting more and more frequent. When is a fix expected? Thanks.
This seems to be hitting $wgRateLimits['renderfile']. See [1]. Those rate limits are disabled by default, so maybe WMF has set them up recently. ----- [1] https://www.mediawiki.org/wiki/Manual:$wgRateLimits
(In reply to Jesús Martínez Novo (Ciencia Al Poder) from comment #13)
> This seems to be hitting $wgRateLimits['renderfile']. See [1]. Those rate
> limits are disabled by default, so maybe WMF has set them up recently.

$ git blame InitialiseSettings.php | grep -A 4 renderfile
c78a54c9 (Aaron Schulz 2013-10-16 16:14:35 -0700 6390) 'renderfile' => array(
02f3863a (Aaron Schulz 2014-01-21 12:40:42 -0800 6391) // 1400 new thumbnails per minute
02f3863a (Aaron Schulz 2014-01-21 12:40:42 -0800 6392) 'ip' => array( 700, 30 ),
02f3863a (Aaron Schulz 2014-01-21 12:40:42 -0800 6393) 'user' => array( 700, 30 ),
c78a54c9 (Aaron Schulz 2013-10-16 16:14:35 -0700 6394) ),
9643d682 (Aaron Schulz 2014-04-21 09:30:53 -0700 6395) 'renderfile-nonstandard' => array(
9643d682 (Aaron Schulz 2014-04-21 09:30:53 -0700 6396) // 140 new thumbnails per minute
9643d682 (Aaron Schulz 2014-04-21 09:30:53 -0700 6397) 'ip' => array( 70, 30 ),
9643d682 (Aaron Schulz 2014-04-21 09:30:53 -0700 6398) 'user' => array( 70, 30 ),
9643d682 (Aaron Schulz 2014-04-21 09:30:53 -0700 6399) ),
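For anyone reading the numbers above: going by the inline comments, each pair reads as (maximum actions, window in seconds), so `array( 70, 30 )` works out to 140 per minute. Below is a minimal Python sketch of a per-key fixed-window counter under that reading. The names `FixedWindowLimiter`, `allow` and `per_minute` are hypothetical, and this is only a simplified model for illustration, not MediaWiki's actual limiter code.

```python
import time


class FixedWindowLimiter:
    """Toy per-key limiter: at most `max_hits` actions per `window` seconds.

    Hypothetical illustration only; not taken from MediaWiki.
    """

    def __init__(self, max_hits, window, clock=time.monotonic):
        self.max_hits = max_hits
        self.window = window
        self.clock = clock
        self._buckets = {}  # key -> (window_start, hits)

    def allow(self, key):
        """Count one action for `key`; return False once the limit is tripped."""
        now = self.clock()
        start, hits = self._buckets.get(key, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, hits = now, 0
        hits += 1
        self._buckets[key] = (start, hits)
        return hits <= self.max_hits


def per_minute(max_hits, window):
    """Convert a (max_hits, window_seconds) pair to a per-minute rate."""
    return max_hits * 60 / window
```

The key point for this bug is that the limit applies per key: whatever address ends up being used as the 'ip' key shares one budget of 70 renders per 30 seconds.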
Why would an experienced developer set such a low limit in an environment the size of Wikimedia, where each category view lists tons of media files, and at the exact time that an international upload contest (Wiki Loves Earth) is running?
It makes our work on wikisource.pl twice as slow. It crashes our work :-( % of proofread pages load without the scans. Very, very tiring.
(In reply to wieralee from comment #16) 40 %
50 hours since the initial report and not a single action directly aimed at fixing it. Why isn't the change that is *breaking all Wikisource wikis* (we *really* rely on ProofreadPage, and ProofreadPage relies on image resizing!) simply reverted until a sysadmin finds the desired setup? Is a config intended only to optimize server usage (I'm unable to find any report mentioning that this change is really needed at this moment) really necessary if it breaks features that have been working for years?
From a mail sent to the MediaWiki core list:

By looking at the udp2log limiter.log file, the renderfile-nonstandard limit is reached by:

$ fgrep renderfile limiter.log |cut -d\: -f4|sort|uniq -c|sort -n
    378 10.64.0.168 tripped! mediawiki
    405 10.64.0.167 tripped! mediawiki
    476 10.64.32.92 tripped! mediawiki
    498 10.64.16.150 tripped! mediawiki
$

They are the media server frontends ms-fe1001 to ms-fe1004. We probably want to limit by the end user IP instead. I suspect the media servers are not properly passing the X-Forwarded-For header down to the thumbnail renderer. The logic seems to be in operations/puppet.git, file ./files/swift/SwiftMedia/wmf/rewrite.py

This would need someone with more information about Swift/thumb handling than me :-(

-----------

I have poked Faidon about it: the X-Forwarded-For header seems to be passed by the Swift proxies; we need their IPs to be trusted by MediaWiki.
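To illustrate why trusting the proxies matters for a per-IP limit, here is a rough Python sketch of the usual way a client address is resolved from X-Forwarded-For. The function name `client_ip` and its parameters are hypothetical and this is not MediaWiki's implementation; the frontend addresses are the four from the limiter.log output above, and the client addresses are documentation-range placeholders.

```python
import ipaddress

# The four media frontend addresses seen in limiter.log in this thread.
FRONTENDS = ["10.64.0.167", "10.64.0.168", "10.64.16.150", "10.64.32.92"]


def client_ip(remote_addr, xff_header, trusted_proxies):
    """Return the address a per-IP limit would be keyed on.

    Start from the directly connected peer and walk the X-Forwarded-For
    chain from right to left, stopping at the first hop that is not a
    trusted proxy. Raises ValueError if an address cannot be parsed.
    """
    trusted = {ipaddress.ip_address(p) for p in trusted_proxies}
    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    current = ipaddress.ip_address(remote_addr)
    while current in trusted and hops:
        # The peer is a trusted proxy, so believe what it says about
        # the hop before it.
        current = ipaddress.ip_address(hops.pop())
    return str(current)


# Frontends not trusted: the request is attributed to the frontend itself,
# so every end user behind it shares one rate-limit bucket.
untrusted = client_ip("10.64.0.168", "203.0.113.7", [])

# Frontends trusted: the limit is keyed on the end user's address.
trusted = client_ip("10.64.0.168", "203.0.113.7", FRONTENDS)
```

In the first case all traffic collapses onto four keys, which would match the four addresses tripping the limit in the log.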
Change 131669 had a related patch set uploaded by Hashar: Trust Swift proxies XFF headers https://gerrit.wikimedia.org/r/131669
Change 131670 had a related patch set uploaded by Faidon Liambotis: Add Swift frontends to squid.php https://gerrit.wikimedia.org/r/131670
Change 131669 abandoned by Hashar: Trust Swift proxies XFF headers Reason: Abandoned in favor of Faidon change https://gerrit.wikimedia.org/r/#/c/131670/ https://gerrit.wikimedia.org/r/131669
Change 131671 had a related patch set uploaded by Hashar: Mention ms-fe servers need to be XFF trusted by MW https://gerrit.wikimedia.org/r/131671
Change 131670 merged by jenkins-bot: Add Swift frontends to squid.php https://gerrit.wikimedia.org/r/131670
Change 131671 merged by Faidon Liambotis: Mention ms-fe servers need to be XFF trusted by MW https://gerrit.wikimedia.org/r/131671
Hashar was correct in identifying the root cause. This was a long-standing (~2 years) configuration error that in combination with the recent per-IP thumb limits broke generation for many users. The above changes have been merged and deployed, so this should be working for everyone now. The logs suggest so, but let's give it some time..
Can we do anything to make the cause of such incidents more easily visible/debuggable in the future?
Perhaps by including the IP being limited in the error?
Hashar / Faidon: Thanks for your work and investigation!
Works for me now. However, as 555 said, I wish that an issue like this, which breaks all Wikisource work, would be better handled in the future. Thanks for fixing this.
Derk-Jan Hartman: to customize the error message, I guess you want to file another bug :-) It will be easier to handle.

Yann Forget: the bug did get escalated to the mw-core weekly meeting (Monday 10pm UTC). It got fixed once we managed to wake up. If an issue is critical, your best bet is to raise it on wikitech-l, which most people with cluster access read even during weekends.

If there are no more suspicious entries in limiter.log, I guess we can finally mark this bug as fixed.
I posted a rather long postmortem describing:
- the timeline for the resolution
- the root cause analysis and how we caused the issue
- suggested improvements

http://lists.wikimedia.org/pipermail/mediawiki-core/2014-May/000068.html

The media servers are no longer limited according to limiter.log. Whitelisting them as trusted XFF proxies solved the issue.