Last modified: 2014-06-30 15:30:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T32751, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 30751 - Allowed memory size exhausted when using API
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
Component: File management (Other open bugs)
Version: unspecified
Hardware: All
OS: All
Importance: Low minor (vote)
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 30906
Blocks:
Reported: 2011-09-04 14:15 UTC by Svick
Modified: 2014-06-30 15:30 UTC
CC: 7 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Svick 2011-09-04 14:15:28 UTC
I'm trying to run the following query from my bot account:

http://en.wikipedia.org/w/api.php?format=xml&action=query&generator=categorymembers&prop=imageinfo&iiprop=&iilimit=2&gcmtitle=Category%3AAll%20non-free%20media&gcmlimit=max&maxlag=5

and I'm getting the following error:

PHP fatal error in /usr/local/apache/common-local/php-1.17/includes/Hooks.php line 47:
Allowed memory size of 125829120 bytes exhausted (tried to allocate 87 bytes) 

The same query seems to work when I set gcmlimit=1000.

I'm not sure whether something should be done about this, but if so, the fix is probably either lowering the limit for categorymembers or raising the memory limit.
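
A minimal sketch of the workaround implied above: lower gcmlimit and follow the API's continuation instead of requesting everything in one go. This is Python using the requests library and the modern JSON continuation style; the 2011-era API returned continuation data under query-continue, and the batch size of 500 and iiprop=sha1 here are arbitrary example choices.

import requests

API = "https://en.wikipedia.org/w/api.php"

def iter_category_imageinfo(category, batch=500):
    # Ask for the category members in small batches instead of gcmlimit=max,
    # so each response stays well under the server's PHP memory limit.
    params = {
        "action": "query",
        "format": "json",
        "generator": "categorymembers",
        "gcmtitle": category,
        "gcmlimit": batch,
        "prop": "imageinfo",
        "iiprop": "sha1",
        "maxlag": 5,
    }
    while True:
        data = requests.get(API, params=params).json()
        for page in data.get("query", {}).get("pages", {}).values():
            yield page
        if "continue" not in data:
            break
        # Carry gcmcontinue (and friends) into the next request.
        params.update(data["continue"])

for page in iter_category_imageinfo("Category:All non-free media"):
    print(page.get("title"))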
Comment 1 Roan Kattouw 2011-09-04 14:17:14 UTC
I think this is more likely to be a problem in imageinfo or in the file/media backend.
Comment 2 Philippe Elie 2011-09-14 22:15:28 UTC
The above URL now works, but I get the same problem with the query below; it occurs even with gcmlimit as low as 30, so lowering the limit for categorymembers seems pointless.

http://commons.wikimedia.org/w/api.php?gcmtitle=Category%3ADjVu%20files%20in%20French&generator=categorymembers&gcmlimit=30&prop=imageinfo&action=query&iiprop=sha1&gcmcontinue=file|4d494348415544202d2042494f4752415048494520554e4956455253454c4c4520414e4349454e4e45204554204d4f4445524e45202d2031383433202d20544f4d452031342e444a5655|7531758

PHP fatal error in /usr/local/apache/common-local/php-1.17/includes/objectcache/MemcachedClient.php line 979: 
Allowed memory size of 125829120 bytes exhausted (tried to allocate 4589559 bytes) 

979: $c_val = gzcompress( $val, 9 ); 

This category contains DjVu files with a text layer; at the point of failure both the file and the text layer are big, 40-50 MB per file and 4-5 MB of text layer per file. I guess comment 1 is right; quite possibly it is a problem with file metadata caching.

I can trigger the same problem with action=query&titles=<thirty-two titles>&prop=imageinfo&iilimit=1&iiprop=sha1; full URL here: http://fr.wikisource.org/w/index.php?title=Utilisateur:Phe/Test4&oldid=2779398

Besides that, I see no code path in ApiQueryImageInfo.php which asks for metadata unless it is explicitly requested through the URL.
Comment 3 Brion Vibber 2011-09-14 22:29:35 UTC
Yeah, the metadata blob goes into the cached File object, so it gets copied around and processed whether you ask for it or not.

Normally that's not a problem, as we don't store megabytes of random text in the metadata field. :) Unfortunately for DjVu images, we do... so if you load up a few of those files at once, *bam* they'll add up fast.

Fix is probably to separate out the extracted text storage to a structured data table, so it's not clogging up the tubes the other 99% of the time we don't need it.
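
For illustration, a generic sketch, not actual MediaWiki code, of the difference described above: today the whole metadata blob rides along inside the cached file record, while the suggested fix would keep only a reference and load the extracted text on demand. All class, field and store names below are made up.

import zlib

class FileRecordEager:
    # Roughly the current situation: the DjVu text layer (several MB)
    # is embedded in the record that gets cached and copied around,
    # whether or not the caller ever looks at it.
    def __init__(self, name, sha1, text_layer):
        self.name = name
        self.sha1 = sha1
        self.metadata = text_layer

class FileRecordLazy:
    # Sketch of the proposed split: the record only keeps a key into a
    # separate store (e.g. its own table), so callers that just want the
    # sha1 never touch the text layer.
    def __init__(self, name, sha1, text_store, text_key):
        self.name = name
        self.sha1 = sha1
        self._store = text_store
        self._key = text_key

    @property
    def text_layer(self):
        # Fetched and decompressed only when somebody actually asks for it.
        # (PHP's gzcompress writes zlib format, which zlib.decompress reads.)
        return zlib.decompress(self._store[self._key]).decode("utf-8")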
Comment 4 Svick 2011-09-14 22:54:11 UTC
Philippe, the URL I gave above still doesn't work for me. Maybe you were trying it from a non-bot account, which has lower limits?
Comment 5 Philippe Elie 2011-09-14 23:46:01 UTC
(In reply to comment #4)
> Philippe, the URL I gave above still doesn't work for me. Maybe you were trying
> it from a non-bot account, which has lower limits?

Right, with a non-bot account gcmlimit is capped at 500 and the URL works.
Comment 6 Philippe Elie 2011-09-15 15:23:32 UTC
(In reply to comment #3)
Can the priority of this bug be increased? imageinfo is very useful: sha1 can be used to update a local cache of File: pages with little burden on the server side, and we can get much other useful information which in theory doesn't involve "metadata stored in the file" but only "metadata stored in the image table of the database". The current behavior seems to go against the design of the other parts of api.php.
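
A minimal sketch of that use case, assuming the sha1 returned by prop=imageinfo is the hexadecimal SHA-1 of the file contents (which is how the API normally reports it): fetch the remote hash and only re-download files whose local copy differs.

import hashlib
import requests

API = "https://commons.wikimedia.org/w/api.php"

def remote_sha1(title):
    # One small imageinfo request per title; sha1 is stored in the image
    # table, so in theory this is cheap for the server (the point of this
    # bug is that it currently is not).
    data = requests.get(API, params={
        "action": "query",
        "format": "json",
        "titles": title,
        "prop": "imageinfo",
        "iiprop": "sha1",
        "iilimit": 1,
    }).json()
    page = next(iter(data["query"]["pages"].values()))
    return page["imageinfo"][0]["sha1"]

def needs_refresh(title, local_path):
    # Compare the remote hash with the local copy's SHA-1.
    with open(local_path, "rb") as f:
        local = hashlib.sha1(f.read()).hexdigest()
    return local != remote_sha1(title)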
Comment 7 Sam Reed (reedy) 2011-09-15 16:06:49 UTC
(In reply to comment #6)
> (In reply to comment #3)
> Can the priority of this bug be increased? imageinfo is very useful: sha1 can
> be used to update a local cache of File: pages with little burden on the
> server side, and we can get much other useful information which in theory
> doesn't involve "metadata stored in the file" but only "metadata stored in
> the image table of the database". The current behavior seems to go against
> the design of the other parts of api.php.

You can increase its priority, but that doesn't mean it'll get dealt with any quicker. Just FYI.

And bug 30906 needs fixing first.
Comment 8 Mark A. Hershberger 2012-05-28 17:56:45 UTC
Lowering priority on high-priority bugs that have a low severity.
Comment 9 Bawolff (Brian Wolff) 2014-05-23 00:42:30 UTC
Is this still an issue? I think we made it so DjVu metadata is only loaded when absolutely needed (or if the image table entry is in cache).
Comment 10 Svick 2014-06-30 15:30:53 UTC
I can't reproduce the original issue, so I guess I'll close this bug.


