Last modified: 2012-04-16 09:15:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T29508, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 27508 - SVGMetadataExtractor is taking too much memory/time on large svgs, rendering certain pages inaccessible
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: File management
Version: unspecified
Hardware: All All
Importance: High major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://commons.wikimedia.org/wiki/Cat...
Keywords:
Depends on:
Blocks:
Reported: 2011-02-17 20:26 UTC by Lupo
Modified: 2012-04-16 09:15 UTC (History)
CC List: 3 users



Attachments
patch to fix this (2.26 KB, patch)
2011-03-04 22:05 UTC, Derk-Jan Hartman
Stop after getting metadata (558 bytes, patch)
2011-03-04 22:57 UTC, Platonides
Patch to only look at so much of the svg file for metadata. (2.40 KB, patch)
2011-03-05 02:02 UTC, Bawolff (Brian Wolff)

Description Lupo 2011-02-17 20:26:43 UTC
See the URL given above. The error was reported on the French village pump at Commons:

http://commons.wikimedia.org/w/index.php?title=Commons:Bistro&oldid=49919318#Et_sous_Firefox_.3F

The page User:Sting just is not served. After a loooong time (about 4-5 minutes), one gets a Wikimedia error page saying

Request: GET http://commons.wikimedia.org/wiki/User:Sting, from <MY IP OMITTED> via amssq43.esams.wikimedia.org (squid/2.7.STABLE7) to 91.198.174.35 (91.198.174.35)
Error: ERR_READ_TIMEOUT, errno [No Error] at Thu, 17 Feb 2011 20:09:23 GMT 

The user's page contains quite a few images. Don't know if that might be a problem.

Behavior confirmed in Firefox 3.6.13 (Mac OS X), Safari (Mac OS X), Firefox 3.6.4 (Win XP), IE6, Opera 10.60 (Win XP); both logged in and logged out.

The page is also not served through the secure server

https://secure.wikimedia.org/wikipedia/commons/wiki/User:Sting

it returns relatively quickly a completely unstyled page saying

Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET /wikipedia/commons/wiki/User:Sting.

Reason: Error reading from remote server

Apache/2.2.8 (Ubuntu) mod_fastcgi/2.4.6 PHP/5.2.4-2ubuntu5.12wm1 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g Server at secure.wikimedia.org Port 443

Marked as "major" because although so far this concerns only one page, I think it's worth investigating before we find other pages. It's not clear to me whether this is some networking problem, or a caching (squid) problem, or a wikitext parsing problem, or some other problem in the MediaWiki code.
Comment 1 Lupo 2011-02-18 07:49:34 UTC
It appears that this is caused by the reference

[[:File:Puerto_Rico_ecosystems_map-fr.svg]]

in User:Sting

Indeed, http://commons.wikimedia.org/wiki/File:Puerto_Rico_ecosystems_map-fr.svg also does not load.


However, even eliminating this link still leaves problems. Try clicking on the thumbnail at
http://commons.wikimedia.org/wiki/User:Lupo/q

or on the two links "SVG version" or "in French - raster": none of them is served! That's because they all reference http://commons.wikimedia.org/wiki/Template:Other_versions/Puerto_Rico_ecosystems_map, which in turn references File:Puerto_Rico_ecosystems_map-fr.svg in a gallery.
Comment 2 Lupo 2011-02-18 15:21:25 UTC
Note that this also makes

http://commons.wikimedia.org/wiki/Category:Maps_of_Puerto_Rico

inaccessible.

It's also not possible to save an edit to a page if the wikitext contains an active (not commented out) wikilink to that file.
Comment 3 Lupo 2011-02-18 16:06:48 UTC
The file itself is there and loads and displays fine in Firefox 3.6.4:

http://upload.wikimedia.org/wikipedia/commons/4/43/Puerto_Rico_ecosystems_map-fr.svg
Comment 4 Lupo 2011-03-04 21:16:49 UTC
The user page where the problem was originally noticed

http://commons.wikimedia.org/wiki/User:Sting

has been edited in the meantime to circumvent the problem.

However, links to this SVG file still cause problems, such as

http://commons.wikimedia.org/wiki/Category:Maps_of_Puerto_Rico

being inaccessible.
Comment 5 Bawolff (Brian Wolff) 2011-03-04 22:01:37 UTC
When I try to upload the file (a 13 MB SVG) to my local wiki, I get an error about the SVG metadata extractor exceeding the max execution time, so I think it's an issue with the new SVG metadata extractor.

We should maybe not try to extract metadata if the file is beyond a certain size.
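
For illustration only, the size check suggested here could look roughly like the sketch below. It is written in Python rather than MediaWiki's PHP, and the function name and the 1 MB limit are hypothetical; nothing in this thread fixes a specific threshold at this point.

```python
import os

# Hypothetical limit, chosen only for this example.
MAX_METADATA_BYTES = 1024 * 1024


def should_extract_metadata(path, limit=MAX_METADATA_BYTES):
    """Return True only if the file is small enough to be worth parsing."""
    return os.path.getsize(path) <= limit
```

A caller would skip the metadata step entirely when this returns False, at the cost of getting no metadata at all for large files.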
Comment 6 Derk-Jan Hartman 2011-03-04 22:05:17 UTC
Created attachment 8241 [details]
patch to fix this
Comment 7 Platonides 2011-03-04 22:57:38 UTC
Created attachment 8242 [details]
Stop after getting metadata

(ensure you're at least at r83254)

We could also avoid this if we stopped parsing once we got the metadata tag.
There may be files with several <metadata> tags, though, for which we would only fetch the first one.
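
As a rough sketch of the "stop once we have the metadata tag" idea, here is a Python version using the standard library's streaming parser. This is not the attached patch and not MediaWiki code; the function name is made up for the example, and it assumes the file uses the standard SVG namespace. It also shows the limitation mentioned above: only the first <metadata> element is returned.

```python
import io
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"


def first_metadata_text(stream):
    """Return the text of the first <metadata> element, or None if there is none."""
    # iterparse reads the stream incrementally; returning early means the
    # rest of the document is never read or parsed.
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if elem.tag == SVG_NS + "metadata":
            return "".join(elem.itertext())
    return None


doc = (b'<svg xmlns="http://www.w3.org/2000/svg">'
       b"<metadata>first</metadata><g/><metadata>second</metadata></svg>")
print(first_metadata_text(io.BytesIO(doc)))
```

Note that a file with no <metadata> element at all would still be parsed to the end, so on its own this does not bound the work for every large file.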
Comment 8 Bawolff (Brian Wolff) 2011-03-05 02:02:14 UTC
Created attachment 8245 [details]
Patch to only look at so much of the svg file for metadata.

How about we only look in the first 256 KB for metadata information?

*Most SVG files (ignoring the crazy maps) aren't anywhere near 256 KB in size
*The SVG metadata <title> and <desc> tags are almost always at the very beginning
*256 KB (which I chose arbitrarily) of SVG can be parsed pretty much instantaneously by our SVGReader class (in my tests anyway, using eval.php)

Patch attached that does that. After applying the patch I can successfully upload Puerto_Rico_ecosystems_map-fr.svg to my wiki, where before I ran into an "execution time exceeded" error in SVGMetadataExtractor. (It still took a long time, but I think that's mostly from convert, which eventually gets killed by ulimit.sh.) Parsing that SVG using SVGReader is pretty much instantaneous when done from eval.php (as I mentioned earlier in this comment), where before it took something like 7 minutes.
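
To make the approach concrete, here is a minimal Python sketch of reading only the head of the file and pulling <title> and <desc> out of it. It is an illustration of the technique, not the committed PHP patch: the function name and constants are invented for the example, it assumes the standard SVG namespace, and it keeps the first <title> and <desc> it sees, which is only usually the document-level one.

```python
import io
import xml.etree.ElementTree as ET

CUTOFF_BYTES = 256 * 1024  # the arbitrary 256 KB figure from the comment above
SVG_NS = "{http://www.w3.org/2000/svg}"
WANTED = {SVG_NS + "title": "title", SVG_NS + "desc": "desc"}


def head_metadata(stream, cutoff=CUTOFF_BYTES):
    """Collect the first <title> and <desc> found in the first `cutoff` bytes."""
    head = stream.read(cutoff)
    parser = ET.XMLPullParser(events=("end",))
    # close() is never called, so a document cut off mid-element is treated
    # as "more data pending" rather than as a parse error.
    parser.feed(head)
    found = {}
    try:
        for _event, elem in parser.read_events():
            key = WANTED.get(elem.tag)
            if key is not None:
                found.setdefault(key, "".join(elem.itertext()))
    except ET.ParseError:
        pass  # malformed head: keep whatever was collected before the error
    return found


svg = (b'<svg xmlns="http://www.w3.org/2000/svg"><title>Map</title>'
       + b'<path d="M0 0"/>' * 100000 + b"</svg>")
print(head_metadata(io.BytesIO(svg)))
```

The trade-off is the one described in the bullets: metadata that appears after the cutoff is silently missed, in exchange for a bounded parse time regardless of file size.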
Comment 9 Bawolff (Brian Wolff) 2011-03-06 08:24:40 UTC
I committed that in r83374. Marking this fixed as that should fix the issue (at least on my test wiki it does, using [[:File:Puerto_Rico_ecosystems_map-fr.svg]])
