Last modified: 2014-10-27 00:25:37 UTC
Several images have suddenly decided to simply refuse to display, but if you download them (though, oddly, NOT if you simply click on the "Full resolution" link to view them, at least in Firefox), they work fine. Examples:

http://commons.wikimedia.org/wiki/File:Suikoden.jpg
http://commons.wikimedia.org/wiki/File:Somagahana_Fuchiemon_restored.jpg
http://commons.wikimedia.org/wiki/File:Somagahana_Fuchiemon.jpg

It's been pointed out that there are interesting error messages. http://upload.wikimedia.org/wikipedia/commons/thumb/d/d5/Suikoden.jpg/411px-Suikoden.jpg gives:

'''Error generating thumbnail'''
Error creating thumbnail: convert: Insufficient memory (case 4) `/mnt/upload5/wikipedia/commons/d/d5/Suikoden.jpg'.
convert: missing an image filename `/mnt/upload5/wikipedia/commons/thumb/d/d5/Suikoden.jpg/411px-Suikoden.jpg'.

Think you can fix it? ~~~~
This is a problem that's inhibiting access to featured content. ~~~~
Do not use interlaced (a.k.a. progressive) JPEG compression. This option greatly increases the amount of memory required for decompression, and thus reduces performance both for the server and for clients such as browsers. All three cited test cases use this compression mode.

I have uploaded one of the three files with interlacing removed: <http://commons.wikimedia.org/wiki/File:Suikoden_(no_interlace).jpg> As you can see, it works just fine. You can do this with ImageMagick using:

convert Source.jpg -interlace none Destination.jpg

Omitting the -interlace option, i.e. a null convert, also appears to work.
More examples from #wikimedia-tech: [[File:Panorama_-_Ch%C3%A2teau_des_ducs_de_Bourbon_%C3%A0_Montlu%C3%A7on_depuis_l%27esplanade.JPG]], [[File:1966_map_of_the_Appalachian_Development_Highway_System.jpg]]. Isn't there a list of interlaced images? They could be replaced with non-interlaced versions by some bot.
*** Bug 36733 has been marked as a duplicate of this bug. ***
Would it be possible to convert interlaced files automatically during upload? I have run into this problem quite a few times, since it looks like some versions of GIMP save everything in interlaced mode by default.
Can bug #24228 be closed as a duplicate of this one?
*** Bug 24228 has been marked as a duplicate of this bug. ***
*** Bug 37367 has been marked as a duplicate of this bug. ***
Tim, which of the JPEG SOF tags identify a non-interlaced image (good for us)? http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/JPEG.html#SOF

0x0 = Baseline DCT, Huffman coding
0x1 = Extended sequential DCT, Huffman coding
0x2 = Progressive DCT, Huffman coding
0x3 = Lossless, Huffman coding
0x5 = Sequential DCT, differential Huffman coding
0x6 = Progressive DCT, differential Huffman coding
0x7 = Lossless, differential Huffman coding
0x9 = Extended sequential DCT, arithmetic coding
0xa = Progressive DCT, arithmetic coding
0xb = Lossless, arithmetic coding
0xd = Sequential DCT, differential arithmetic coding
0xe = Progressive DCT, differential arithmetic coding
0xf = Lossless, differential arithmetic coding

('exiftool -fast2' is a couple orders of magnitude faster than 'identify -verbose'.)
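For reference, the SOF codes in the table above correspond to JPEG markers 0xC0–0xCF, and the progressive variants are 0xC2, 0xC6, 0xCA and 0xCE. Here is a minimal stdlib-only sketch (not any actual MediaWiki or exiftool code) that walks the marker segments of a JPEG and classifies it:

```python
# Sketch: walk JPEG segment markers and report the SOF (start-of-frame) type.
# Stdlib only; illustrative, not production code. Progressive DCT frames use
# SOF2 (0xC2), SOF6 (0xC6), SOF10 (0xCA) and SOF14 (0xCE).
import struct

PROGRESSIVE_SOF = {0xC2, 0xC6, 0xCA, 0xCE}

def jpeg_sof_marker(data: bytes):
    """Return the SOF marker byte (e.g. 0xC0, 0xC2), or None if not found."""
    if data[:2] != b"\xff\xd8":            # SOI marker must come first
        raise ValueError("not a JPEG")
    i = 2
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            raise ValueError("marker expected")
        marker = data[i + 1]
        # SOF0..SOF15, excluding DHT (0xC4), JPG (0xC8) and DAC (0xCC),
        # which share the 0xC0-0xCF range but are not frame headers.
        if 0xC0 <= marker <= 0xCF and marker not in (0xC4, 0xC8, 0xCC):
            return marker
        # Segment length includes its own two length bytes.
        seg_len = struct.unpack(">H", data[i + 2:i + 4])[0]
        i += 2 + seg_len
    return None

def is_progressive(data: bytes) -> bool:
    return jpeg_sof_marker(data) in PROGRESSIVE_SOF
```

A bot building the "list of interlaced images" asked about earlier could apply such a check to just the first few kilobytes of each file, since the SOF segment precedes the entropy-coded data.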
@Nemo_bis: Quoting Tim : Do not use interlaced (a.k.a. ***progressive***)
(In reply to comment #10)
> @Nemo_bis:
> Quoting Tim : Do not use interlaced (a.k.a. ***progressive***)

Sorry, I don't see how this answers my question. Do you mean that all the sequential, lossless etc. encodings there are OK (and why)?
I don't think lossless is OK. I stumbled upon some lossless JPEGs lately which could not be read by any program. (Sorry, but I can't remember the SOF tag.)
IMHO, a server either supports rendering huge progressive JPEGs, or it refuses them at upload time, or it converts them directly after uploading. With UploadWizard and some modern browsers you could even detect those files on the client side before uploading. VirusTotal, for example, computes a hash on the client before uploading a file in order to save server capacity, so it should be possible to read JPEG file headers there too.

Progressive JPEGs aren't created by digital cameras; they come from imaging software, where avoiding them is often just a matter of unchecking a check box. But the user has to know this. The current behaviour is NOT OK. I would be inclined to reopen this bug.
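The client-side detection suggested above could be very cheap, since only the bytes before the first start-of-scan marker need to be examined. A rough sketch (the helper names and the size threshold are made up for illustration, and the 0xFFC2 test covers only the most common progressive variant):

```python
# Sketch of a cheap pre-upload check: scan the JPEG header (everything
# before the first start-of-scan marker, 0xFFDA) for the common
# progressive frame marker SOF2 (0xFFC2). Hypothetical helpers, not
# actual MediaWiki/UploadWizard code; a real check should parse segment
# lengths to avoid false hits inside APP segments.
def looks_progressive(head: bytes) -> bool:
    sos = head.find(b"\xff\xda")            # entropy-coded data starts here
    header = head if sos == -1 else head[:sos]
    return b"\xff\xc2" in header

def check_upload(head: bytes, file_size: int,
                 threshold: int = 1 * 1024 * 1024) -> str:
    """Return 'ok' or a rejection reason; the threshold is illustrative."""
    if looks_progressive(head) and file_size > threshold:
        return "rejected: large progressive JPEG (please re-save as baseline)"
    return "ok"
```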
Created attachment 11219 [details] List of Commons non-baseline images above 5 MB Here's the first list I made with exiftool (27884 images above 5 MB).
Created attachment 11220 [details] List of Commons non-baseline images above 5 MB Better as explicit attachment for archiving.
Created attachment 11500 [details] List of 559678 Commons non-baseline images below 5 MB
Someone at Commons is now converting everything: https://commons.wikimedia.org/wiki/Commons:Bots/Work_requests#Convert_all_interlaced_JPGs This can't be the desired behaviour; come on, wake up.
It does indeed seem a bit much to convert files that are technically perfectly fine (their thumbnails are generated properly).
Since when does a single programmer get to set policy for the entirety of Wikimedia? There is a bug here. Even if progressive images take up more memory, the fact that the system is not waiting and allocating the correct amount of memory is a bug.

Progressive JPEGs are going to be uploaded whether you want them to be or not. Most images on Wikipedia are uploaded from the web, and most web JPEGs are progressive, as progressive JPEGs make smaller files. In fact, I personally have no intention of stopping using progressive JPEGs, since I've been using them since 2007 without incident. Lots of things editors do put a large memory load on the server; we aren't required to try to make it easier on the system. I've been using progressive JPEGs on Wikimedia for years, and I've not run into a problem. If I do, then maybe I'll convert, but not until then.

I'm not going to condone a programmer changing policy in order to avoid fixing a bug. And don't say you haven't changed policy: you put a demand on all Wikimedia users that they do a certain thing a certain way, even though the other way works. That's a policy change. It's even listed at the Commons Help:JPEG.
Left out something: if there is definitely a bug, you have two choices. You can try to fix it, or you can leave it open so that someone else can fix it. You do not close a legitimate bug by telling people that they are required to work around it. And this is a legitimate bug: there's no way the servers were coincidentally that close to capacity every time the bug reporter tried to generate the thumbnail. Either not enough memory is being allocated, or there's a bug that makes this file require far more memory than other progressive JPEGs which work just fine.
trlkly: No idea which "policy" you are talking about, but maintainers of a codebase are free to decide that they are actively against fixing a valid bug in the software if the fix would create side effects ("reduces performance both for the server and for clients such as browsers") that they consider worse.
Nope. Not in open source.

(In reply to comment #21)
> trlkly: No idea which "policy" thing you talk about, but maintainers of a
> codebase are free to decide that they are actively against fixing a valid bug
> in the software if this would create side effects ("reduces performance both
> for the server and for clients such as browsers") that they considered worse.

They are allowed to refuse patches if they think the patches have downsides, yes, but not to arbitrarily declare that all such patches must have that downside. And note the word "they" rather than "he": this was a single person making the decision, without even entertaining the idea that someone might have a way to handle it.

And, in fact, there are multiple ways around the issues he stated. There's no inherent reason that progressive JPEGs take longer to render than baseline JPEGs; it isn't the case in any modern software. Nor must they take up much more memory, because, unlike thumbnailing, converting between the two can be done without full decompression, so the memory requirements are as low as you can stand, at the cost of going back to the disk to read more of the file. Furthermore, thumbnailing a progressive JPEG often requires less of the JPEG to be rendered, since you only have to decode up to the resolution just above the thumbnail size. Progressive JPEGs essentially have their own thumbnails baked in.

There are multiple solutions that could deal with this problem without causing a significant drain on the system. Most of them came in after the guy arbitrarily closed the bug without waiting for ideas on how to mitigate the problems. A legitimate bug should be left open; closing it prevents anyone else from coming up with a solution that mitigates all the problems.
(In reply to comment #22)
> They are allowed to refuse patches if they think the patches have downsides,
> yes, but not to arbitrarily declare that all such patches must have that
> downside.
[...]
> And, in fact, there are multiple ways of getting around the issues he stated.
> There's no inherent reason that progressive JPEGs take longer to render than
> baseline JPEGs. It isn't the case on any modern software.

Have you brought this up with ImageMagick, then? You could also submit a patch to them, since you mention that. (Note there's also VIPS, but I don't think we ever use it for JPEG: https://blog.wikimedia.org/2013/09/12/vipsscaler-implementation-wikimedia-sites/ )
(In reply to comment #19)
> Since when does a single programmer get to set policy for the entirety of
> Wikimedia?

Since before it was called Wikimedia. That's not to say it's a good decision-making system. I'm happy to hear other opinions or for others to submit patches in this area.

> Progressive JPEGs are going to be uploaded whether you want them to be or
> not.

It's not ideal to have bots convert them. I would prefer it if they were rejected on upload.

> And don't say you haven't changed policy. You put a demand on all Wikimedia
> users that they do a certain thing a certain way, even though the other way
> works. That's a policy change. It's even listed at the Commons Help:JPEG.

Sure, changing policy is a hack, in the absence of a feature which would reject these files on upload. If they were rejected on upload, then we could set a threshold based on available server memory, instead of having bot authors guess at what that threshold should be.

(In reply to comment #22)
> Furthermore, thumbnailing a progressive JPEG often requires less of the JPEG
> to be rendered, since you only have to render up to the resolution just above
> the thumbnail. Progressive JPEGs essentially have their own thumbnails
> baked in.

Maybe if the browsers or the image scaling software we use took advantage of this, then you would have a point. But as it stands, it's not really a good subject for a bug against MediaWiki. It would be a good subject for a bug against ImageMagick.

> There are multiple solutions that could deal with this problem without
> causing significant drain on the system. Most of them came in after
> the guy arbitrarily closed the bug without waiting for ideas on how
> to mitigate the problems.

Everyone should feel free to submit ideas about bugs that are closed "WONTFIX".

> A bug should be left open if it is legitimate.

I think WONTFIX was an appropriate way to describe the situation.
> Closing the bug prevents
> anyone else from coming up with a solution that mitigates all problems.

By what mechanism? It's not like we're preventing comments on the bug, or telling upstream projects like libvips or ImageMagick to reject your patches.
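The "threshold based on available server memory" idea mentioned above could be sketched roughly as follows. A progressive JPEG decoder must hold the full DCT coefficient array in memory, on the order of width × height × components × 2 bytes, whereas baseline decoding can stream a strip at a time. All names and constants here are illustrative, not tuned Wikimedia values:

```python
# Rough sketch of an upload-time memory threshold for progressive JPEGs.
# A progressive decoder keeps all DCT coefficients in memory, roughly
# width * height * components * 2 bytes (plus overhead); baseline decoding
# can stream one MCU row at a time. The 400 MB limit is made up.
def progressive_decode_bytes(width: int, height: int, components: int = 3) -> int:
    """Estimated coefficient storage for decoding a progressive JPEG."""
    return width * height * components * 2

def allow_progressive_upload(width: int, height: int,
                             mem_limit_bytes: int = 400 * 1024 * 1024) -> bool:
    """Accept a progressive JPEG only if its decode estimate fits the limit."""
    return progressive_decode_bytes(width, height) <= mem_limit_bytes
```

With such a check, the limit could track actual image-scaler memory instead of bot authors guessing a file-size cutoff.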
(In reply to Tim Starling from comment #24)
> It's not ideal to have bots convert them. I would prefer it if they were
> rejected on upload.

From the usability point of view, that's horrible. I am happy when users understand what JPEG and PNG are at all. Coming from Facebook, they call everything a "pic", and if you reject progressive JPEGs with a message like "Progressive JPEGs must not be uploaded here; use baseline instead because it's better for our servers", I am sure we will succeed in confusing 90% of the new uploaders who receive it.

BTW, do we still use ImageMagick for JPEGs, or VIPS?