Last modified: 2013-03-29 13:42:02 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from pages displaying bug reports and their history, links might be broken. See T30188, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 28188 - Can't upload PDF / ODF Hybrid
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Uploading
Version: 1.18.x
Hardware: All All
Importance: Normal normal with 2 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: 41037
Reported: 2011-03-22 15:55 UTC by fun-stuff
Modified: 2013-03-29 13:42 UTC
CC List: 8 users



Attachments
Test PDF/ODF (ODT) Hybrid Document (38.34 KB, application/force-download)
2011-03-22 17:55 UTC, fun-stuff

Description fun-stuff 2011-03-22 15:55:35 UTC
The new LibreOffice supports exporting PDFs in a hybrid ODF / PDF file format. When I try to upload such a file, MediaWiki reports that the ZIP file is ambiguous or has been damaged. (I have a German installation, so I can't tell you the exact error message.)

I already took ZIP files out of the MediaWiki Blacklist and added the file extensions PDF, ODT and ZIP.

I think includes/ZIPDirectoryReader.php checks the file and throws the error because it doesn't know the new file format yet.

The new PDF / ODF hybrid format makes it easy to open documents for everyone while maintaining the possibility to edit them which might also be a great thing for Wikipedia. Therefore, this is a major bug for me.
Please fix this and thanks for the great software.

Tobias
Comment 1 Max Semenik 2011-03-22 15:57:56 UTC
Can you attach a sample file or provide a link to it?
Comment 2 fun-stuff 2011-03-22 17:55:01 UTC
Created attachment 8320
Test PDF/ODF (ODT) Hybrid Document

Can be edited with LibreOffice 3.3 Writer and viewed with any PDF viewer. But it cannot be uploaded in MediaWiki 1.18alpha.
Comment 3 Max Semenik 2011-03-22 18:32:10 UTC
The cause is "ZipDirectoryReader: Fatal error: trailing bytes after the end of the file comment".

In simple words, we expect zip files to be... zip files and not contain something scary. We need to hack our detector to handle zips embedded in something known.
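
A minimal sketch of the kind of check that produces this error, written in Python for illustration and not taken from MediaWiki's PHP code. It assumes the standard ZIP layout, in which the end-of-central-directory record (signature `PK\x05\x06`, 22 fixed bytes plus an optional comment) is the last thing in the file. The function and variable names are hypothetical, and the appended bytes only imitate a file with extra data after the ZIP part; they are not a real hybrid PDF.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"  # end-of-central-directory record signature
EOCD_FIXED_SIZE = 22            # size of that record without the comment


def trailing_bytes_after_zip(data: bytes) -> int:
    """Return how many bytes follow the end of the ZIP file comment.

    Raises ValueError if no end-of-central-directory record is found.
    """
    pos = data.rfind(EOCD_SIGNATURE)
    if pos == -1 or pos + EOCD_FIXED_SIZE > len(data):
        raise ValueError("no end-of-central-directory record found")
    # The last two bytes of the fixed part hold the comment length (little-endian).
    (comment_len,) = struct.unpack_from("<H", data, pos + 20)
    return len(data) - (pos + EOCD_FIXED_SIZE + comment_len)


def make_zip() -> bytes:
    """Build a tiny in-memory ZIP archive to test against."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("content.xml", "<doc/>")
    return buf.getvalue()


clean = make_zip()
with_extra_data = clean + b"\n%%EOF\n"  # stand-in for data appended after the ZIP

print(trailing_bytes_after_zip(clean))            # 0
print(trailing_bytes_after_zip(with_extra_data))  # 7
```

A strict detector rejects any non-zero result, which is the behaviour described above. Accepting such files would mean deciding which trailing data is harmless.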
Comment 4 fun-stuff 2011-03-23 11:46:28 UTC
Thanks for the clarification and for taking care of the problem so quickly. I hope you can fix this bug in the near future.
Comment 5 Mark A. Hershberger 2011-04-26 03:34:01 UTC
From comments in triage:

"Workaround: 'don't save your PDF that way'. (Problem with workaround: if someone else made the file, you might not know how to re-save it.)"

So, we thought about dealing with it: "This presents same security threats as a PDF file.... need to check security model, probable threats."

"Our security checks are working as intended by detecting that the files have been smashed together unexpectedly. Might be possible to tweak it to consider 'oh that's ok' but not sure how much we want to. If not careful might accidentally allow all sorts of evil appended to a PDF file."
Comment 6 fun-stuff 2011-04-26 09:14:24 UTC
Thanks for the comments.

I can imagine that deciding whether this is an 'OK' PDF file saved as hybrid ODF or not is difficult to code. However, I think it would be a great loss if this wasn't implemented as this format is so versatile.
Comment 7 Dovi Jacobs 2013-02-21 09:17:35 UTC
Hi, I asked about this problem here (and was referred to this bug):
http://commons.wikimedia.org/wiki/Commons:Village_pump#Uploading_embedded_PDFs_created_through_LibreOffice

The embedded PDF is an extremely useful file format, and one of the best features in the open source LibreOffice project. It is becoming extremely popular and is already being used in hundreds of millions of files around the world.

Therefore, I'd like to reiterate the comment before mine, which was made nearly two years ago: "I think it would be a great loss if this wasn't implemented as this format is so versatile."

If that was true two years ago, it is far more true today. I hope it can be made a basic part of PDF support in Wikimedia projects.
Comment 8 Dovi Jacobs 2013-03-29 13:42:02 UTC
In the meantime I've been uploading classic texts and educational materials at Internet Archive instead of at the Commons:
http://commons.wikimedia.org/wiki/Category:Talmud_(digital_text_vowelized_and_formatted)

This is extremely inconvenient for proper use at Wikimedia projects. I hope this will be taken care of eventually.
