Last modified: 2014-11-04 22:53:01 UTC

Bug 3361 - Image author, description, and copyright data saved in EXIF fields
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Uploading
Hardware: All All
Importance: Low enhancement with 6 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: testme
Reported: 2005-09-04 22:33 UTC by alterego
Modified: 2014-11-04 22:53 UTC (History)
CC List: 6 users


Description alterego 2005-09-04 22:33:26 UTC
I brought this up at Brion's "Extending MediaWiki" chat and was encouraged to
file a bug report for it here. The stated problem is that an image dump contains
no metadata concerning any of the images. Now that we have some level of EXIF
integration, it seems logical to put image author, copyright, and description
information into the appropriate EXIF fields. E.g., editing an image page makes
an actual edit to the image itself. This no doubt raises many issues, such as
having wikitext inside an EXIF field and changing the md5 sums of the images
(which I believe is roughly how they are stored in their directories), but it
also solves a very large problem: principally, that our image dumps are distributed with absolutely
no meta-data, which encourages irresponsibility in their use and could be
illegal, as many of them are used by us as fair use.
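
As a rough illustration of the request above, here is a minimal sketch of how the three pieces of information could be written into standard EXIF tags. It assumes the exiftool command-line program (suggested later in this thread) is the tool doing the writing. The helper name `build_exiftool_args` and the sample values are hypothetical, and the sketch only builds the command, it does not run it.

```python
import shlex


def build_exiftool_args(image_path, author, copyright_notice, description):
    """Build an exiftool command that writes three standard EXIF tags.

    Hypothetical helper for illustration; the values would come from the
    image page in the scenario described above.
    """
    return [
        "exiftool",
        "-overwrite_original",  # do not keep a "_original" backup copy
        f"-Artist={author}",
        f"-Copyright={copyright_notice}",
        f"-ImageDescription={description}",
        image_path,
    ]


# Sample values are made up for this sketch.
args = build_exiftool_args(
    "Example.jpg", "alterego", "CC BY-SA 3.0", "A sample image"
)
print(shlex.join(args))
```

Passing the list to `subprocess.run(args, check=True)` would perform the write if exiftool is installed. Building the command as a list rather than a single string avoids shell quoting problems with free-form descriptions.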
Comment 1 alterego 2005-09-04 22:34:14 UTC
Corollary to bug 657
Comment 2 Andre Klapper 2014-02-24 18:07:54 UTC
(In reply to alterego from comment #0)
> an image dump contains no metadata concerning any of the images

How can the problem be reproduced nowadays? How would one get an "image dump" in 2014, so to speak? Is this still a problem?
Comment 3 alterego 2014-02-24 19:47:02 UTC
Do the EXIF data about images contain copyright information etc? If not, the bug should be left open, and probably elevated in importance.
Comment 4 Andre Klapper 2014-02-24 21:49:50 UTC
Errm, I'm a bit confused by the counter questions. 
Could you answer comment 2, please?

Plus, this is de facto low priority and not planned for a future release until somebody provides a patch. Resetting Target Milestone and priority to previous values, which seem more realistic.
Comment 5 Bawolff (Brian Wolff) 2014-02-24 22:08:48 UTC
Ok to clarify:
*files are not stored by their md5sum; it's the md5sum of the file *name*. Deleted files do use their sha1 sum as file name.
*however, we still make the assumption pretty much everywhere that each version of the file has a constant sha1 sum/is bit-for-bit identical. So any change must be a reupload.
*the file versioning code is not well adapted to having an excessively large number of versions of a file. (If each edit became a pseudo new upload, it would probably explode if someone made 5000 edits, especially to a large file.)
*to do this automatically (or perhaps to have malleable metadata included with the dump), it might be easier once Wikidata hits Commons.
*the most likely solution, at least in the meantime, I think, would be to have an extension hook up to exiftool, which allows people to modify EXIF on the server side, triggering an upload (perhaps with a button to import data from the wiki page). This wouldn't be as quick as total automation, but it would be something, and more easily turned off if there is an issue.
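
To make the first two points concrete, here is a minimal sketch of the difference between a path derived from the file name and a hash of the file content. The two-level directory layout (first one and first two hex characters of the md5) is an assumption about the hashed upload directories, and the function names and byte strings are hypothetical stand-ins for real files.

```python
import hashlib


def hashed_upload_path(file_name):
    """Path derived from the md5 of the file *name*, not its content.

    The "a/ab/Name.jpg" layout is assumed here for illustration.
    """
    digest = hashlib.md5(file_name.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}/{file_name}"


def content_sha1(data):
    """sha1 of the file content, which identifies one exact version."""
    return hashlib.sha1(data).hexdigest()


# Stand-in byte strings; real image files would be read from disk.
original = b"\xff\xd8 image bytes without metadata"
edited = b"\xff\xd8 image bytes with Artist=alterego"

# The storage path depends only on the name, so it is the same for both.
print(hashed_upload_path("Example.jpg"))

# The content hash changes as soon as any byte (including metadata) changes.
print(content_sha1(original) == content_sha1(edited))
```

Under these assumptions, writing metadata into the file would leave its location untouched but change its sha1, which is why any such change has to be treated as a reupload of a new version.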

Re Andre: well, we don't really have image dumps anymore (afaik, which is sad). The bug equally applies to people reusing our images in any form, or just wget'ing them off the server. The original poster wants the data from the image wiki page to be directly embedded in the file so that the data cannot be separated from the file (without malicious intent), whereas currently it's common for reusers to lose this data if they don't care.

I agree this would be nice, think it may be difficult to do (fully) given our current infrastructure, and ultimately is a low priority compared to other more pressing issues we have with media files.
Comment 6 Quim Gil 2014-02-25 19:07:52 UTC
(In reply to Bawolff (Brian Wolff) from comment #5)
> I agree this would be nice, think it may be difficult to do (fully)

Difficult as in time-consuming or as in really complex? I'm thinking about whether this could become a GSoC project idea one day (not in the current round).
Comment 7 Bawolff (Brian Wolff) 2014-02-25 19:12:17 UTC
Very complex to do it fully (the original request of auto-recording edits into image metadata). Doing it in a somewhat superficial manner (just having an on-wiki interface to edit metadata) might potentially be GSoC-worthy (kind of like a continuation of my GSoC project from 2010).
