Last modified: 2014-11-04 22:53:01 UTC
I brought this up at Brion's "Extending MediaWiki" chat and was encouraged to
file a bug report for it here. The stated problem is that an image dump contains
no metadata concerning any of the images. Now that we have some level of EXIF
integration, it seems logical to put image author, copyright, and description
information into the appropriate EXIF fields. That is, editing an image page
would make an actual edit to the image itself. This no doubt raises many issues,
such as having wikitext inside an EXIF field and changing the md5 sums of the
images (which I believe is roughly how they are stored in their directories),
but it also solves a very large problem: our image dumps are distributed with
absolutely no metadata, which encourages irresponsibility in their use and could
be illegal, as many of the images are used by us as fair use.
Corollary to bug 657 http://bugzilla.wikimedia.org/show_bug.cgi?id=657
(In reply to alterego from comment #0)
> an image dump contains no metadata concerning any of the images
What are ways to reproduce the problem nowadays? How does one get an "image dump" in 2014, so to say? Is this still a problem?
Do the EXIF data about images contain copyright information etc? If not, the bug should be left open, and probably elevated in importance.
Errm, I'm a bit confused by the counter questions.
Could you answer comment 2, please?
Plus this is de-facto low priority and not planned for a future release, until somebody provides a patch. Resetting Target Milestone and priority to previous values which seem more realistic.
Ok to clarify:
*Files are not stored by their md5sum; it's the md5sum of the file *name*. Deleted files do use their sha1 sum as file name.
*However, we still make the assumption pretty much everywhere that each version of the file has a constant sha1 sum, i.e. is bit-for-bit identical. So any change must be a reupload.
*The file versioning code is not well adapted to having an excessively large number of versions of a file. (If each edit became a pseudo new upload, it would probably explode if someone made 5000 edits, especially to a large file.)
*To do this automatically (or perhaps to have malleable metadata included with the dump), it might be easier once Wikidata hits Commons.
*The most likely solution, at least in the meantime, I think, would be to have an extension hook up to exiftool, which allows people to modify EXIF on the server side, triggering an upload (perhaps with a button to import data from the wiki page). This wouldn't be as quick as total automation, but it would be something, and more easily turned off if there is an issue.
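To illustrate the first two points, here is a rough Python sketch (not MediaWiki's actual code). It assumes the usual hashed upload layout, where the first one and first two hex characters of the md5 of the file name form the directory levels, and that spaces in the name are turned into underscores first. The function names are made up for this example.

```python
import hashlib


def hashed_upload_path(file_name: str) -> str:
    """Return a relative path like 'x/xy/Name.jpg' derived from the file *name*."""
    # Assumption: spaces are normalized to underscores before hashing.
    key = file_name.replace(" ", "_")
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return f"{digest[0]}/{digest[:2]}/{key}"


def content_sha1(data: bytes) -> str:
    """Return the hex sha1 of the file *contents*."""
    return hashlib.sha1(data).hexdigest()


original = b"fake image bytes"
edited = original + b" plus embedded metadata"

# The path depends only on the name, so it stays the same whatever the bytes are.
print(hashed_upload_path("Example.jpg"))
# The sha1 depends on the bytes, so rewriting the metadata yields a new version.
print(content_sha1(original) != content_sha1(edited))
```

So embedding metadata would not move the file on disk, but it would break the "each version is bit-for-bit identical" assumption unless the change is treated as a reupload.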
Re Andre: well, we don't really have image dumps anymore (afaik, which is sad), but the bug equally applies to people reusing our images in any form, or just wget'ing them off the server. The original poster wants the data from the image's wiki page to be embedded directly in the file so that the data cannot be separated from the file (without malicious intent), whereas currently it's common for reusers to lose this data if they don't care.
I agree this would be nice, think it may be difficult to do (fully) given our current infrastructure, and ultimately is a low priority compared to other more pressing issues we have with media files.
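For the exiftool idea above, a minimal sketch of what the server-side call might look like, in Python rather than as a real extension. The function name and the choice of exactly these three tags are assumptions for illustration; `-TAG=VALUE` and `-overwrite_original` are standard exiftool syntax. The sketch only builds the argument list and does not run exiftool.

```python
from typing import List


def build_exiftool_args(path: str, artist: str, copyright_notice: str,
                        description: str) -> List[str]:
    """Build an exiftool command line that writes author, copyright and description."""
    return [
        "exiftool",
        "-overwrite_original",  # do not leave a "_original" backup copy behind
        f"-EXIF:Artist={artist}",
        f"-EXIF:Copyright={copyright_notice}",
        f"-EXIF:ImageDescription={description}",
        path,
    ]


args = build_exiftool_args(
    "Example.jpg",
    "Some Uploader",       # placeholder values, not taken from any real file page
    "CC BY-SA 3.0",
    "An example image",
)
print(args)
```

Passing the list to something like `subprocess.run(args, check=True)` avoids shell quoting problems with user-supplied text. The rewritten file would then still have to go through the normal reupload path, since its sha1 changes.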
(In reply to Bawolff (Brian Wolff) from comment #5)
> I agree this would be nice, think it may be difficult to do (fully)
Difficult as in time-consuming or as in really complex? I'm thinking about whether this could become a GSoC project idea one day (not in the current round).
Very complex to do it fully (the original request of automatically recording edits into image metadata). Doing it in a somewhat superficial manner (just having an on-wiki interface to edit metadata) might potentially be GSoC-worthy (kind of like a continuation of my GSoC project from 2010).