Last modified: 2011-10-15 22:02:51 UTC
Upload an image to a MediaWiki instance. Replace it with a different image. The page you get back lists both the current and archived version, and 120px thumbnails for both images. Delete the archived version. The original is gone, but the thumbnail remains. I found this in r84671, but AaronSchulz was able to reproduce it in a current version. Could this be related to b28613 ?
Can you reproduce this at will? I.e. if I follow these steps does the thumbnail always stick around?
Yes, and on a completely vanilla MediaWiki install. So it has nothing to do with networking and nothing to do with caching. At least this bug anyway. Can't say that about b28613.
Unless I'm missing something, Bug 28613 doesn't seem to be talking about removing old versions, though.
A test case: http://de.wikipedia.org/wiki/Datei:Logo_African_Pygmy_Goat.png uploaded a new image ("test copyvio") over this file. after reverting the second (copyvio) version had the thumburl: http://upload.wikimedia.org/wikipedia/de/thumb/archive/5/51/20110803213020!Logo_African_Pygmy_Goat.png/120px-Logo_African_Pygmy_Goat.png I also accessed this thumb: http://upload.wikimedia.org/wikipedia/de/thumb/archive/5/51/20110803213020!Logo_African_Pygmy_Goat.png/126px-Logo_African_Pygmy_Goat.png ------ then I did revision delete (hide) the file contents of the second version: Both above mentioned thumb URLs still work. → Bug. Thumb URLs which were not accessed while the file version was visible do not work (example: http://upload.wikimedia.org/wikipedia/de/thumb/archive/5/51/20110803213020!Logo_African_Pygmy_Goat.png/131px-Logo_African_Pygmy_Goat.png ) Even if I delete the file completely (I did temporarily) the archive thumbs still keep working. →Bug. Only the current version's thumbs do not work. However, in order to assess the severity of this bug: An "attacker" needs to know how mediawiki's thumb URLs for archive versions are constructed (those parts: archive , 20110803213020!) since the thumb URL is not anymore on the file's page (also not in file page's source code). And he needs to know the timestamp (easy to find out in the log or file page's html source). And even if a nerd did construct the correct thumb URL he can only access the thumbs which were generated before deletion. Typically this is only the 120px version which is tinytinytiny. Conclusion: * Speaking of copyvios this bug is not important. * Speaking of hard privacy violations this bug is important - I do not know how to get rid of the old thumbs. Maybe a server admin would need to delete them manually if a important privacy violation would happen. However - this would only matter if the privacy violation is in a non-current file version. Well, this easily happens if a vandal overwrites a file (preferably a file which is in high use) with a picture of his ex-girlfriend (or whatever). If the file is reverted then it is the non-current version. * All in all (due to the privacy problem) I think this is a bad bug which really should be fixed.
We'll try to get this fixed before the deployment of 1.18
An attacker could use this to redistribute an image. Upload image A.jpg. Make a thumbnail of it one pixel smaller than the original, 1023px-A.jpg. Overwrite it with a copyright violation. Record the archived thumb name. Complain about the copyright violation and ask for it to be deleted. A.jpg gets deleted, but 1023px-A.jpg remains at its archived URL, but is not in the database.
LocalFile::purgeHistory() and LocalFile::purgeCache() should call purgeThumbnails() on $this->getHistory().
Created attachment 8885 [details] Patch to purge old thumbnails from localfiles Like this?
That doesn't fix it for me. http://www.ra-tes.org/phase3/images/thumb/archive/e/e3/20110804215323!2011-07-13_17-05-24_808.jpg/120px-2011-07-13_17-05-24_808.jpg is still present even though http://www.ra-tes.org/phase3/index.php/File:2011-07-13_17-05-24_808.jpg has been deleted.
Yes, my question was just asking if such a patch would be what Bryan was talking about. I haven't even committed it, yet.
The patch seems to be what Bryan was suggesting.
Resolved in r94212. No need to duplicate the purging in purgeCache(), you end up calling purgeHistory() anyway :)
Created attachment 8920 [details] Uses twill to test filerepo functions.
Hrm. Doesn't pass my mediawiki automated filerepo test (uploaded. Run it on a wiki (modify the URL in the source) with your username and password on the command line. Yes, yes, I know, but these are test wikis with a junk password.) http://www.ra-tes.org/phase3/images/c/c6/Test-1313461285.08.png http://www.ra-tes.org/phase3/images/thumb/c/c6/Test-1313461285.08.png/120px-Test-1313461285.08.png http://www.ra-tes.org/phase3/images/archive/c/c6/20110816021516%21Test-1313461285.08.png http://www.ra-tes.org/phase3/images/thumb/archive/c/c6/20110816021516%21Test-1313461285.08.png/120px-Test-1313461285.08.png The first three 404, but the last one still works.
*** Bug 30535 has been marked as a duplicate of this bug. ***
I've got my head down in this code anyway because of SwiftMedia. I'll see if I can find it. It's kinda in my way anyway.
Ouch. You don't even have to delete the entire file. All you need to do is delete an archived version; the thumbnail isn't deleted.
@Chad H noted in an email on wikitech that these files need better function names and comments. He's right. Also, as a cleanup goal, I think functions only used in the one file should be marked private, and the API functions explicitly marked public.
There's two cases where thumbs should be deleted: 1) when a new version of the file is uploaded (to get rid of thumbs that belong to an archived version of the file), and 2) when a: 2a) current or 2b) archived version of a file is deleted. Part (2b) has no code to implement it right now. It's not that it's being done incorrectly; it's not even being attempted. I think the most appropriate locus for deleting stale thumbs is to delete them when a file is deleted. The other code which deletes current thumbs covers other cases and is still necessary.
(In reply to comment #18) > I think functions only > used in the one file should be marked private, and the API functions explicitly > marked public. > +1. I was thinking that too when I was looking at this earlier in the week.
There's also a mix of functions dedicated to *Archive* access, and OldLocalFile, which derives from LocalFile with functions that already know that they're accessing Archived files. A cleanup would eliminate the *Archive* functions ... but that also means creating a File any time you want to do anything to a file. And means calling RepoGroup::singleton()->getLocalRepo()->newFile($title); (with an additional time parameter if it's an archive file. And that means parsing the file at the ! to get the time to pass to newFile ... the complexity keeps piling up, all to simplify? I'm not so sure, but I also don't like the lack of regularity and lack of information hiding.
Created attachment 8992 [details] Seems to fix the bug This patch fixes the bug, and adds some useful helper functions to filerepo/File. It also recognizes that sometimes "$suffix" is really "$archiveName" and isn't in fact optional. I'm uploading it here rather than fixing it directly because I haven't tested it on the trunk wiki code yet. It works on both SwiftMedia (once I check in my patches), and an older slightly pre-1.17 on which I discovered the bug.
Created attachment 9021 [details] Whitespace issues fixed Patch looks alright, and seems to fix the deletion of archived thumbnail images It does however, leave the "20110411224401!Blah.png" folder, which IMHO, if we're deleting the files, hence making the directory empty, we should then delete the directory
Patch applied with minor tweaks in r97373.
Make that r96373. Oh, and Roan noticed that I'd changed the API to getArchive{Url|Path}. I changed it back in r96399
(In reply to comment #25) > Make that r96373. > > Oh, and Roan noticed that I'd changed the API to getArchive{Url|Path}. I > changed it back in r96399 Credit where credit is due: our automated test runner (CruiseControl) noticed this :)
tagging bugs for Marcus to look at