Last modified: 2014-09-09 17:41:06 UTC
Currently, the UploadWizard accepts an upper-case file name extension (e.g. .JPG), does not change it, and does not allow the user to change it either (see bug 34703). It is therefore possible to upload two files, one ending in .jpg and the other in .JPG. Elsewhere on Commons, however, such as in the RenameLink gadget or the tool used by filemovers, .jpg and .JPG are considered the same, and the tools won't allow renaming between these two variants. This makes it impossible to restore consistency when a series of pictures has been uploaded using one case and one or a few of them accidentally use the other. All these tools seem to share a common "file name sanitization" module, as reported at http://commons.wikimedia.org/wiki/User_talk:Rillke#RenameLink_forces_case_in_file_extension. As evidenced at https://commons.wikimedia.org/wiki/User_talk:Blahma#File:Brno.2C_T.C3.A1bor_15.jpg, file movers need to rename files manually if they are asked to change the letter case of a file extension, such as renaming "foo.JPG" to "foo.jpg". File names should be treated the same way everywhere on Commons, so if file name extensions get "normalized" by file moving tools, the same "sanitization" should be performed in the UploadWizard.
There are some checks in place, but they only work if the existing file has the normalized extension, i.e., a warning will be given if you try to upload X.JPG and X.jpg exists, but not the other way around. It is probably a good idea to just give a warning if the same filename exists with any extension. That would need support in FileRepo to search files without extension. I'll have a look at it.
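To illustrate the "same filename with any extension" idea, here is a minimal Python sketch. It is not the FileRepo code and not the merged patch; `find_conflicts` and the plain list of existing names are hypothetical stand-ins for whatever lookup the repository would provide. It assumes the part before the extension is compared case-sensitively.

```python
import os


def find_conflicts(new_name, existing_names):
    """Return existing names that share new_name's base name.

    The extension (and its letter case) is ignored, so "X.JPG",
    "X.jpg" and "X.png" all conflict with one another.
    Hypothetical helper for illustration only.
    """
    new_base = os.path.splitext(new_name)[0]
    return [
        name
        for name in existing_names
        if os.path.splitext(name)[0] == new_base
    ]


if __name__ == "__main__":
    existing = ["X.JPG", "Y.jpg", "X.png"]
    # Uploading X.jpg should warn about both X.JPG and X.png.
    print(find_conflicts("X.jpg", existing))
```

Because the comparison drops the extension on both sides, the check works in both directions: uploading X.JPG when X.jpg exists is caught just like the reverse.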
I21eddc5d
Thank you, Bryan, for your quick response and action. It would indeed be nice to see a warning when a similarly named file already exists. However, this does not solve the problem completely, because the Upload Wizard will go on implying that "file names differing only in extension case are possible", while the tools that might be used afterwards (file renaming) imply that "file name extensions must be normalized as if they were case-insensitive". The confusion will therefore persist unless a consensus on this is reached across Commons. Perhaps we should invite more people into this discussion? The easiest solution is, obviously, to equip the Upload Wizard with the same "sanitization" mechanism that is already used by the file mover, but I understand that this is not something you are willing to do at the moment, am I right?
I'm not aware of any software restrictions on file moving that perform the normalizations you describe. As far as I am aware it is possible to move a file to A.JPG if the file A.jpg already exists.
I have found the "cleanFileName" function from http://commons.wikimedia.org/wiki/MediaWiki:Gadget-AjaxQuickDelete.js being called as part of https://commons.wikimedia.org/wiki/MediaWiki:RenameRequest.js, which holds the code for the RenameLink gadget. Indeed, if you are not a file mover and you visit a file page and click the "Move" tab, the gadget's dialog appears. If you then change the value in the "Enter the new name" field to "A.JPG" and leave that field by focusing another one, cleanFileName gets called and the field's value is automatically normalized to "File:A.jpg". Yes, I could insert the Rename template manually, but the gadget seems to be more efficient. I am not a filemover, so I cannot check this myself, but User:Taketa has suggested at http://commons.wikimedia.org/w/index.php?title=User_talk%3ABlahma&diff=77628385&oldid=77600880 that the same normalization occurs in the actual filemoving interface (and is the reason .JPG and .jpg are considered "identical"). Could someone please recheck this?
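For anyone who wants to reproduce the observed behaviour outside the gadget, the extension part of it can be sketched roughly like this in Python. This is not the gadget's cleanFileName source, only a guess at the effect described above; `normalize_extension` is a hypothetical name, and the sketch does not add the "File:" prefix.

```python
def normalize_extension(title: str) -> str:
    """Lower-case only the extension of a file title.

    Everything before the last dot is left untouched. Titles without
    an extension (or starting with their only dot) are returned as-is.
    Hypothetical helper for illustration only.
    """
    base, dot, ext = title.rpartition(".")
    if not dot or not base:
        return title
    return f"{base}.{ext.lower()}"


if __name__ == "__main__":
    for title in ("A.JPG", "File:Foo.JPG", "foo.jpg", "NoExtension"):
        print(title, "->", normalize_extension(title))
```

With a rule like this in place, "foo.JPG" and "foo.jpg" map to the same target, which would explain why a rename between the two is treated as a no-op.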
(In reply to comment #2)
> I21eddc5d

Assigning bug to author (of Gerrit change #24124), +patch-in-gerrit
Is this fixed? The patch got merged.
(In reply to comment #1)
> There are some checks in place, but they only work if the existing file has
> the normalized extension, i.e., a warning will be given if you try to upload
> X.JPG and X.jpg exists, but not the other way around.
>
> It is probably a good idea to just give a warning if the same filename exists
> with any extension. That would need support in FileRepo to search files
> without extension. I'll have a look at it.

People seem to want to be able to do that though. See bug 46741
Looks like this is just about done - I don't think it's necessary to change the filenames to have the same extensions in UW. I got the warning reliably by uploading a .jpg and .JPEG with the same name.