Last modified: 2010-05-15 16:03:39 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T17863, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 15863 - Upload: can't save file names with special characters to ntfs filesystem
Status: RESOLVED DUPLICATE of bug 1780
Product: MediaWiki
Classification: Unclassified
Component: File management
Version: 1.13.x
Hardware: PC Windows Server 2003
Importance: Normal normal
Target Milestone: ---
Assigned To: Chad H.
Depends on:
Blocks:
Reported: 2008-10-06 12:56 UTC by Paolo Benvenuto
Modified: 2010-05-15 16:03 UTC (History)


Attachments
Paolo's patch (450 bytes, patch)
2008-10-06 16:07 UTC, Platonides

Description Paolo Benvenuto 2008-10-06 12:56:48 UTC
I'm using MediaWiki on Windows Server 2003.

When I upload a file and tell MediaWiki to store it under a file name with special characters (e.g. accented characters such as à, é, or ñ), the file name is stored incorrectly: à becomes Ã followed by another character (ï, I think).

It looks like a UTF-8 to ISO-8859-1 conversion problem (or the reverse).

I think it's because NTFS stores file names in the ISO-8859-1 charset, so when MediaWiki passes the file name in UTF-8, NTFS interprets it as an ISO-8859-1 string.

Experimenting on my own with uploading and saving a file from a PHP page, I found a solution:

If the upload ends with a call like

copy ( $tempfile , $filename ) ;

you should change it to

copy ( $tempfile , utf8_decode ( $filename ) ) ;

That seems to eliminate the problem on Windows Server 2003.
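
The symptom described above can be reproduced outside PHP. The following is a minimal sketch in Python, not MediaWiki code, with an illustrative file name. It assumes the single-byte code page in play is Windows-1252, and uses a Latin-1 encode as a rough analogue of utf8_decode:

```python
# Illustrative only: reproduces the garbling outside PHP and MediaWiki.
name = "à.png"

utf8_bytes = name.encode("utf-8")      # the bytes a UTF-8 application passes along
garbled = utf8_bytes.decode("cp1252")  # the same bytes read as Windows-1252 (assumed code page)
latin1_bytes = name.encode("latin-1")  # rough analogue of PHP's utf8_decode()

print(utf8_bytes)     # b'\xc3\xa0.png'
print(repr(garbled))  # 'Ã\xa0.png'
print(latin1_bytes)   # b'\xe0.png'
```

Under that assumption the two UTF-8 bytes of à show up as Ã followed by a non-breaking space, while the Latin-1 form is the single byte the code page expects.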
Comment 1 Chad H. 2008-10-06 14:30:03 UTC
Confirmed in trunk. Also, where did you find:

> copy ( $tempfile , $filename );

Can't find this in trunk :)
Comment 2 Platonides 2008-10-06 16:07:42 UTC
Created attachment 5393
Paolo's patch

It's in FileStore.php; the spaces aren't in trunk.
I'm attaching it as a patch, but I'm sure utf8_decode would need to be added in other places as well: filerepo/FSRepo also performs several operations directly on the filesystem, as does thumb.php...
Comment 3 Fran Rogers 2008-10-06 16:46:04 UTC
The utf8_decode solution wouldn't work - this function only converts to ISO-8859-1, and all filenames in non-Latin scripts would be completely corrupted.
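
To illustrate the objection, here is a small Python sketch (an analogue, not PHP itself) of what a conversion to ISO-8859-1 does to names outside that character set. The helper name is hypothetical, and the sketch assumes unrepresentable characters are replaced with '?', which is my understanding of how utf8_decode behaves:

```python
def to_latin1_lossy(name: str) -> bytes:
    """Hypothetical helper: convert to ISO-8859-1, replacing anything unrepresentable with '?'."""
    return name.encode("latin-1", errors="replace")

print(to_latin1_lossy("café.png"))    # b'caf\xe9.png'
print(to_latin1_lossy("東京.png"))     # b'??.png'
print(to_latin1_lossy("Москва.png"))  # b'??????.png'
```

Besides losing the original name, any two non-Latin names of the same length would collide on the same stored file name.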

NTFS actually uses Unicode internally. The problem lies in PHP, which naïvely assumes all filenames are eight-bit strings. That holds on Unix, but Windows uses wide-character strings, and a separate call, _wfopen(), is used to access Unicode filenames on Win32. Until PHP gains proper Unicode support (currently scheduled for right after porcine flight is achieved), the only solution I can think of is for MediaWiki to mangle non-ASCII characters in the filename in a predictable, round-trippable way.
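
One possible shape for such a mangling, sketched in Python purely as an illustration: the scheme and the names mangle and unmangle are hypothetical and are not something MediaWiki is known to implement. ASCII characters are kept, and every other UTF-8 byte is written as %XX, with '%' itself escaped so the mapping can be reversed:

```python
def mangle(name: str) -> str:
    """Hypothetical scheme: keep ASCII as-is, write every other UTF-8 byte as %XX."""
    out = []
    for byte in name.encode("utf-8"):
        if byte < 0x80 and byte != ord("%"):
            out.append(chr(byte))
        else:
            # '%' itself is escaped too, so decoding stays unambiguous.
            out.append(f"%{byte:02X}")
    return "".join(out)


def unmangle(mangled: str) -> str:
    """Reverse mangle(): turn each %XX back into a byte, then decode as UTF-8."""
    raw = bytearray()
    i = 0
    while i < len(mangled):
        if mangled[i] == "%":
            raw.append(int(mangled[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(mangled[i]))
            i += 1
    return raw.decode("utf-8")


original = "Señor à la carte 100%.png"
stored = mangle(original)
print(stored)                        # Se%C3%B1or %C3%A0 la carte 100%25.png
print(unmangle(stored) == original)  # True
```

The stored name is pure ASCII, so it no longer depends on how the filesystem API interprets eight-bit strings, at the cost of less readable names on disk.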
Comment 4 Paolo Benvenuto 2008-10-06 17:50:59 UTC
(In reply to comment #1)
> Confirmed in trunk. Also, where did you find:
> 
> > copy ( $tempfile , $filename );
> 
> Can't find this in trunk :)

No, I just assumed the file is saved with some instruction similar to that; it could also be a rename or something else.

I didn't submit a patch because I don't know MediaWiki's code well.

Comment 5 Brion Vibber 2008-10-06 17:58:31 UTC
Duping to bug 1780

*** This bug has been marked as a duplicate of bug 1780 ***
