Last modified: 2005-07-23 06:41:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T3972, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 1972 - Serve files as UTF-8
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Bugzilla
Version: unspecified
Hardware: All All
Importance: Normal minor
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://bugzilla.wikimedia.org/attachm...
Reported: 2005-04-25 10:16 UTC by Ævar Arnfjörð Bjarmason
Modified: 2005-07-23 06:41 UTC (History)

Description Ævar Arnfjörð Bjarmason 2005-04-25 10:16:54 UTC
It's annoying to have to manually switch settings when viewing attachments.
Comment 1 Ævar Arnfjörð Bjarmason 2005-04-27 21:44:46 UTC
Actually, they aren't being served with any specific character set; changing the
summary to reflect this.

"""
$ printf "GET /attachment.cgi?id=455&action=view HTTP/1.0\nHost: bugzilla.wikimedia.org\n\n"|nc bugzilla.wikimedia.org 80|head
HTTP/1.1 200 OK
Date: Wed, 27 Apr 2005 21:42:38 GMT
Server: Apache/1.3.29 (Unix) PHP/4.3.11
Content-disposition: inline; filename="LanguageCs_1.5.php"
Content-length: 104226
Connection: close
Content-Type: text/plain; name="LanguageCs_1.5.php"

<?php
/** Czech (česky)
"""

Regardless, it would be good to explicitly serve them as UTF-8.
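
For illustration, here is a minimal sketch of checking a Content-Type value like the one above for a charset parameter. It uses Python's standard email package to parse the header; the helper name charset_of is hypothetical and not part of Bugzilla.

```python
from email.message import Message


def charset_of(content_type):
    """Return the charset parameter of a Content-Type value, or None.

    charset_of is a hypothetical helper name used only for this sketch.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset lowercased, or None if absent.
    return msg.get_content_charset()


# The header from the response above carries no charset parameter.
print(charset_of('text/plain; name="LanguageCs_1.5.php"'))
```

With no charset in the header, the browser has to guess the encoding, which is why the setting has to be switched by hand.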
Comment 2 River Tarnell 2005-04-28 02:37:48 UTC
i'm not clear what you want to do here.

do you want to set charset=UTF-8 for every (text) attachment served?

or, do you want to auto-convert text files to UTF-8 on upload, and then set
charset=UTF-8?

if the latter, this should probably be reported as a Bugzilla enhancement request.
Comment 3 Brion Vibber 2005-04-28 02:43:57 UTC
The uploaded patches are already in UTF-8; they're just not being sent with a charset in the Content-type header.

Bug 609 describes the equivalent issue with bugmail.
Comment 4 River Tarnell 2005-04-28 02:45:01 UTC
yes, but you can't assume all files will be UTF-8, so you either send the wrong
encoding with some files, convert them as needed, or otherwise detect the
encoding to send.
Comment 5 Ævar Arnfjörð Bjarmason 2005-04-28 02:57:44 UTC
(In reply to comment #2 and comment #4)

I want to set charset=utf-8 for every text attachment served.

Practically speaking, the only attachments we get with characters that are not in
ASCII are patches for Language files, and since we'll be going all-UTF-8 in 1.5
these are going to be in UTF-8. There's really no need to make some 100% correct
character set detection system (and AFAIK such a thing isn't even possible);
serving them all as UTF-8 is good enough for our purposes.
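
As a rough illustration of the detection trade-off discussed in comments #4 and #5, the sketch below checks whether a byte string is well-formed UTF-8. This is only a validity check, not encoding detection: bytes that fail it are certainly not UTF-8, but bytes that pass could in principle still be some other encoding. The function name is_valid_utf8 is hypothetical.

```python
def is_valid_utf8(data):
    """Return True if data decodes as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same Czech word encoded two ways: UTF-8 passes, ISO-8859-2 does not,
# because the lone byte 0xE8 is not a complete UTF-8 sequence.
print(is_valid_utf8("česky".encode("utf-8")))
print(is_valid_utf8("česky".encode("iso-8859-2")))
```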
Comment 6 Zigger 2005-07-23 06:41:24 UTC
Resolving as FIXED; this was fixed some time ago.  Current content-type response
header for the example is:

Content-Type: text/plain; name="LanguageCs_1.5.php"; charset=UTF-8
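
The behaviour requested in comment #5 can be sketched as follows. This is an assumption-laden illustration in Python, not the change actually made to the Wikimedia Bugzilla installation; with_utf8_charset is a hypothetical name, and the parameter parsing is deliberately naive.

```python
def with_utf8_charset(content_type):
    """Append charset=UTF-8 to a text/* Content-Type that lacks a charset.

    Hypothetical helper. The split on ";" is naive and would mishandle a
    quoted parameter value that itself contains a semicolon.
    """
    parts = content_type.split(";")
    media_type = parts[0].strip().lower()
    has_charset = any(
        p.strip().lower().startswith("charset=") for p in parts[1:]
    )
    if media_type.startswith("text/") and not has_charset:
        return content_type + "; charset=UTF-8"
    return content_type


print(with_utf8_charset('text/plain; name="LanguageCs_1.5.php"'))
```

Applied to the header from comment #1, this yields the header shown in comment #6; non-text types and headers that already declare a charset are left unchanged.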
