Last modified: 2011-11-30 16:42:47 UTC
Normally UTF-8 in URLs looks good with printable=yes (e.g., http://radioscanningtw.jidanni.org/index.php?title=%E5%8F%B0%E6%8E%83:General_disclaimer&printable=yes ), but not when {{fullurl}} is involved (e.g., http://radioscanningtw.jidanni.org/index.php?title=%E5%8F%B0%E5%8D%97%E7%B8%A3%E6%B6%88%E9%98%B2%E5%B1%80&printable=yes ). There the characters are printed as % escapes instead of UTF-8. (I use {{fullurl}} to discourage editing categories, as I discussed elsewhere.)
Please describe "with printable=yes" and "when {{fullurl}} is involved".
The issue appears to be that a link like http://www.foo.example/免 has "免" printed in the source href attribute as a Chinese character, but a link like {{fullurl:免}} has 免 mangled to the escaped form, "%E5%85%8D". (The printable display aspect is just a symptom.) I've confirmed this is true in trunk.
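For anyone following along, the relationship between the two forms can be reproduced outside MediaWiki. This is only an illustrative sketch using Python's standard library, not the code MediaWiki itself runs; the variable names are made up for the example.

```python
from urllib.parse import quote, unquote

# The page title as a user would type it.
title = "免"

# Percent-encode the UTF-8 bytes of the title (E5 85 8D).
escaped = quote(title)
print(escaped)

# Decoding the escaped form gives back the original character.
restored = unquote(escaped)
print(restored == title)
```

Both strings name the same resource; the only difference is which one ends up in the href attribute.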
Er, there's a reason we escape these things. The reason we *don't* do it in the printable form of pages is because it's usually safe enough for the user to type the proper character as a URL directly. It's also a damn sight prettier. In all likelihood, the reason it doesn't happen with {{fullurl}} et al. is because those operations are run before whatever code it is that un-escapes certain URL components.
This is compatible URL/URI encoding of a UTF-8 IRI. Some day when everyone's using fully IRI-compatible browsers, we may make all URLs display in pretty UTF-8 (but keep in mind that can make many URLs impossible to type).
I understand this is just about the readability of the generated output, so typing URLs isn't what this is about at all. What is sometimes hard to communicate to people whose language uses only 7-bit ASCII characters is that people whose language uses an extended character set are perfectly capable of entering 免, Wikipédia, or Füße, and that 免, Wikipédia, or Füße is far more readable than %E5%85%8D, Wikip%C3%A9dia, or F%C3%BC%C3%9Fe. Since these sinister characters work fine in normal links, there is no technical reason why fullurl would need to return them escaped. And for browser usage, IRI or not, there is fullurle with properly escaped characters, if I remember the docs correctly.
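To make the readability point concrete, here is a rough sketch of a display-only decoder. It is an assumption about how one might do this, not what MediaWiki does: `readable_url` is a hypothetical helper that decodes only runs of non-ASCII percent escapes and leaves ASCII escapes such as %26 alone, so the structure of the URL stays unambiguous.

```python
import re
from urllib.parse import unquote

# Runs of escapes whose byte value is 0x80 or above, i.e. non-ASCII UTF-8 bytes.
_NON_ASCII_RUN = re.compile(r"(?:%[89A-Fa-f][0-9A-Fa-f])+")

def readable_url(url: str) -> str:
    """Hypothetical helper: show non-ASCII escapes as UTF-8 text for display."""
    def decode(match: re.Match) -> str:
        try:
            return unquote(match.group(0), encoding="utf-8", errors="strict")
        except UnicodeDecodeError:
            # Not valid UTF-8, so leave the escapes as they were.
            return match.group(0)
    return _NON_ASCII_RUN.sub(decode, url)

for sample in ("%E5%85%8D", "Wikip%C3%A9dia", "F%C3%BC%C3%9Fe", "a%26b"):
    print(sample, "->", readable_url(sample))
```

The output of such a helper is meant for human eyes only; the escaped form is still what should go over the wire.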
I just tested this, and it seems to me that this issue has been FIXED.