Last modified: 2010-05-15 14:36:24 UTC
If you go to the Hymn_of_the_Russian_Federation page and look under "Other languages" for Russian (it looks like Pycckuu), then hover over the link (or click it, for that matter), you'll see two "broken" characters in place of each Russian one, presumably because the Unicode is not being translated properly. Now edit the page, go to the [[ru]] link, and remove or change the #1056 character (the first one after the space): the link then renders properly, and clicking it takes you to the right place in the Russian wiki (except, of course, that the title is wrong, since you've changed one character). The character in question is the Russian uppercase R, which looks like an English P. It is used many times in the text of the article itself and is rendered properly there. I am using Mozilla 1.6 on Linux, but I get the same effect in Safari 1.2.3 on Mac OS X.
I don't see anything out of the ordinary in Safari 1.2.3. Could you provide some screenshots of the bug in action, and reference the exact revisions of the page that do and don't work?
I can confirm that I see this bug exactly as described. Something weird is happening there. The original inter-wiki link was: [[ru:Гимн России]] It produced this link: http://ru.wikipedia.org/wiki/%D0%93%D0%B8%D0%BC%D0%BD_%D0_%D0%BE%D1%81%D1%81%D0%B8%D0%B8 Notice that, in the middle, there is "_%D0_". This should instead be "_%D0%A0", because the Cyrillic capital letter Er is %D0%A0 in UTF-8. So the bug is caused by the "%A0" byte, which on its own is a non-breaking space in Latin-1 (but not in UTF-8), being replaced by a plain space (and hence an underscore). Browsers (in my case, Firefox) then no longer recognise the link as UTF-8 and interpret it as Latin-1, so it comes out jumbled. Similarly, when you actually follow the link, the wiki software notices that the link is not valid UTF-8, assumes it is Latin-1, converts it to UTF-8, and forwards the user to a page that obviously doesn't exist.
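The byte-level effect described above can be reproduced with a small Python sketch. It is only an illustration under an assumption: the `replace(b"\xa0", b" ")` step stands in for whatever the wiki software actually did, and the helper name `to_url_path` is made up for this example.

```python
from urllib.parse import quote

title = "Гимн России"
utf8 = title.encode("utf-8")

# Assumed faulty step: the raw byte 0xA0 is treated as a Latin-1
# non-breaking space and replaced by a plain space. In UTF-8, 0xA0 is
# also the second byte of the Cyrillic capital Er (0xD0 0xA0).
broken = utf8.replace(b"\xa0", b" ")


def to_url_path(raw: bytes) -> str:
    """Hypothetical helper: spaces become underscores, other bytes are percent-encoded."""
    return quote(raw.replace(b" ", b"_"))


good_path = to_url_path(utf8)
bad_path = to_url_path(broken)

print(good_path)
print(bad_path)

# The damaged byte sequence is no longer valid UTF-8 ...
try:
    broken.decode("utf-8")
    still_valid_utf8 = True
except UnicodeDecodeError:
    still_valid_utf8 = False

# ... but any byte sequence decodes as Latin-1, which yields the
# two "broken" characters per Cyrillic letter.
as_latin1 = broken.decode("latin-1")
```

Here `bad_path` contains the same "_%D0_" fragment as the link quoted above, while `good_path` contains "_%D0%A0".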
Brion, you might find my above comment interesting, so I am adding you to the CC list. Please let me know if you do not want me to do this.
Replacing the inter-wiki link with [[ru:%D0%93%D0%B8%D0%BC%D0%BD_%D0%A0%D0%BE%D1%81%D1%81%D0%B8%D0%B8]] didn't help; the same bug occurs.
A recent patch converts NBSPs to _. A vandal had created accounts whose nicknames had trailing NBSPs, and articles with NBSPs in their names, which were hard to block/delete. This patch has probably broken these links.
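For comparison, here is a minimal sketch of an NBSP normalization that works on decoded characters rather than raw bytes. This is not the actual patch; the function name `normalize_title` and its trimming behaviour are assumptions made for illustration. Matching the character U+00A0 instead of the byte 0xA0 leaves multi-byte UTF-8 letters untouched.

```python
NBSP = "\u00a0"  # the no-break space as a character, not as the byte 0xA0


def normalize_title(title: str) -> str:
    """Hypothetical helper: turn NBSPs and spaces into underscores and
    trim them from both ends, operating on characters, not bytes."""
    return title.replace(NBSP, "_").replace(" ", "_").strip("_")


# A nickname with a trailing NBSP collapses to the plain nickname.
vandal_nick = normalize_title("Admin" + NBSP)

# The Cyrillic capital Er survives, because U+0420 is not U+00A0,
# even though its UTF-8 encoding contains the byte 0xA0.
russian = normalize_title("Гимн России")
russian_bytes = russian.encode("utf-8")
```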
Tim's hacked it to avoid the non-breaking space check for interwiki links so it may be working now; please check. I'm not sure this is fully in place as it's changed only in REL1_3, so I'm not marking as FIXED yet. Timwi, don't bother CC'ing me, as I get and read *all* bugmail via wikibugs-l. :)
Has this been fixed in CVS or is it still there?
(In reply to comment #7)
> Has this been fixed in CVS or is it still there?

It appears to be fixed. The example given in comment #2 gives the correct output, as I understand it; see http://test.wikipedia.org/wiki/Bug168