Last modified: 2010-01-04 08:30:30 UTC
I asked the local bureaucrat to perform the following task (three username renames, amounting to a swap):
1. ഉപയോക്താവ്:കമ്പ്യൂട്ടര് -> ഉപയോക്താവ്:കമ്പ്യൂട്ടര് temp
2. ഉപയോക്താവ്:WOPR -> ഉപയോക്താവ്:കമ്പ്യൂട്ടര്
3. ഉപയോക്താവ്:കമ്പ്യൂട്ടര് temp -> ഉപയോക്താവ്:WOPR
Step 1 fails with the error: The username "കമ്പ്യൂട്ടര് temp" is invalid
Related local wiki discussion: http://ml.wikipedia.org/wiki/%E0%B4%89%E0%B4%AA%E0%B4%AF%E0%B5%8B%E0%B4%95%E0%B5%8D%E0%B4%A4%E0%B4%BE%E0%B4%B5%E0%B4%BF%E0%B4%A8%E0%B5%8D%E0%B4%B1%E0%B5%86_%E0%B4%B8%E0%B4%82%E0%B4%B5%E0%B4%BE%E0%B4%A6%E0%B4%82:Vssun#My_bot
Created attachment 4079: chillu (Malayalam). Malayalam language characters don't work well with MediaWiki (the chillu problem).
Look at the attachment above: the last character causes the problem.
I still can't use the following username. Is there any update on this bug? http://ml.wikipedia.org/w/index.php?title=%E0%B4%89%E0%B4%AA%E0%B4%AF%E0%B5%8B%E0%B4%95%E0%B5%8D%E0%B4%A4%E0%B4%BE%E0%B4%B5%E0%B5%8D:%E0%B4%95%E0%B4%AE%E0%B5%8D%E0%B4%AA%E0%B5%8D%E0%B4%AF%E0%B5%82%E0%B4%9F%E0%B5%8D%E0%B4%9F%E0%B4%B0%E0%B5%8D%E2%80%8D&redirect=no
The username in comment #3 has a U+200D Zero-Width Joiner control character at the end. This is in a blacklisted control character range, and is not currently allowed in usernames. It also looks totally incorrect, seeing as it comes at the end of a name, which is not really a valid place for one even if it were allowed. Remove that last character and it should be accepted (confirmed on a local installation).
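To make the trailing character visible, here is a minimal Python sketch that decodes the percent-encoded username from the page title in the URL quoted earlier in the thread and lists any Unicode format (category Cf) characters in it. The helper names `format_chars` and `strip_trailing_zwj` are hypothetical and exist only for illustration; this is not MediaWiki's validation code.

```python
import unicodedata
from urllib.parse import unquote

ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Percent-encoded username copied from the page title in the URL above
# (the part after the namespace prefix and colon).
ENCODED_NAME = (
    "%E0%B4%95%E0%B4%AE%E0%B5%8D%E0%B4%AA%E0%B5%8D%E0%B4%AF%E0%B5%82"
    "%E0%B4%9F%E0%B5%8D%E0%B4%9F%E0%B4%B0%E0%B5%8D%E2%80%8D"
)


def format_chars(name):
    """Return (index, code point) for every invisible format (Cf) character."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(name)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_trailing_zwj(name):
    """Drop any zero-width joiners at the very end of the name."""
    return name.rstrip(ZWJ)


name = unquote(ENCODED_NAME)
print(format_chars(name))                     # positions of invisible characters
print(len(name), len(strip_trailing_zwj(name)))
```

Stripping the joiner yields a name that passes a "no control characters" rule, but it is a different string and may render differently in Malayalam.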
Created attachment 4740: a copy of http://ml.wikipedia.org/wiki/%E0%B4%9A%E0%B4%BF%E0%B4%A4%E0%B5%8D%E0%B4%B0%E0%B4%82:Revision_history_of_computer.jpg This demonstrates why the invisible control character is necessary.
Could you elaborate on why the "U+200D Zero-Width Joiner control character at the end" is blacklisted? Many Malayalam words need it at the end to convey their meaning. I can't see how we can say the MediaWiki software is fully Unicode compliant without this support, so I would very much like to hear about the technical concerns.
Well, the fact that it's invisible and hard to cut-n-paste makes it a bit tricky to manage. :P We've generally forbidden most magic invisible chars from usernames for security (spoofing etc) purposes.
Right, and that's good practice for all usernames that are not in Malayalam. Malayalam is a different story. Perhaps a solution is to let bureaucrats rename users to names containing these 'invisible' characters while still preventing users from creating accounts with such characters. That way vandals won't be able to abuse this and good users would benefit from it. I believe the validity check (whether a username is valid or not) used by new account creation and by username rename is performed by the same block of code.
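The proposal above could be sketched roughly as follows, assuming a single shared validity check that takes a flag saying who is performing the action. The function name `is_valid_username` and the `by_bureaucrat` parameter are hypothetical, and the rule shown (reject every format character, except U+200D during a bureaucrat rename) is an assumption for illustration, not MediaWiki's actual behaviour.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def is_valid_username(name, by_bureaucrat=False):
    """Hypothetical shared check for account creation and rename.

    Invisible format (Cf) characters are always rejected, except that a
    bureaucrat-initiated rename may use the zero-width joiner.
    """
    if not name:
        return False
    for ch in name:
        if unicodedata.category(ch) != "Cf":
            continue  # ordinary visible character, nothing to check here
        if not (by_bureaucrat and ch == ZWJ):
            return False
    return True


# A short Malayalam sequence ending in a joiner, used only as sample input.
sample = "\u0d15\u0d30\u0d4d" + ZWJ
print(is_valid_username(sample))                      # self-service signup
print(is_valid_username(sample, by_bureaucrat=True))  # bureaucrat rename
```

Keeping one function with a flag matches the observation that creation and rename appear to share the same check; other invisible characters, such as directional overrides, stay blocked for everyone in this sketch.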
Please hold off on making any fixes. This seems to be part of a much bigger issue with Malayalam Unicode. I am withdrawing my vote for now.
Fixed in r60599.