Last modified: 2014-09-02 18:03:50 UTC
The handling of usernames should be case-insensitive in core MediaWiki. This means, possibly among other caveats:

* 'QWERTY' and 'qwerty' and 'Qwerty' should be considered the same username for the purposes of account creation and logging in
* Titles in the User: namespace should be case-insensitive like in the Special: namespace (related to bug 453)
* Usernames starting with a lowercase first letter should be permitted (bug 1574 / bug 26396), and preferably users should be able to change the casing on already existing accounts themselves
Quoting bug 61416, which started this discussion:

(Jon from bug 61416 comment #19)
> FWIW I think all usernames should be case insensitive and I think it's a
> terrible design oversight that we support case sensitive usernames. The
> username Jdlrobson should be the same as jdlrobson and should be the same as
> JdlRobson in a username lookup. I think this is a huge bug (I'm not sure if
> it has previously been raised).

(MZMcBride from bug 61416 comment #22)
> Jon: you're raising perfectly valid points, but they're outside the scope of
> this bug report. If you're interested in a user.user_display_name field or
> in case insensitivity of usernames, please file separate bug reports, if
> such bug reports don't exist already.
However (for anyone unfamiliar), there are already clashing accounts which are exactly the same other than case.

So for existing wikis, this requires a way to resolve those conflicts (and different wikis may want different rules for this). That means it probably has to be a config option, potentially defaulting to case-insensitive for new wikis.

For the WMF specifically, *if* we want to transition to this, we should probably do so at the same time as single login finalization (https://meta.wikimedia.org/wiki/Single_User_Login_finalisation_announcement). That way there's only one huge "lots of usernames changed" event. However, bear in mind this would increase the complexity of the SUL finalization process.

Also, case-insensitive does not mean the same thing in every language, and it's not the same as "just lower case them both then compare them".
(In reply to Matthew Flaschen from comment #2)
> However (for anyone unfamiliar), there are already clashing accounts which
> are exactly the same other than case.
>
> So for existing wikis, this requires a way to resolve those conflicts (and
> different wikis may want different rules for this). That means it probably
> has to be a config option, potentially defaulting to case-insensitive for
> new wikis.

This can certainly be the case, and we will have to provide such an upgrade path for wiki owners (and yes, probably keep the case-insensitivity off by default, or only enable it if we can somehow confirm there are no conflicts). Still, I'd risk a guess that this won't be a major issue – especially on WMF sites, as these have the AntiSpoof extension enabled, which already blocks the creation of such conflicting usernames.

> Also, case-insensitive does not mean the same thing
> in every language, and it's not the same as "just lower case them both then
> compare them".

Well, I think it actually is, but it's the lowercasing part that's difficult :) Luckily, we have already solved this issue in our Language class, so regular wikis won't have any issues (just use the wiki content language for comparison). We'll have to come up with something cleverer for multi-language farms with shared users (like WMF wikis), though.
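To illustrate the "confirm there are no conflicts" check mentioned above, here is a minimal Python sketch (not MediaWiki code). The function name `find_case_conflicts` and the sample account list are hypothetical, and Python's `str.casefold()` is only a stand-in for whatever folding a wiki would actually use.

```python
from collections import defaultdict


def find_case_conflicts(usernames):
    """Group usernames that differ only by case.

    Returns a dict mapping each folded key to the list of original
    names, keeping only keys shared by more than one account.
    """
    groups = defaultdict(list)
    for name in usernames:
        # Assumption: Unicode default case folding is the comparison key.
        groups[name.casefold()].append(name)
    return {key: names for key, names in groups.items() if len(names) > 1}


# Hypothetical account list, for illustration only.
accounts = ["QWERTY", "qwerty", "Qwerty", "Example", "Jdlrobson"]
conflicts = find_case_conflicts(accounts)
print(conflicts)  # {'qwerty': ['QWERTY', 'qwerty', 'Qwerty']}
```

An empty result would suggest that case-insensitivity could be enabled without renaming anyone; a non-empty one lists the accounts that need a per-wiki resolution rule.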
(In reply to Bartosz Dziewoński from comment #3)
> > Also, case-insensitive does not mean the same thing
> > in every language, and it's not the same as "just lower case them both then
> > compare them".
>
> Well, I think it actually is, but it's the lowercasing part that's difficult
> :)

No, there are cases where lowercasing them both and then comparing is probably not the desired behavior. For example:

mb_strtoupper( 'ς' ) === mb_strtoupper( 'Σ' ) === 'Σ'

Thus, most people would probably consider 'ς' and 'Σ' equal ignoring case. However:

mb_strtolower( 'ς' ) -> ς
mb_strtolower( 'Σ' ) -> σ

so a simple lowercase (even a multi-byte one) followed by a binary compare will not work.
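The same sigma example can be reproduced outside PHP. The following is a small Python sketch, with `str.lower()` and `str.upper()` standing in for `mb_strtolower` and `mb_strtoupper`; the helper names `equal_by_lower` and `equal_by_upper` are made up for illustration.

```python
FINAL_SIGMA = "\u03c2"    # ς, GREEK SMALL LETTER FINAL SIGMA
CAPITAL_SIGMA = "\u03a3"  # Σ, GREEK CAPITAL LETTER SIGMA


def equal_by_lower(a, b):
    """Compare two strings after lowercasing both."""
    return a.lower() == b.lower()


def equal_by_upper(a, b):
    """Compare two strings after uppercasing both."""
    return a.upper() == b.upper()


# Lowercasing leaves ς unchanged but maps a lone Σ to σ, so they differ.
print(equal_by_lower(FINAL_SIGMA, CAPITAL_SIGMA))  # False
# Uppercasing maps both to Σ, so they compare equal.
print(equal_by_upper(FINAL_SIGMA, CAPITAL_SIGMA))  # True
```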
Yes, the right way is to 'casefold' them and then compare. In core, we have a Language#caseFold method, used in a few places, which seems to just uppercase the string – I'm going to assume that this is the proper way to do this, and if it's not for some reason, please file a bug or submit a patch. :)
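As a rough sketch of "casefold, then compare", here is what it could look like in Python. This is not the Language#caseFold implementation: it uses Python's built-in `str.casefold()` (Unicode default case folding), and the function name `usernames_match` is hypothetical.

```python
def usernames_match(a: str, b: str) -> bool:
    """Return True if two usernames are equal ignoring case.

    Assumption: Unicode default case folding is an acceptable notion
    of "ignoring case"; it is not tailored to any particular language.
    """
    return a.casefold() == b.casefold()


print(usernames_match("Jdlrobson", "jdlrobson"))  # True
print(usernames_match("Jdlrobson", "JdlRobson"))  # True
print(usernames_match("\u03c2", "\u03a3"))        # True (ς vs Σ)
print(usernames_match("Qwerty", "Qwertz"))        # False
```

Because this folding is language-independent, it sidesteps the question of which content language to use on a multi-language farm, at the cost of ignoring language-specific rules.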
Making this non-optional could break a lot of third-party use cases. Suppose a wiki has SSO with a forum, blog or similar that allows EXAMPLE and Example to exist as separate users. At best, this massively complicates the SSO code, which needs handling for clashes and for determining which user to allow, and it stops some users from signing into the wiki. At worst, it could let users block each other from the wiki by creating duplicate accounts and registering first, where the SSO code has a first-come, first-served policy. It could even open the door to taking over admin credentials by duplicating an account, if the SSO is custom and written by someone who does not know how to create the complex checks on usernames. This is similar to the issue with lowercase-first usernames in general, but considerably more likely to come up, and ideally that would get fixed rather than added to.