Last modified: 2013-09-04 05:00:49 UTC
Until now, Special:ListUsers sorts users in this way: AAA, AAB, ..., ABC, ..., AZZZZZZZZZZZ, AAa, AAb, ..., AZZZZZZZZZZZZz, ..., AZzzzzzzzz, ..., Aaa, ... In short: the aaa usernames, which differ only in upper-/lowercase letters, appear in 2^3=8 different places in the log. It should be AAA, AAa, AaA, Aaa, ... This case-dependent ranking makes it nearly impossible for oversighters to search efficiently for libelous usernames: the vandal only has to create the same name over and over again with different combinations of upper- and lowercase letters, and the bureaucrat/oversighter has to search in dozens of places in the log.
This looks like a duplicate of the collation bug 164, because it changes the sort order of the items. The current behaviour is the same as for categories etc.
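To make the difference concrete, here is a minimal sketch in Python (not MediaWiki code) comparing plain code-point ordering with a case-insensitive ordering. The usernames are hypothetical and chosen only for illustration.

```python
# Hypothetical usernames, chosen only to illustrate the two orderings.
names = ["Aaa", "AAB", "AaA", "ABC", "AAa", "AAA"]

# Code-point (binary) order: every uppercase letter sorts before every
# lowercase letter, so case variants of one name get scattered.
binary_order = sorted(names)

# Case-insensitive order: compare the case-folded name first and use the
# original spelling only as a tie-breaker, so case variants stay adjacent.
folded_order = sorted(names, key=lambda n: (n.casefold(), n))

print(binary_order)  # ['AAA', 'AAB', 'AAa', 'ABC', 'AaA', 'Aaa']
print(folded_order)  # ['AAA', 'AAa', 'AaA', 'Aaa', 'AAB', 'ABC']
```

In the second list all four variants of the same name sit next to each other, which is the order requested here.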
(In reply to comment #1)
> This looks like a duplicate of the collation bug 164, because it changes the
> sort order of the items. The current behaviour is the same as for categories
> etc.
I would call them separate bugs, since the solution currently being worked on for category pages will not fix this bug (AFAIK).
They are indeed separate bugs, but fixing this one would require disproportionate resources for marginal benefit. Suggest WONTFIX.
I doubt that the benefit is marginal: it would make it a lot easier for oversighters/stewards to look for libelous/harassing usernames and get rid of them. Could you please elaborate on why fixing this requires 'disproportionate resources'? I do not know the actual implementation in MediaWiki, but every programming language I have recently worked in has sort algorithms that do not distinguish between 'A' and 'a', so I guess there should be one for PHP. Anyway: people don't expect this sorting order. They are used to other orders from their phone book or printed encyclopedias.
(In reply to comment #4)
> I do not know the actual implementation in MediaWiki, but every programming
> language I have recently worked in has sort algorithms that do not distinguish
> between 'A' and 'a', so I guess there should be one for PHP.
Yes, you can sort case-insensitively in PHP just fine, but you can't (efficiently) do so in MySQL. As with the category thing, we'd have to add a new column to hold the 'normalized' username and sort by that instead. For English, 'normalized' can just be all lowercase, but for other languages you'll want to sort accented characters in all sorts of interesting ways.
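A rough sketch of the extra-column idea, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name `users`, the column `user_sortkey`, and the helper `sort_key` are hypothetical and not taken from the MediaWiki schema; lowercasing is the English-only simplification mentioned above.

```python
import sqlite3


def sort_key(name: str) -> str:
    # Assumption: English-only normalization, i.e. plain lowercasing.
    # Other languages would need a real collation algorithm here.
    return name.lower()


conn = sqlite3.connect(":memory:")
# Hypothetical schema: the normalized key is stored next to the real name.
conn.execute(
    "CREATE TABLE users (user_name TEXT PRIMARY KEY, user_sortkey TEXT NOT NULL)"
)
# Indexing the key lets the database return rows in key order
# without computing anything per row at query time.
conn.execute("CREATE INDEX users_sortkey ON users (user_sortkey, user_name)")

names = ["Aaa", "AAB", "AaA", "ABC", "AAa", "AAA"]
conn.executemany(
    "INSERT INTO users VALUES (?, ?)", [(n, sort_key(n)) for n in names]
)

rows = conn.execute(
    "SELECT user_name FROM users ORDER BY user_sortkey, user_name"
).fetchall()
listing = [row[0] for row in rows]
print(listing)  # ['AAA', 'AAa', 'AaA', 'Aaa', 'AAB', 'ABC']
```

The key is computed once when the row is written, so the cost of normalization is paid on insert rather than on every listing query.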
OK, I didn't know that the sorting was done in the database. Thanks for the update. So a regexp search on the usernames could probably be helpful. Do you know if something like this exists (maybe even as an external tool)?
(In reply to comment #6)
> OK, I didn't know that the sorting was done in the database. Thanks for the
> update.
> So a regexp search on the usernames could probably be helpful. Do you know if
> something like this exists (maybe even as an external tool)?
Regexp searches can't be done 'internally', as you might have guessed. No such external tool exists AFAIK, but it shouldn't be too hard to write a toolserver tool that either executes wildcard queries on the database (SQL does allow those, they're just kinda slow in most cases) or runs regexes on a dump of all user names.
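The dump-based variant could look roughly like the following Python sketch. The function name `find_case_variants` and the sample usernames are made up for illustration; a real tool would read the names from an actual dump file.

```python
import re


def find_case_variants(usernames, pattern):
    """Return all usernames matching the regex, ignoring letter case."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name in usernames if rx.search(name)]


# Hypothetical stand-in for a dump of all user names.
dump = [
    "Example vandal",
    "EXAMPLE VANDAL",
    "ExAmPlE VaNdAl 2",
    "Unrelated user",
]

hits = find_case_variants(dump, r"^example\s+vandal")
print(hits)  # ['Example vandal', 'EXAMPLE VANDAL', 'ExAmPlE VaNdAl 2']
```

One pattern catches every upper-/lowercase combination at once, regardless of where the names end up in the sorted list.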
(In reply to comment #2)
> (In reply to comment #1)
> > This looks like a duplicate of the collation bug 164, because it changes the
> > sort order of the items. The current behaviour is the same as for categories
> > etc.
>
> I would call them separate bugs, since the solution currently being worked on
> for category pages will not fix this bug (AFAIK).
Did it? I also wonder how this interacts with bug 26396.