Last modified: 2014-11-17 10:35:11 UTC
BUG MIGRATED FROM SOURCEFORGE
Originally submitted by Toby Bartels (tobybartels) 2004-03-16 03:04
On the English Wikipedia in 2001, there was a user
"Ryan_Lackey" whose user name contained an underscore.
You can see edits credited to this user at, for
example, [[Talk:Sealand]]. But these edits are not listed at
[[Special:Contributions/Ryan_Lackey]], which, after all, ''should'' be for a user named "Ryan
Lackey" (who doesn't exist). Similarly,
[[User:Ryan_Lackey]] doesn't think that it's a user
page for an actual user.
This specific case can probably be fixed if a developer
performs a username change (from "Ryan_Lackey" to "Ryan
Lackey") -- assuming that the name changing feature
doesn't break down too! ^_^ But the larger bug probably
applies to other editors from Phase I.
Not a bug...underscores are seen as spaces, and so aren't supported in
usernames. Running the username change *might* do it, but it might break their
contribs, too, depending upon how they're linked. Depends.
It's not a bug in the current MediaWiki software, since that software doesn't
allow underscores in usernames. Instead, it's a bug in the English Wikipedia
website, and possibly other websites that used Phase I. (It may also be a bug
in the Phase I software or in the Phase I -> II conversion script, but I don't
think that they matter anymore.) I've fixed the Product field to indicate this.
It's still an open report/bug:
[[Special:Contributions/Ryan_Lackey]] is still empty.
(In reply to comment #1)
> Not a bug...underscores are seen as spaces, and so aren't supported in
> usernames. Running the username change *might* do it, but it might break their
> contribs, too, depending upon how they're linked. Depends.
Yes, it would. I tried manually switching a user name to have _ instead of a space, and it broke (same issue had come up with one of my users, wanted the underscore).
Too many places where _ is getting stripped or un-stripped possibly.
These still haven't been cleaned up . . . for instance, user_id 87496 is Nicholas_Lativy. [[User:Nicholas Lativy]] is a different user, 90574. These should be identified and dealt with, although they're probably long abandoned.
Removed shell keyword as it seems there's nothing to do on shell presently.
The most famous victim of this bug would have to be Larry Sanger. See this diff (and note the username of "Larry_Sanger"):
However, there are no contributions for Larry Sanger from the talk namespace in April 2001:
This bug affects the Nostalgia Wikipedia in exactly the same way.
I've just created the account under the user name "Ryan Lackey" to keep the userpage from being deleted by automated scripts that think it's a userpage of an unregistered user. This will also stop malcontents from trying to use the account.
So what actually needs to be done here? What needs to be "cleaned up", and where?
What needs to be done? My estimate:
* Identify all of the wikis that ever used Phase I software.
* Identify all of the characters forbidden in current MediaWiki usernames but allowed in Phase I.
* Identify all of the users registered at those wikis whose names contain at least one of those characters.
* Find an appropriate alternative name (which probably needs to be done ad hoc; we know that [[User:Larry_Sanger]] was the same as [[User:Larry Sanger]], and we know that [[User:Ryan Lackey]] is a dummy account, but we don't know what's up with [[User:Nicholas_Lativy]] and it may be too late to ask).
* Move the invalidly named account --possibly by hand-editing the database-- to the validly named account.
This is a lot of work for little reward, so maybe we just need to keep this bug open (or is WONTFIX for this sort of thing?) so that people know about the possibility. And try not to let anything else interact badly with it.
In the revision table, all underlines in the rev_user_text field need to be changed to spaces. Ditto for the ar_user_text field in the archive table. I think those changes will completely solve the problem, but I'm not 100% sure ... I'm not an expert on the database schema. It would be nice if the user IDs in the revision table were changed as well (so the user ID of Larry_Sanger would be the same as the user ID of Larry Sanger).
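As a rough illustration of the change proposed above, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for the real MySQL tables. Only the field names rev_user_text, ar_user_text, rev_user and ar_user come from this report; the id columns, the sample rows and the function names are made up for the demo, and nothing here has been run against a real wiki.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory stand-in for the revision and archive tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_user_text TEXT);
        CREATE TABLE archive  (ar_id  INTEGER PRIMARY KEY, ar_user  INTEGER, ar_user_text  TEXT);
        -- Hypothetical sample rows, not real data.
        INSERT INTO revision (rev_user, rev_user_text) VALUES
            (0, 'Alice_Example'), (0, 'Bob_B_Example'), (7, 'Carol Example');
        INSERT INTO archive (ar_user, ar_user_text) VALUES (0, 'Dave_Example');
    """)
    return conn


def underscores_to_spaces(conn):
    """Rewrite '_' as ' ' in both username text columns; return the rows changed."""
    changed = 0
    for table, column in (("revision", "rev_user_text"), ("archive", "ar_user_text")):
        cur = conn.execute(
            f"UPDATE {table} SET {column} = REPLACE({column}, '_', ' ') "
            f"WHERE instr({column}, '_') > 0"
        )
        changed += cur.rowcount
    conn.commit()
    return changed


if __name__ == "__main__":
    db = make_demo_db()
    print(underscores_to_spaces(db), "rows rewritten")
```

The sketch leaves the user ID columns alone, so it only covers the text-field half of the suggestion.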
I comment-conflicted with Toby there. :-) As he said, other special characters caused problems when used in phase I usernames as well. The only one I can think of is "@", which is replaced with ".", as in this edit: http://en.wikipedia.org/w/index.php?title=Wikipedia:UuU&oldid=291430. That problem would be harder to fix though.
Re-added shell keyword as fixing this bug requires direct manipulation of the database.
I just found a case of this bug where the user had an underline in their name but the account was subsequently taken over by a vandal. I thought I had created accounts for all UseModWiki-era users who didn't have them, but users have occasionally slipped through the cracks. See:
I have also known for a long time about the case of "Simon_J_Kissane", see:
Are we supposed to list all cases we find (as in [[bugzilla:20757]])? Because I just ran across [[User:Alan_D]]: http://en.wikipedia.org/w/index.php?title=Sailor_Moon&oldid=282114 for example does not show up in his contributions.
@ Philip #15
I don't know who decides what we're "supposed" to do, but I think that it would be a good idea, at least until a developer writes in to say that there's no point.
There's no point in listing them as far as I can tell. The devs can find them all automatically if they use the method I outlined in comment 11. I don't see the point of listing all instances at bug 20757 either, but it's better to be safe than sorry.
As an aside, to draw more attention to this bug, I've mentioned it at http://en.wikipedia.org/wiki/Wikipedia:MediaWiki/DeveloperMemo/November2009#Requests_-_fixes
There is no point in listing them one by one here. Anyone with even toolserver access can just query the appropriate tables to find the bad rows. E.g., on enwiki,
mysql> SELECT user_name FROM user WHERE user_name LIKE '%\_%';
+-----------------+
| user_name       |
+-----------------+
| Nicholas_Lativy |
+-----------------+
1 row in set (1 min 13.18 sec)
The same can just as easily be done for the other wikis, and other tables.
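The backslash in that query matters: in a LIKE pattern a bare "_" matches any single character, so it has to be escaped to find literal underscores. MySQL treats the backslash as the LIKE escape character by default. The sketch below shows the same lookup in Python against an in-memory SQLite table, where the escape character must be declared explicitly. The table layout and the helper names are assumptions for illustration only.

```python
import sqlite3


def make_user_table(names):
    """Create an in-memory stand-in for the user table holding the given names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT)")
    conn.executemany("INSERT INTO user (user_name) VALUES (?)", [(n,) for n in names])
    return conn


def usernames_with_underscore(conn):
    """Return user names containing a literal underscore."""
    rows = conn.execute(
        # ESCAPE '\' turns "\_" into a literal underscore instead of a wildcard.
        "SELECT user_name FROM user WHERE user_name LIKE ? ESCAPE '\\' "
        "ORDER BY user_name",
        ("%\\_%",),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = make_user_table(["Nicholas_Lativy", "Nicholas Lativy", "Mav"])
    print(usernames_with_underscore(db))
```

Without the escape, the pattern '%_%' would match every non-empty name, which is why a quick unescaped query can look like it found far more bad rows than really exist.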
Curiously, the import feature seems to convert underlines to spaces in usernames automatically. I just imported some history from Nostalgia Wikipedia to the English Wikipedia, thanks to bug 20280. Larry Sanger's early contribution list, especially before January 2002, is now quite interesting:
Not really relevant to this bug, but importing such edits also causes diff sizes to be generated for them.
Re: Comment 12, the problem is not the "@" being changed to a dot, but the fact that the first letter of the username is a lower-case letter. I've changed the bug name accordingly to take this into account.
Therefore I would consider this bug resolved if someone changed underlines to spaces in the username fields as described in comment 11, then used the same procedure to change initial lower-case letters in usernames to capital letters. The change in the user ID number would be nice, but not strictly necessary, and it would probably be more trouble than it's worth.
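The two name fixes described above can be sketched as a small Python function. The function name is hypothetical, and this is only an illustration of the intended mapping from stored name to current-style name, not code taken from MediaWiki.

```python
def canonical_username(stored):
    """Apply the two proposed fixes: '_' -> ' ', then upper-case the first letter."""
    name = stored.replace("_", " ")
    # Upper-case only the first character. str.capitalize() would also
    # lower-case the rest, turning "Simon J Kissane" into "Simon j kissane".
    return name[:1].upper() + name[1:]


if __name__ == "__main__":
    for stored in ("Larry_Sanger", "mirwin", "Simon_J_Kissane"):
        print(repr(stored), "->", repr(canonical_username(stored)))
```

Slicing with name[:1] rather than indexing with name[0] keeps the function safe on an empty string.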
And it goes without saying that I'd like this bug fixed on all applicable wikis, not just the English Wikipedia. I'm particularly thinking about the Nostalgia Wikipedia here, but other WMF projects might be affected as well.
What WMF projects besides the en.wp were active back on the Phase I software, anyways?
(In reply to comment #23)
> What WMF projects besides the en.wp were active back on the Phase I software,
Plenty of them. Compare http://meta.wikimedia.org/w/index.php?title=Wikipedia_software_upgrade_status&oldid=2478 and http://en.wikipedia.org/w/index.php?title=Wikipedia:Complete_list_of_language_wikis_available&oldid=353094 ... that's only the Wikipedias.
It occurs to me that it might be easier to fix this bug by changing the Special:Contributions and deleted contributions pages to check for table rows with underlines and initial lower-case letters in the usernames.
Another relevant page to the previous comment is: http://meta.wikimedia.org/w/index.php?title=Wikipedia_software_upgrade_status&oldid=48204
This bug also affects some usernames from the Phase II software (which was used in the English Wikipedia from January to July 2002), so I've changed the bug title accordingly. See this edit to "military history":
(In reply to comment #24)
> It occurs to me that it might be easier to fix this bug by changing the
> Special:Contributions and deleted contributions pages to check for table rows
> with underlines and initial lower-case letters in the usernames.
And it now occurs to me that fixing the problem by changing the contributions special pages, rather than changing the entries in the database, wouldn't fix the problem with importing edits in comment 19. See this page in my userspace:
Therefore my idea in comment 25 would be a second-rate solution.
Here's an example of this bug in a non-English Wikipedia: http://it.wikipedia.org/w/index.php?title=Karl_Pearson&oldid=4015
In the revision table of the Nostalgia Wikipedia, one of the usernames listed is "Brad_", so it was apparently possible for usernames to end in underlines in the phase I and II software.
In these cases, these usernames should probably be changed to "Brad old" or something similar. Replacing the underlines with spaces in this case would produce the username "Brad ", and the space at the end would still make the username invalid.
At the moment, I'm creating English Wikipedia accounts for all usernames that existed in the Nostalgia Wikipedia. Therefore, almost all of the usernames affected by this bug in the English Wikipedia will have a dummy account associated with them.
I've found some edits where the username is stored in the database with two consecutive spaces. None of these edits can be found through the user contributions list. I have changed the bug summary accordingly. In this diff, the extra space is not apparent when looking at the page in a browser, but it is obvious when checking the HTML source code:
Here is an example from Meta of a username with a lower-case letter from the Phase II software:
I've also changed the bug summary to be more informative.
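To keep the cases collected so far apart (underscores, lower-case initial letters, trailing underscores, consecutive spaces), here is a small Python sketch that labels what is wrong with a stored username. The function name and the label strings are invented for this illustration, and the checks cover only the cases reported in this thread, not the full set of username rules in current MediaWiki.

```python
def username_problems(stored):
    """Return the reasons a stored username would not match a current account name."""
    problems = []
    converted = stored.replace("_", " ")
    if "_" in stored:
        problems.append("underscore")
    if stored[:1].islower():
        problems.append("lower-case initial")
    if "  " in converted:
        problems.append("consecutive spaces")
    # "Brad_" becomes "Brad " after conversion, which is still invalid.
    if converted != converted.strip():
        problems.append("leading or trailing space after conversion")
    return problems


if __name__ == "__main__":
    for name in ("Larry_Sanger", "mirwin", "Brad_", "Some  User", "Nicholas Lativy"):
        print(repr(name), username_problems(name))
```

A name that returns an empty list needs no change. A name flagged for a leading or trailing space cannot be fixed by substitution alone and needs a hand-picked replacement such as the "Brad old" suggestion.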
Another example from Meta: http://meta.wikimedia.org/w/index.php?title=Talk:Logo_suggestions&action=history ([[:m:User:sodium]]).
Another example from mirwin on Meta: http://meta.wikimedia.org/w/index.php?title=Draft_mission_statements_for_various_types_of_organizations&action=history
The strange thing is that the API finds his contributions: http://meta.wikimedia.org/w/api.php?action=query&list=users&ususers=mirwin&usprop=editcount|registration gives an editcount of 79, and http://toolserver.org/~vvv/yaec.php?user=mirwin&wiki=metawiki_p also finds 35 deleted contributions.
Hmmm, this is probably due to the fact that the rev_user field is non-zero for each of the edits listed in those two links, and in fact is linked to the user ID of the user who made the edit; this never happens in the English Wikipedia, so these methods cannot be used there. The rev_user field shows the user ID of the editor who made a particular edit; the equivalent field in the archive table is ar_user. The user ID for an edit is always 0 for anonymous editors, mass-imports and scripts; it isn't usually zero for normal registered users. If the user ID given for an edit made by a registered user is 0, then the "contribs" link won't show up for the user in the page history. This example comes from a mistaken import, but it is illustrative:
No contribs are found for Ryan_Lackey (see top of bug report) in the API of the English Wikipedia, because none of his edits have an associated non-zero user ID:
On the examples from Meta: note that in the history (and also in Special:Undelete) the links to the user page and user talk page are red even if the pages actually exist (I'm also adding a screenshot for future reference).
Created attachment 7634
Red link to existing lower case user and user talk pages
See bug 323 comment 35.
I've changed the summary once again, so it shows the correct fields!
Thanks to [[it:User:Mauro742]] you can now find the complete list of all 4336 en.wiki affected revisions at [[User:Nemo_bis/Bug 323 revisions]].
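How that list was produced is not stated here. As a rough sketch of the rev_user = 0 check discussed earlier in this report, the Python below finds revisions stored with user ID 0 whose cleaned-up name matches a registered account, and reports the user ID they could be reattached to. It runs on an in-memory SQLite stand-in; the sample rows, the id columns and the function names are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-ins for the user and revision tables (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT);
        CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_user INTEGER, rev_user_text TEXT);
        INSERT INTO user (user_id, user_name) VALUES (1, 'Alice Example'), (2, 'Bob');
        INSERT INTO revision (rev_id, rev_user, rev_user_text) VALUES
            (10, 0, 'Alice_Example'),  -- underscore, no user ID
            (11, 0, 'bob'),            -- lower-case initial, no user ID
            (12, 0, '127.0.0.1'),      -- anonymous edit, stays unmatched
            (13, 1, 'Alice Example');  -- already attributed correctly
    """)
    return conn


def canonical(stored):
    """'_' -> ' ', then upper-case the first letter only."""
    name = stored.replace("_", " ")
    return name[:1].upper() + name[1:]


def orphaned_revisions(conn):
    """Return (rev_id, stored_name, matching_user_id) for unattributed edits."""
    accounts = {name: uid for uid, name in conn.execute("SELECT user_id, user_name FROM user")}
    found = []
    query = "SELECT rev_id, rev_user_text FROM revision WHERE rev_user = 0 ORDER BY rev_id"
    for rev_id, stored in conn.execute(query):
        uid = accounts.get(canonical(stored))
        if uid is not None:
            found.append((rev_id, stored, uid))
    return found


if __name__ == "__main__":
    print(orphaned_revisions(make_demo_db()))
```

Matching on the cleaned-up name is only a heuristic: as the Nicholas_Lativy case shows, the underscore and space variants can belong to different people, so any match would still need a manual check.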
Some edits of renamed users are affected, too, and have not been moved to the new username: compare http://meta.wikimedia.org/w/index.php?title=Special:Undelete&target=Native+American+Affairs&timestamp=20020127003202 by [[m:user:maveric149]] (lowercase: see also [[m:Special:Contributions/maveric149]] which for some reason is not empty) and http://meta.wikimedia.org/w/index.php?title=Wikimedia_bank_account_history_for_2004&action=history which was created after the user was renamed to Daniel_Mayer (http://meta.wikimedia.org/w/index.php?title=User:Maveric149&diff=1352761&oldid=157121) and is now under the correct username Mav (I've just restored this page).
See also bug 3507, dealing with the usernames themselves instead of edits attributed to those users.
Deblocking from 29757: these have nothing to do with user renames; they are caused by user accounts predating phase3 (aka MediaWiki as we know it today).