Last modified: 2014-02-16 05:55:46 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T6312, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 4312 - Username of all whitespaces in German Wikipedia dump file
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Hardware: All All
Importance: Low normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: 16660
Reported: 2005-12-19 05:15 UTC by Tyler Riddle
Modified: 2014-02-16 05:55 UTC
CC List: 6 users

Description Tyler Riddle 2005-12-19 05:15:30 UTC
A username consisting of all spaces made its way into the German Wikipedia dump file. The article it 
happened on is at

Since the username field is not marked as space-preserving, Parse::MediaWikiDump completely ignored 
its contents in this case. I have a feeling that a username consisting entirely of spaces is not supposed to be allowed to exist.
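
A minimal Python sketch of how a dump consumer could flag such usernames rather than silently dropping them. The fragment below is a hypothetical, heavily simplified stand-in for a dump (real dump files carry an XML namespace and many more fields), and `blank_usernames` is an illustrative helper, not part of Parse::MediaWikiDump:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified dump fragment; the first username is three
# plain spaces, the second is an ordinary name.
FRAGMENT = """
<page>
  <revision>
    <contributor><username>   </username><id>1</id></contributor>
  </revision>
  <revision>
    <contributor><username>Example</username><id>2</id></contributor>
  </revision>
</page>
"""


def blank_usernames(xml_text):
    """Return the raw text of every <username> that is empty or all whitespace."""
    root = ET.fromstring(xml_text)
    found = []
    for node in root.iter("username"):
        text = node.text or ""
        # str.strip() removes all Unicode whitespace, so a name made only
        # of whitespace characters strips down to the empty string.
        if text.strip() == "":
            found.append(text)
    return found


suspicious = blank_usernames(FRAGMENT)
print(len(suspicious), "whitespace-only username(s) found")
```

Returning the raw text instead of a stripped value keeps the original characters available for inspection, which matters when the "spaces" turn out not to be ordinary spaces.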

Comment 1 lɛʁi לערי ריינהארט 2005-12-19 06:36:11 UTC

If you go
and click on the "space" link
you will come to
there to
no email specified or emails from other users disabled

The problem has been known since August; see

The user name contains
Unicode Character 'NO-BREAK SPACE' (U+00A0)
HTML Entity (decimal) &#160; (hex) &#xA0; (named) &nbsp;
UTF-8 (hex) 0xC2 0xA0 (c2a0) %c2%a0 %C2%A0
is known already from
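
The encodings listed above can be cross-checked with a short Python sketch. It uses only the standard library; the variable names are illustrative:

```python
import html
import unicodedata
from urllib.parse import quote

NBSP = "\u00a0"

# Official Unicode character name.
name = unicodedata.name(NBSP)

# UTF-8 byte sequence as hex, and its percent-encoded form for URLs.
utf8_hex = NBSP.encode("utf-8").hex()
percent_encoded = quote(NBSP)

# The decimal, hexadecimal and named HTML entities all decode to U+00A0.
decoded = {html.unescape(entity) for entity in ("&#160;", "&#xA0;", "&nbsp;")}

# Python counts the character as whitespace even though it is not U+0020.
looks_like_space = NBSP.isspace() and NBSP != " "

print(name, utf8_hex, percent_encoded, looks_like_space)
```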

Changing the name would be an administrative task either at WP:DE or better at
all projects. I do not know the policy about this. Please clarify this at the
local wiki, via a mailing list as [Wikide-l], [Wikitech-l] etc. or via IRC at
irc:// .

Marking this bug as a duplicate of
bug 1524: usernames should use unicode whitelist is mentioned at
bug 2173 comment 3
bug 2173: Fatal error when removing an article with an whitespace title from the

best regards reinhardt [[user:gangleri]]

*** This bug has been marked as a duplicate of 1524 ***
Comment 2 Ævar Arnfjörð Bjarmason 2005-12-19 06:53:04 UTC
This isn't a duplicate of bug 1524, that deals with having a whitelist for
registered usernames, but this particular username also happens to break the XML
Comment 3 lɛʁi לערי ריינהארט 2005-12-19 06:58:23 UTC
Thanks Ævar! I did not read the second paragraph with the attention that would
be required. Please look what happens at

Please change the summary in order to reflect the new / major problem Thanks in
Comment 4 Nemo 2012-08-23 19:08:41 UTC
I don't understand, does this really break dumps?
Comment 5 Andre Klapper 2012-11-10 15:53:53 UTC
Also wondering. How exactly can one reproduce that it "breaks dumps"?
Comment 6 Tyler Riddle 2012-11-11 15:54:45 UTC
If the XML schema indicates that data is not white space preserving, then white space is not significant and there is no difference between " ", "  ", "   ", "\t\n\n\n\t\t\t\t\t\t\t\t\t \n\n\n" etc.

If a user name exists where white space is significant, it becomes impossible to transmit it using a non-space-preserving data type. Thus it's not actually possible to get the user names correctly, and this is rather broken.
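
The point above can be illustrated with a small Python sketch. It assumes the field is read with XML Schema `whiteSpace="collapse"` semantics; which normalization the dump schema actually implies is not stated here, and `xsd_collapse` is a hypothetical helper written for this illustration:

```python
import re


def xsd_collapse(value):
    """Normalize a string the way XML Schema whiteSpace="collapse" does.

    Tabs, line feeds and carriage returns become spaces, runs of spaces
    shrink to a single space, and leading/trailing spaces are removed.
    """
    replaced = re.sub(r"[\t\n\r]", " ", value)
    return re.sub(r" +", " ", replaced).strip(" ")


# Distinct whitespace-only user names, as in the examples above.
samples = [" ", "  ", "   ", "\t\n\n\n\t\t\t \n\n\n"]
collapsed = {xsd_collapse(s) for s in samples}

# Every sample normalizes to the same value, so a consumer cannot tell
# the original names apart.
print(collapsed)
```

Under this assumption a reader of the dump receives the same empty value for all of these names, so the original name cannot be recovered from the file.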
