Last modified: 2012-11-06 08:53:54 UTC

Bug 12896 - Spam Blacklist shouldn't be fooled by similar-looking Unicode characters
Status: REOPENED
Product: MediaWiki extensions
Classification: Unclassified
Component: Spam Blacklist
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/w/index.php?t...
Depends on:
Blocks: SWMT
Reported: 2008-02-03 18:36 UTC by Max Semenik
Modified: 2012-11-06 08:53 UTC
CC List: 5 users


Description Max Semenik 2008-02-03 18:36:29 UTC
See the URL above. By inserting a U+0EFF (.) instead of a normal dot, the user managed to link to the blacklisted site traditio.ru. Even after I attempted to fix this by explicitly adding this character to the blacklist [http://meta.wikimedia.org/w/index.php?diff=863226], it does not seem to work [http://meta.wikimedia.org/w/index.php?diff=863233].
Comment 1 Victor Vasiliev 2008-02-03 19:00:48 UTC
Fixed in r30482
Comment 2 Aryeh Gregor (not reading bugmail, please e-mail directly) 2008-02-03 22:03:59 UTC
The fix seems a little narrow.  What's the underlying reason that the exploit worked?  U+0EFF can't be the only character that browsers will treat as a period in URLs.
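
For context on the point above: RFC 3490 (IDNA), section 3.1, names three characters besides U+002E that must be recognized as dots in host names: U+3002, U+FF0E and U+FF61. The Python sketch below folds those three to an ASCII dot before matching against a blacklist pattern. It is an illustration only, not the code from r30482; the function names and the one-entry blacklist are hypothetical, and it assumes the character at issue is one of these separators (such as the fullwidth full stop U+FF0E) rather than the U+0EFF code point quoted in the report.

```python
import re

# Characters that RFC 3490 (IDNA), section 3.1, says must be recognized
# as dots, in addition to U+002E itself.
IDNA_DOT_EQUIVALENTS = "\u3002\uff0e\uff61"

# Translation table mapping each equivalent to an ASCII full stop.
_DOT_TABLE = {ord(ch): "." for ch in IDNA_DOT_EQUIVALENTS}

# Hypothetical blacklist with a single entry, for illustration only.
BLACKLIST = re.compile(r"\btraditio\.ru\b", re.IGNORECASE)


def fold_dots(text: str) -> str:
    """Replace IDNA dot equivalents with an ASCII '.'."""
    return text.translate(_DOT_TABLE)


def is_blacklisted(url: str) -> bool:
    """Match the URL against the blacklist after folding dot equivalents."""
    return BLACKLIST.search(fold_dots(url)) is not None
```

Folding a fixed list like this only covers the separators that the RFC names; it does not address other look-alike characters.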
Comment 3 Victor Vasiliev 2008-04-24 14:01:08 UTC
(In reply to comment #2)
> The fix seems a little narrow.  What's the underlying reason that the exploit
> worked?  U+0EFF can't be the only character that browsers will treat as a
> period in URLs.
> 

I think we need some form of UTF normalization.
Comment 4 Max Semenik 2008-12-21 21:11:09 UTC
Indeed, there are far more ways: http://meta.wikimedia.org/w/index.php?oldid=1319535

Unicode normalisation again.
Comment 5 Mike.lifeguard 2009-02-17 22:00:49 UTC
(In reply to comment #4)
> Indeed, there are far more ways:
> http://meta.wikimedia.org/w/index.php?oldid=1319535
> 
> Unicode normalisation again.
> 

I'm not really sure what I'm supposed to be seeing at that oldid.

That said, Unicode normalization is really needed. We're doing so in some monitoring tools, but of course it's needed in the blacklist as well.
Comment 6 Mike.lifeguard 2009-02-17 22:01:13 UTC
better summary
Comment 7 Aryeh Gregor (not reading bugmail, please e-mail directly) 2009-02-17 23:20:48 UTC
"Unicode normalization" is a poor term to use for the problem involved here, since all the characters involved are already normalized by the definitions of the Unicode standard (they're NFC, to be precise).  Adjusted summary.
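
The distinction can be checked with Python's standard `unicodedata` module. The sketch below is a hedged illustration, assuming the fullwidth full stop (U+FF0E) and the ideographic full stop (U+3002) are representative of the characters involved; the helper name `forms` is hypothetical. NFC leaves both characters unchanged, the compatibility form NFKC folds U+FF0E to an ASCII dot, and neither form changes U+3002.

```python
import unicodedata


def forms(ch: str) -> dict:
    """Return the character under each standard Unicode normalization form."""
    return {
        form: unicodedata.normalize(form, ch)
        for form in ("NFC", "NFD", "NFKC", "NFKD")
    }


FULLWIDTH_FULL_STOP = "\uff0e"
IDEOGRAPHIC_FULL_STOP = "\u3002"

fullwidth = forms(FULLWIDTH_FULL_STOP)
ideographic = forms(IDEOGRAPHIC_FULL_STOP)

# NFC keeps the fullwidth dot as it is; NFKC maps it to U+002E.
print(fullwidth["NFC"] == FULLWIDTH_FULL_STOP, fullwidth["NFKC"] == ".")
# The ideographic full stop has no decomposition, so NFKC leaves it alone.
print(ideographic["NFKC"] == IDEOGRAPHIC_FULL_STOP)
```

So even compatibility normalization would not catch every dot equivalent, which is consistent with treating this as a look-alike character problem rather than a normalization problem.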
