Last modified: 2011-03-13 18:06:36 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T16941, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 14941 - Username Blacklist creates huge regexes, which could unintentionally crash PCRE
Status: RESOLVED WONTFIX
Product: MediaWiki extensions
Classification: Unclassified
Component: UsernameBlacklist
Version: unspecified
Hardware: All All
Importance: Lowest major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2008-07-27 02:55 UTC by Fran Rogers
Modified: 2011-03-13 18:06 UTC


Attachments
Replaces the 'mega-regex' approach with one regex for each blacklist entry (1.14 KB, patch)
2008-07-27 02:55 UTC, Fran Rogers

Description Fran Rogers 2008-07-27 02:55:55 UTC
Created attachment 5097
Replaces the 'mega-regex' approach with one regex for each blacklist entry

Recently the sysadmins disabled the UsernameBlacklist extension on WMF wikis because it was causing account creation to crash the servers, due to some complicated regexes in place on en.wiki. Someone pointed out on wikitech-l that it was odd that TitleBlacklist, with even more complicated entries on en.wiki, didn't cause the same problems, and noted that the only real difference in how the two work is that UsernameBlacklist concatenates all the blacklist entries into one "mega-regex."

Looking at the code, I think this "mega-regex" is the root of the problem. The function UsernameBlacklist::safeBlacklist() in UsernameBlacklist.php concatenates every single entry in the blacklist into one regular expression, which it then passes to preg_match(). However, PCRE's documentation notes that regexes are processed by a recursive function in the C library, which could potentially cause a stack overflow and crash the process - in this case, Apache/PHP. I surmise this is the problem that we've been having: multiple complicated blacklist entries are combined by UsernameBlacklist into one huge regex, which causes PCRE to overflow its stack when processing it.

I've attached a patch for UsernameBlacklist that replaces the "mega-regex" approach with one regex for each blacklist entry, the same approach taken by TitleBlacklist.
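
For illustration only, the difference between the two approaches can be sketched roughly as follows. This is not the attached patch and not the extension's PHP code: it is a minimal Python sketch using Python's `re` module in place of PCRE, and the entry list and both function names are hypothetical.

```python
import re

# Hypothetical blacklist entries, standing in for the regex fragments
# that the extension reads from its blacklist.
BLACKLIST = [r"admin\d*", r".*vandal.*", r"(foo|bar)+baz"]


def matches_mega_regex(username, entries):
    """'Mega-regex' approach: join every entry into one big alternation."""
    pattern = "(?:" + "|".join(entries) + ")"
    return re.search(pattern, username) is not None


def matches_per_entry(username, entries):
    """Per-entry approach: compile and test each entry on its own."""
    for entry in entries:
        try:
            if re.search(entry, username):
                return True
        except re.error:
            # A malformed entry is skipped rather than breaking the whole check.
            continue
    return False


print(matches_mega_regex("admin42", BLACKLIST))
print(matches_per_entry("admin42", BLACKLIST))
```

In this sketch each per-entry pattern stays small, and one bad entry cannot invalidate the others, whereas a single malformed entry makes the whole combined pattern fail to compile. Whether splitting the pattern also avoids the PCRE stack overflow described above is what this report surmises; the sketch does not demonstrate it.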
Comment 1 Nobody 2008-08-02 04:25:03 UTC
Fix included in patch submitted in 15010
Comment 2 Fran Rogers 2008-08-09 02:41:46 UTC
I'm working on replacing UsernameBlacklist with an extended TitleBlacklist, so marking this WONTFIX.
