Last modified: 2011-03-13 18:06:03 UTC

Bug 14522 - Antispam filter doesn't filter plaintext rendered URLs
Product: MediaWiki extensions
Classification: Unclassified
Component: Spam Blacklist
Hardware: All All
Importance: Lowest normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Duplicates: 20501
Reported: 2008-06-13 03:50 UTC by Danny B.
Modified: 2011-03-13 18:06 UTC

Description Danny B. 2008-06-13 03:50:03 UTC
If http://some.spam.tld is on the spam blacklist, the filter still lets the following constructions through:

* <nowiki>http://some.spam.tld</nowiki>
* http&#x3a;//some.spam.tld (etc.)
Comment 1 Mormegil 2008-06-19 08:03:33 UTC
Isn’t that a feature? Those are not links, therefore they are not blocked. (And why should they be?)
Comment 2 Daniel Friesen 2008-06-19 09:30:37 UTC
Using the spam blacklist to block plaintext is not a good idea. Many of the entries added to the spam blacklist are generic patterns, written to catch many incoming URLs at once. None of these occur in legitimate URLs, but they may well be valid words in plaintext. If the spam blacklist were applied to plaintext, the spam filter would start acting up everywhere, blocking pages that don't have any spam on them.
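
To illustrate the false-positive problem, here is a minimal Python sketch. It is not the extension's code (the extension is written in PHP); the blacklist fragments, function names and sample strings are all made up for the example. It applies the same generic fragments once to a list of links and once to ordinary page text.

```python
import re

# Hypothetical blacklist fragments, generic on purpose so that one entry
# catches many spam URLs. These are assumptions, not real list entries.
BLACKLIST = [r"casino", r"pills\."]
PATTERN = re.compile("|".join(BLACKLIST), re.IGNORECASE)


def blocked_links(links):
    """Return the links that contain a blacklisted fragment."""
    return [link for link in links if PATTERN.search(link)]


def blocked_plaintext(text):
    """Return True if a blacklisted fragment occurs anywhere in the text."""
    return PATTERN.search(text) is not None


links = ["http://best-casino.example/win"]
prose = "The film Casino (1995) was directed by Martin Scorsese."

print(blocked_links(links))      # the spam link is caught, as intended
print(blocked_plaintext(prose))  # True: a harmless sentence is blocked too
```

Restricted to links, the fragment does its job. Run over the whole text, the same fragment rejects an edit that contains no spam link at all.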

And quite simply... we already have an extension for blocking plaintext: SpamRegex. The SpamBlacklist is for blocking URLs only and is widely editable. SpamRegex is meant for blocking anything, and is more restricted because you can really screw things up if you get a pattern even slightly wrong.

There is no way to block plaintext in the way you want:
1) The SpamBlacklist extension only looks at the parser output, not the wikitext source. If something has not been converted into a link, the extension does not know about it. Therefore plaintext cannot be blacklisted.
2) For things like the entity-encoded example in the description, you may recognize them as a URL, but there is no feasible way to make the computer understand that, at least not without an unacceptable number of false positives that would make many valid edits trigger the spam filter. Not to mention that wikis normally use the SpamBlacklist's talk page to post spam URLs to block, and they do it in plaintext. If plaintext were blacklisted, then every time someone blacklisted a URL, the talk page for requesting blacklistings would become uneditable because of the new spam filter entry.
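
Point 1 can be sketched roughly as follows, in Python. This is a simplified stand-in and not the extension's actual implementation: it assumes the filter sees only the links present in the rendered page. The `LinkCollector` class, the `is_blocked` helper and the sample HTML strings are hypothetical.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags; plain text is ignored."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical blacklist entry for the domain used in the description.
BLACKLIST = re.compile(r"some\.spam\.tld")


def is_blocked(rendered_html):
    """Check only the links found in the rendered output."""
    collector = LinkCollector()
    collector.feed(rendered_html)
    collector.close()
    return any(BLACKLIST.search(link) for link in collector.links)


linked = '<p><a href="http://some.spam.tld">http://some.spam.tld</a></p>'
plain = "<p>http://some.spam.tld</p>"        # e.g. what <nowiki> renders to
encoded = "<p>http&#x3a;//some.spam.tld</p>"  # entity-encoded colon

print(is_blocked(linked))   # True
print(is_blocked(plain))    # False
print(is_blocked(encoded))  # False
```

Under that assumption, neither construction from the description produces a link, so a link-based check has nothing to match against.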
Comment 3 Splarka 2009-09-04 23:00:05 UTC
*** Bug 20501 has been marked as a duplicate of this bug. ***