Last modified: 2011-03-13 18:04:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia has migrated from Bugzilla to Phabricator. Bug reports should be created and updated in Wikimedia Phabricator instead. Please create an account in Phabricator and add your Bugzilla email address to it.
Wikimedia Bugzilla is read-only. If you try to edit or create any bug report in Bugzilla you will be shown an intentional error message.
In order to access the Phabricator task corresponding to a Bugzilla report, just remove "static-" from its URL.
You can still run searches in Bugzilla or access your list of votes, but bug reports in Bugzilla will obviously not be up to date.
Bug 13706 - Flagged bots to have exception from spamlist
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
Hardware: All
OS: All
Importance: Lowest enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Duplicates: 14691
Depends on:
Reported: 2008-04-11 22:16 UTC by とある白い猫
Modified: 2011-03-13 18:04 UTC (History)
4 users

See Also:
Web browser: ---
Mobile Platform: ---
Huggle Beta Tester: ---


Description とある白い猫 2008-04-11 22:16:59 UTC
The spam blocklist was created to combat spambots and people who add spam. If a page contains a blacklisted link, users will be prevented from editing that page. Admins are given an exception to this, which is fine. Bot-flagged users (i.e. machines) should be given the same exception.

When a bot is performing a routine task, this protection gets in the way. It is not as if bot-flagged accounts will add spam links, and if they do, they will not only be promptly blocked but will also lose their flag. I think we can easily trust bot-flagged accounts.
Comment 1 Jesse (Pathoschild) 2008-04-11 23:10:08 UTC
Are admins exempt? I can't save pages that contain blacklisted URLs on wikis where I am an admin.
Comment 2 Filip Maljkovic [Dungodung] 2008-04-12 09:24:18 UTC
As far as I've experienced, admins are not exempt from this. Thus, there's no reason for bots to be exempt.

On the other hand, I think both should be exempt (and also given a warning about the spam link's presence in the page) for obvious reasons.
Comment 3 とある白い猫 2008-04-13 16:23:16 UTC
There is no reason why a flagged interwiki bot should care about spam URLs on pages. By the very nature of spam URLs, a human should review them and decide whether or not to remove them.
Comment 4 Daniel Friesen 2008-04-15 02:24:33 UTC
I can understand a reason for exempting bots from spam filters.

# A vandal inserts a spam URL into a pile of pages.
# An admin adds the URL to the blacklist (without getting rid of all the spam URLs right away).
# A bot tries to edit one of the pages to do some maintenance completely independent of the spam.
# The bot receives an error page it may not be able to handle, and either fails to do important maintenance or crashes.

A bot editing a page that already has a spam URL shouldn't crash just because a vandal placed the URL there. This could potentially break archiving bots or others that run actively to do cleanup, or even to revert unrelated vandalism.
Comment 5 Brion Vibber 2008-04-15 17:21:31 UTC
A bot that crashes because an edit was blocked is a broken bot that needs to be fixed. It's perfectly normal and expected for some edits to fail:

* the page may be protected
* the account may be blocked
* there may be transitory errors on the server
* etc

Normal workflow would be for the bot to log that the edit failed and/or try again later.
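That workflow can be sketched roughly as follows. This is a minimal illustration, not a real MediaWiki client: the `save_page` callable, the `EditDenied` exception, and the retry parameters are all assumptions introduced here for the example.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("bot")


class EditDenied(Exception):
    """Raised when the wiki rejects an edit (spam filter, protection, block, ...)."""


class Bot:
    def __init__(self, save_page, max_retries=3, delay=0):
        self.save_page = save_page    # callable(title, text); may raise EditDenied
        self.max_retries = max_retries
        self.delay = delay            # seconds to wait between attempts
        self.failed = []              # pages left for a human to deal with later

    def edit(self, title, text):
        """Try to save a page; on rejection, log it and retry, then move on."""
        for attempt in range(1, self.max_retries + 1):
            try:
                self.save_page(title, text)
                return True
            except EditDenied as err:
                log.warning("edit of %r denied (attempt %d): %s", title, attempt, err)
                time.sleep(self.delay)
        # Give up on this page without crashing; record it for human review.
        self.failed.append(title)
        return False
```

The point is that a rejected edit is an expected outcome, handled the same way as a protected page or a server error: recorded and skipped, with the rest of the run unaffected.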
Comment 6 とある白い猫 2008-04-24 11:27:38 UTC
No, the bot does not crash. It is denied the edit continuously for something that has nothing to do with the bot.

If the bot's normal workflow is to

*...rename or remove images
*...add/remove/replace interwiki links
*...do any other non-controversial maintenance task

there is no reason to prevent it from making such edits just because

*...a vandal added a spam link a few years ago
*...an admin accidentally added a wrong entry to the spam blocklist

Additionally, bots editing protected pages is addressed separately in Bug #13137.

If the intention is to find all pages containing a link that matches the spamlist regex strings, that is a separate issue and can be a separate bugzilla entry.
Comment 7 Brion Vibber 2008-04-24 21:55:58 UTC
If there's no reason for it to affect a bot, why should it affect anyone? The hit allows the issue to be found and cleaned up.

This proposed special case makes no sense and will not ever be implemented.

Comment 8 Daniel Friesen 2008-04-24 23:09:38 UTC
Yes, it is true that if a bot shouldn't be affected by something, then a user shouldn't either.

However there is a difference.

When encountering a spam block, a bot does not know what to do with it. Bots are not programmed to automatically remove spam links when they are blocked, and never should be. Removing spam links is a human job, as it is a human's job to discern whether the link should be removed, or whether the link actually belongs and the blacklist entry should be revised (such as a spam block that is too broad in range and blocks good sites).

So, while both are affected, the human is able to fix the situation; the bot is not. Because it cannot, it either halts or skips something it should be doing. And not all bots have someone actively watching for when they halt.
Comment 9 Daniel Friesen 2008-04-24 23:35:57 UTC
Splarka noted that bots could comment out links rather than remove them.
That resolution could be acceptable.
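Splarka's suggestion could be sketched as a pre-save filter like the one below. This is purely illustrative: the blacklist here is a hypothetical local pattern list (the real list lives on-wiki and is matched server-side by the SpamBlacklist extension), and the review marker is an invented convention.

```python
import re

# Hypothetical local copy of blacklist patterns, for illustration only;
# the real Wikimedia blacklist is maintained on-wiki and matched server-side.
BLACKLIST = [re.compile(r"https?://[^\s\]]*example-spam\.com[^\s\]]*")]


def comment_out_blacklisted(wikitext):
    """Wrap any blacklisted URL in an HTML comment so the page can be saved.

    A human can later search for the marker and decide whether to remove the
    link for good or to fix an over-broad blacklist entry.
    """
    for pattern in BLACKLIST:
        wikitext = pattern.sub(
            lambda m: "<!-- blacklisted link, needs human review: %s -->" % m.group(0),
            wikitext,
        )
    return wikitext
```

This keeps the human-review step intact: the bot never deletes anything, it only neutralizes the link so its unrelated edit can go through.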

Though my initial thought about the bug (I misread the 'Flagged' part) was that rather than an explicit exemption for bots, this would actually be a 'spamexempt' flag similar to the 'editprotected' flag, which a wiki could enable if it feels there is a reason for it.
Comment 10 とある白い猫 2008-04-25 11:07:19 UTC
I think it is extremely unreasonable to expect each automated bot script to contain a "comment out" mechanism for spam URLs.

Bots are not people. Bots are only allowed to deal with a specific task. I cannot make my bot perform a task just because I feel like it; bot policies do not allow such a thing. Each bot script is expected and required to do something very specific. The code is expected and required NOT to do anything else. This issue is getting in the way of Commons deletions and image renames, interwiki linking, and other tasks that are UNRELATED to spamlists.

There are also legitimate reasons to add spam URLs to Wikipedia pages. For example, in a talk page discussion about a spambot attack, users may choose to list and discuss the spam URLs the spambot(s) are adding. These URLs will eventually make their way to the spamlist.

The same talk page or talk archive may contain an image that needs renaming or removing. You are saying the bot should be barred from replacing an image or adding an interwiki link simply because the page contains a spam link.
Comment 11 Brion Vibber 2008-04-25 17:57:06 UTC
If your bot can't handle that some of its edits may be rejected, then it's completely unsuitable for use on a wiki. Pages may be protected, databases may be locked, IPs may be blocked -- that's the normal state of things.

These are *ALL* beyond the control of the bot. A properly written bot will simply log the pages it cannot edit, and a human can deal with them later.

Discussion is ended.
Comment 12 seth 2008-07-01 07:02:01 UTC
*** Bug 14691 has been marked as a duplicate of this bug. ***
