Last modified: 2013-10-21 05:36:51 UTC

Bug 8475 - On-demand proxy scanning
Status: NEW
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Hardware: PC Linux
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2007-01-03 23:27 UTC by Neil Harris
Modified: 2013-10-21 05:36 UTC (History)

Description Neil Harris 2007-01-03 23:27:38 UTC
Some time ago, Wikipedia used to use the BOPM scanner to scan for proxies. This
is no longer the case, because the scans generated too many complaints.

I suggest the following as a compromise measure:
Run the BOPM proxy tests on user IPs, but only perform scans when they, or a
logged-on user that uses them, gets blocked. A database table that records the
date of the last scan, if any, could be used in addition to this to skip the
proxy scan if the address had previously been scanned within a set time period;
one month might be a suitable setting for this. Since most IPs never get
blocked, the vast majority of user IPs would never be scanned at all.
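
The scan-on-block rule with a last-scan table might be sketched as follows. This is only an illustration under stated assumptions: the dictionary stands in for the proposed database table, `should_scan` and `on_block` are hypothetical names, and `scan` is a caller-supplied function (for example a wrapper around BOPM) that returns True when the address is an open proxy.

```python
from datetime import datetime, timedelta

# "One month" from the proposal, taken here as 30 days (an assumption).
RESCAN_INTERVAL = timedelta(days=30)


def should_scan(ip, now, table, interval=RESCAN_INTERVAL):
    """Return True if ip was never scanned or its last scan is older than interval."""
    previous = table.get(ip)
    return previous is None or now - previous >= interval


def on_block(ip, now, scan, table):
    """Hypothetical hook to call when ip (or an account using it) gets blocked.

    table maps IP -> datetime of last scan (stand-in for the database table).
    Returns the scan result, or None if the scan was skipped.
    """
    if not should_scan(ip, now, table):
        return None
    result = scan(ip)
    table[ip] = now  # record the scan date so repeat blocks do not re-scan
    return result
```

Addresses that are never blocked never reach `on_block`, so they are never scanned.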

This would not need any further user interface to be added, and the facility
could be turned on or off as needed by the systems admins. It could be initially
activated for a trial period, to gauge the number of complaints generated, and,
if this is low enough, could possibly be left on indefinitely.
Comment 1 Nemo 2012-08-23 19:06:21 UTC
Is this needed even if torblock worked (bug 30716)? Anyway, adding csteipp to cc so that he can check whether this could be useful for [[mw:Admin tools development]].
Comment 2 Chris Steipp 2012-08-24 00:01:35 UTC
Thanks Nemo!

Is the thought that when a user is blocked (by name), this check runs to see whether they were using an open proxy, and then, if it turns out to be a proxy, the block would be extended to that IP address as well?

Or that this would automatically un-block an ip address when it was no longer an open proxy?

The previous scanning disallowed the edit if it came from an open proxy. I'm not sure what the desired output from this would be.
Comment 3 Jasper Deng 2012-08-24 00:12:48 UTC
Not all IP addresses that pass the test or fail the test are proxies, so human judgement is needed.
Comment 4 Chris Steipp 2012-08-24 18:03:00 UTC
Jasper, we can test if an IP is serving as an open proxy pretty definitively by connecting through it, assuming it's really open and they haven't blocked our scanning server.

But yes, judgement is needed to determine if this technology is being abused, and if the abuse warrants a block.

I'm still trying to understand Neil's request though-- what is the desired outcome of scanning the ip after the block?
Comment 5 Neil Harris 2012-08-24 18:27:02 UTC
The idea is that without abuse, there would be no block, and therefore that the block, of itself, is evidence that the proxy is actively being abused at that time. 

Providing scanning occurs relatively soon after the block, this thus provides the element of human judgment needed as per comment 3; the resultant scanning and blocking (if the IP is found to be a proxy) can then be performed entirely automatically.
Comment 6 Neil Harris 2012-08-24 18:33:15 UTC
Also, because automatic scans based on blocks will only be performed on blocked IP addresses, most IP addresses connecting will never be scanned, greatly reducing irritation to others by unsolicited scans (if you don't want your IP scanned, don't do things that will get you blocked), as well as limiting its impact on system load and traffic (there are many edits per second across the cluster, but blocks are far rarer.)
Comment 7 Chris Steipp 2012-08-24 18:39:00 UTC
So if we scan the ip after setting the block, and the scan results show that an ip is also an open proxy (or not an open proxy), how do you see that affecting the block that has just been set? Would it alter the time period, or what the block prevents? Or just add a comment that this IP is also an open proxy?
Comment 8 Neil Harris 2012-08-24 18:52:02 UTC
If the open proxy block is either longer or more stringent than the existing block, it should replace the existing block with a block that combines the greater of the restrictions and length of both.

Since an open proxy block would typically be something like a one year block with everything (editing, talk page access, and account creation) restricted, I would imagine that this would be the common case.

Otherwise, it should just add a comment that the IP is also an open proxy to the block log.
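
A rough sketch of that combining rule, for illustration only. The `Block` type, the restriction names, and the wording of the log comment are assumptions, and "more stringent" is interpreted here as "restricts something the existing block does not".

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Block:
    expiry: datetime
    restrictions: frozenset  # e.g. {"edit", "talk", "createaccount"} (hypothetical names)
    reason: str


def merge_with_proxy_block(existing, proxy):
    """Combine an existing block with an open proxy block for the same IP."""
    longer = proxy.expiry > existing.expiry
    stricter = not proxy.restrictions <= existing.restrictions
    if longer or stricter:
        # Take the greater of both the length and the restrictions.
        return Block(
            expiry=max(existing.expiry, proxy.expiry),
            restrictions=existing.restrictions | proxy.restrictions,
            reason=f"{existing.reason}; {proxy.reason}",
        )
    # The existing block already covers everything: only annotate it.
    return Block(existing.expiry, existing.restrictions,
                 f"{existing.reason}; also an open proxy")
```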
Comment 9 Nemo 2012-08-24 18:56:06 UTC
Another difference could be a global block/blacklisting of the IP (this is something routinely done by stewards).
Comment 10 Neil Harris 2012-08-24 19:44:12 UTC
Comment 9 is a really good idea. That way, it's in addition to the local block, not instead of it, and there's no logic needed to combine blocks on local wikis. Perhaps something like a six month or one year blacklisting? 

Also, it might be worth re-scanning globally blocked IPs periodically before their blocks expire: if they are still proxies, they can be re-blocked for another period instead of being unblocked, and so on ad infinitum until they stop being proxies.
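
The periodic re-scan could look roughly like the sketch below. Everything here is hypothetical: the function name, the one-week lead time, the one-year period, and the choice to count the new period from the time of the re-scan. `scan` is again a caller-supplied function returning True for an open proxy.

```python
from datetime import datetime, timedelta


def renew_proxy_blocks(blocks, now, scan,
                       lead_time=timedelta(days=7),
                       period=timedelta(days=365)):
    """Re-scan global proxy blocks that are about to expire.

    blocks maps IP -> expiry datetime. Returns a new mapping in which
    blocks on addresses that are still open proxies are extended.
    """
    renewed = {}
    for ip, expiry in blocks.items():
        # Only look at blocks that are still active but expire within lead_time.
        due = timedelta(0) <= expiry - now <= lead_time
        if due and scan(ip):
            renewed[ip] = now + period  # still a proxy: block for another period
        else:
            renewed[ip] = expiry  # not due, or no longer a proxy: let it lapse
    return renewed
```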
