Last modified: 2013-10-21 05:36:51 UTC
Some time ago, Wikipedia used the BOPM scanner to scan for proxies. This is no longer the case, because the scans generated too many complaints. I suggest the following as a compromise measure: run the BOPM proxy tests on user IPs, but only perform a scan when the IP, or a logged-in user who uses it, gets blocked. A database table recording the date of each IP's last scan, if any, could additionally be used to skip the proxy scan if the address has already been scanned within a set time period; one month might be a suitable setting. Since most IPs never get blocked, the vast majority of user IPs would never be scanned at all. This would not need any further user interface, and the facility could be turned on or off as needed by the system administrators. It could initially be activated for a trial period to gauge the number of complaints generated and, if this is low enough, could possibly be left on indefinitely.
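To make the scan-on-block idea concrete, here is a rough sketch in Python. It is only an illustration, not MediaWiki or BOPM code: `on_block`, `should_scan`, the in-memory `last_scans` dict (standing in for the database table) and the `scan` callback (standing in for the real proxy test) are all hypothetical names, and the 30-day interval is just the "one month" suggested above.

```python
from datetime import datetime, timedelta, timezone

# Assumed rescan interval; the proposal suggests roughly one month.
RESCAN_INTERVAL = timedelta(days=30)


def should_scan(ip, last_scans, now, interval=RESCAN_INTERVAL):
    """Return True if `ip` was never scanned or its last scan is older than `interval`."""
    last = last_scans.get(ip)
    return last is None or now - last >= interval


def on_block(ip, last_scans, scan, now=None):
    """Hypothetical hook run when `ip` (or an account using it) gets blocked.

    `last_scans` maps IP -> datetime of the last scan (a stand-in for the
    proposed database table). `scan` is a stand-in for the real proxy test:
    it takes an IP string and returns True if the IP is an open proxy.
    Returns the scan result, or None if the scan was skipped.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    if not should_scan(ip, last_scans, now):
        return None  # scanned recently enough; do not scan again
    result = scan(ip)
    last_scans[ip] = now  # record the scan date for future skip checks
    return result
```

IPs that are never blocked never reach `on_block`, so they are never scanned, which is the point of the compromise.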
Is this needed even if torblock worked (bug 30716)? Anyway, adding csteipp to cc so that he can check whether this could be useful for [[mw:Admin tools development]].
Thanks Nemo! Is the thought that when a user is blocked (by name), this check runs to see whether they were using an open proxy, and the block is then extended to that IP address as well if it turns out to be a proxy? Or that this would automatically unblock an IP address once it is no longer an open proxy? The previous scanning disallowed the edit if it came from an open proxy. I'm not sure what the desired output from this would be.
Not all IP addresses that pass the test or fail the test are proxies, so human judgement is needed.
Jasper, we can test whether an IP is serving as an open proxy pretty definitively by connecting through it, assuming it's really open and they haven't blocked our scanning server. But yes, judgement is needed to determine if this technology is being abused, and if the abuse warrants a block. I'm still trying to understand Neil's request though: what is the desired outcome of scanning the IP after the block?
The idea is that without abuse there would be no block, so the block is in itself evidence that the proxy is actively being abused at that time. Provided that scanning occurs relatively soon after the block, the block supplies the element of human judgment needed as per comment 3; the resultant scanning and blocking (if the IP is found to be a proxy) can then be performed entirely automatically.
Also, because automatic scans based on blocks will only be performed on blocked IP addresses, most connecting IP addresses will never be scanned. This greatly reduces the irritation caused to others by unsolicited scans (if you don't want your IP scanned, don't do things that will get you blocked) and limits the impact on system load and traffic (there are many edits per second across the cluster, but blocks are far rarer).
So if we scan the IP after setting the block, and the scan results show that the IP is also an open proxy (or not an open proxy), how do you see that affecting the block that has just been set? Would it alter the time period, or what the block prevents? Or just add a comment that this IP is also an open proxy?
If the open proxy block is either longer or more stringent than the existing block, it should replace the existing block with a block that combines the greater of the restrictions and length of both. Since an open proxy block would typically be something like a one year block with everything (editing, talk page access, and account creation) restricted, I would imagine that this would be the common case. Otherwise, it should just add a comment that the IP is also an open proxy to the block log.
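A rough sketch of that combining rule, in Python for illustration only. The `Block` class, its field names and `apply_proxy_block` are hypothetical and do not correspond to MediaWiki's actual block schema; the sketch simply takes the later expiry and the union of the restrictions, and reports whether the existing block would need replacing or only a log comment.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Block:
    # Hypothetical fields, not MediaWiki's real block schema.
    expiry: datetime
    no_edit: bool = True
    no_talk: bool = False
    no_create: bool = False


def merge_blocks(existing, proxy):
    """Combine two blocks: the later expiry and the union of restrictions."""
    return Block(
        expiry=max(existing.expiry, proxy.expiry),
        no_edit=existing.no_edit or proxy.no_edit,
        no_talk=existing.no_talk or proxy.no_talk,
        no_create=existing.no_create or proxy.no_create,
    )


def apply_proxy_block(existing, proxy):
    """Return (block, action).

    action is "replace" if the merged block is longer or stricter than the
    existing one, otherwise "comment" (only note the proxy in the block log).
    """
    merged = merge_blocks(existing, proxy)
    if merged == existing:
        return existing, "comment"
    return merged, "replace"
```

With a typical one-year, everything-restricted proxy block, the result would usually be "replace"; an existing block that is already at least as long and as strict would only get the log comment.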
Another difference could be a global block/blacklisting of the IP (this is something routinely done by stewards).
Comment 9 is a really good idea. That way, it's in addition to the local block, not instead of it, and there's no logic needed to combine blocks on local wikis. Perhaps something like a six month or one year blacklisting? Also, it might be worth re-scanning globally blocked IPs periodically before their blocks expire: if they are still proxies, they can be re-blocked for another period instead of being unblocked, and so on ad infinitum until they stop being proxies.
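The periodic re-scan could look roughly like the following Python sketch. Everything here is an assumption made for illustration: the `blocks` dict (IP -> expiry) stands in for the global block list, `scan` stands in for the real proxy test, and the one-year period and seven-day window are placeholder values. Renewal is counted from the time of the re-scan, which is one possible reading of "another period".

```python
from datetime import datetime, timedelta

BLOCK_PERIOD = timedelta(days=365)  # assumed length of a proxy blacklisting
RESCAN_WINDOW = timedelta(days=7)   # assumed lead time before expiry


def renew_expiring_blocks(blocks, scan, now):
    """Re-scan IPs whose block expires within RESCAN_WINDOW of `now`.

    `blocks` maps IP -> expiry datetime and is updated in place.
    `scan` takes an IP string and returns True if it is still an open proxy.
    Returns the list of IPs whose blocks were renewed.
    """
    renewed = []
    for ip, expiry in list(blocks.items()):
        if not (now < expiry <= now + RESCAN_WINDOW):
            continue  # already expired, or not close to expiry yet
        if scan(ip):
            blocks[ip] = now + BLOCK_PERIOD  # still a proxy: block again
            renewed.append(ip)
        # otherwise leave the block alone so it expires on schedule
    return renewed
```

Run from a scheduled job, this would keep an IP blocked for as long as it keeps testing as a proxy, and let the block lapse once it stops.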