Last modified: 2011-03-13 18:06:26 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T20429, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 18429 - Allow filter rules to consider private data such as source IP, reverse DNS and user agent.
Status: RESOLVED LATER
Product: MediaWiki extensions
Classification: Unclassified
Component: AbuseFilter
Version: unspecified
Hardware: All All
Importance: Lowest enhancement with 1 vote
Assigned To: Andrew Garrett
Reported: 2009-04-11 18:36 UTC by FT2
Modified: 2011-03-13 18:06 UTC (History)

Description FT2 2009-04-11 18:36:32 UTC
Is there a way that functions such as "reversedns(user_ip) LIKE X" can ever be included in AbuseFilter?

The reason I'm thinking this is that there are a number of major IP ranges where this might finally provide a means to stop vandals who at present can't easily be stopped. For example, a number of ISPs have large IP ranges and dynamic IPs, so a block is futile: the user just resets the router for a new IP. But whatever IP is given will resolve to (say) "adsl-*.region17.isp.net", and hence a pattern match on the reverse DNS would allow edits by users in that specific area to be picked up, whereas at present no block or other automated system can spot or deal with these kinds of vandals.

Two caveats: 1/ is the cost of reverse DNS lookup prohibitive (and if so can it be cached locally to reduce that); 2/ does this introduce a class of AbuseFilter functions such as user_ip, that would mean the function is not displayed or able to be edited except by checkusers?

I think this is useful enough to explore further.
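For illustration only, here is a rough Python sketch of the idea above: a reverse DNS lookup, a local cache to address caveat 1, and a wildcard match on the resulting hostname. This is not AbuseFilter code. The function names (`system_reverse_dns`, `make_cached_resolver`, `reverse_dns_matches`) are hypothetical, the glob-style matching is an assumption about what "LIKE X" would mean, and the cache has no expiry, which a real deployment would probably need.

```python
import fnmatch
import socket
from functools import lru_cache
from typing import Callable, Optional

Resolver = Callable[[str], Optional[str]]


def system_reverse_dns(ip: str) -> Optional[str]:
    """Return the PTR hostname for an IP, or None if the lookup fails."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None


def make_cached_resolver(resolve: Resolver, maxsize: int = 4096) -> Resolver:
    """Wrap a resolver in an in-memory LRU cache (no expiry; an assumption)."""
    return lru_cache(maxsize=maxsize)(resolve)


def reverse_dns_matches(ip: str, pattern: str, resolve: Resolver) -> bool:
    """True if the hostname for ip matches a glob pattern, case-insensitively."""
    hostname = resolve(ip)
    if hostname is None:
        return False
    return fnmatch.fnmatchcase(hostname.lower(), pattern.lower())


if __name__ == "__main__":
    resolver = make_cached_resolver(system_reverse_dns)
    # 192.0.2.10 is a documentation address; substitute a real one to try this.
    print(reverse_dns_matches("192.0.2.10", "adsl-*.region17.isp.net", resolver))
```

Passing the resolver in as a parameter keeps the matching logic testable without network access, and the cache means repeated edits from the same address cost one lookup.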
Comment 1 FT2 2009-04-11 18:39:13 UTC
s/function/filter
Comment 2 Gurch 2009-04-11 20:48:06 UTC
(In reply to comment #0)
> 2/ does this introduce a class of
> AbuseFilter functions such as user_ip, that would mean the function is not
> displayed or able to be edited except by checkusers?

Yeah... nice idea, but if this was implemented, it could be abused to pin down users' IP addresses/ranges. And I can think of several en.wikipedia administrators who *would* do that.
Comment 3 FT2 2009-04-11 20:54:17 UTC
It would be trivial to ensure that a filter function that used the "user_ip" variable could not be created, nor its logs/history read, by any except checkusers. That probably takes care of that one. There may be other ways to handle that point as well, but that seems the easiest. I don't see that as a major problem, just one needing careful concept thinking.
Comment 4 Andrew Garrett 2009-04-12 14:47:26 UTC
Discussed this on IRC with FT2. My general comments on the outcome of that discussion (from my perspective, FT2 may have different opinions):

1/ Adding additional hierarchy to AbuseFilter is a pain, both programmatically and socially.

2/ The fact that the abuse filter log is viewable by all users is a core principle guiding the Abuse Filter. It is critical that all filters may be assessed on their performance, if not on their construction. Smaller groups/cabals of checkusers, oversighters and what-not may have good intentions, but they lack the accountability that comes from having the impact of filters assessed by the wider community. Smaller "cabals" encourage groupthink, and create an environment which may ease carelessness or outright negligence in filter construction.

3/ It would be technically trivial to hide variables containing private data from the abuse filter log, in order to allow them to be sent to filters.

4/ There are concerns (as expressed by Gurch) that the abuse filter log for filters using private data could allow users not identified to the Foundation to guess private information, or at least part of it (for instance, that a particular user edits from a particular IP range). The privacy policy permits disclosure of private data for the purposes of preventing and monitoring abuse of editing privileges, and covers only personally identifiable information. Residing on a particular range is not by itself personally identifiable information, although it may be private information; and while the user-agent header sent by a user is not public data, I would not really classify it as "private", per-se, and certainly not personally identifiable. Accordingly, I believe the benefits of hiding log entries for rules considering private data are outweighed by the detrimental effect on filter use transparency (see point 2).
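As a sketch of point 3 only (not the extension's actual code), hiding private variables from the log while still passing them to filters could look roughly like the Python below. Apart from `user_ip`, which is mentioned earlier in this report, the variable names and the `public_log_entry` helper are hypothetical.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical set of variables treated as private. Only "user_ip" is named
# in this report; the other two names are assumptions for illustration.
PRIVATE_VARIABLES: FrozenSet[str] = frozenset(
    {"user_ip", "user_reverse_dns", "user_agent"}
)


def public_log_entry(
    variables: Dict[str, Any],
    private_names: FrozenSet[str] = PRIVATE_VARIABLES,
) -> Dict[str, Any]:
    """Return a copy of the filter variables with private ones left out.

    The filter itself still evaluates the full set; only the copy written
    to the publicly viewable log is reduced.
    """
    return {
        name: value
        for name, value in variables.items()
        if name not in private_names
    }


if __name__ == "__main__":
    full = {"user_name": "Example", "action": "edit", "user_ip": "192.0.2.10"}
    print(public_log_entry(full))
```

Note that this only removes the values from the log. It does not address point 4, since the fact that a filter matched can still reveal something about the hidden data.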

Comment 5 Gurch 2009-04-12 20:40:42 UTC
(In reply to comment #4)
> Residing on a particular range is not by itself personally identifiable
> information, although it may be private information; and while the user-agent
> header sent by a user is not public data, I would not really classify it as
> "private", per-se, and certainly not personally identifiable.

You're right, it wouldn't (at least in most cases) count as such.

Though it could be used to determine, say, where a user is from. While I personally don't care who knows that, I know there are a lot of people out there who do -- imagine a "Contributors from XYZ" filter with IP ranges that geolocate to that place, in a private filter looking for (or claiming to look for) a particular abusive user from that area. Now any legitimate user editing from XYZ gets an entry in the abuse log linking their username to place XYZ, and that log entry is visible to everyone, not just admins. It's not exactly Checkuser but it's more disclosure than there currently is (I lack the patience and legal expertise to figure out exactly what the privacy policy's take is on this :)

Not sure what user-agent header has to do with anything, that (usually) only identifies the user's browser and OS. Though I am aware checkusers also have access to that information, I don't know what they do with it nor why anyone would want to use it for an abuse filter.
Comment 6 FT2 2009-04-12 22:32:36 UTC
"Now any legitimate user editing from XYZ gets an entry in the abuse log linking their username to place XYZ, and that log entry is visible to everyone, not just admins."

Incorrect, or else, overlooked the comment on this. 

See original suggestion: "It would be trivial to ensure that a filter function that used the 'user_ip' variable could not be created, nor its logs/history read, by any except checkusers."
Comment 7 Andrew Garrett 2009-04-13 13:53:43 UTC
(In reply to comment #6)
> See original suggestion: "It would be trivial to ensure that a filter function
> that used the 'user_ip' variable could not be created, nor its logs/history
> read, by any except checkusers."

I strongly object to that suggestion. It's okay for only checkusers to be able to create filters which act on private data, but the hit logs MUST be kept public. See my previous comment for further details of my position on this.

Comment 8 Gurch 2009-04-13 14:18:04 UTC
(In reply to comment #6)
> Incorrect, or else, overlooked the comment on this. 

Yeah, my comment was in reply to Andrew mostly. You say these filters should exist but be completely private, Andrew says if they do exist at all they have to be publicly logged in some way (unlike Checkuser), I say they shouldn't exist at all. Other than that, we're in perfect agreement. :)
Comment 9 Happy-melon 2009-04-16 17:07:40 UTC
I don't think those positions are necessarily mutually exclusive, although they are somewhat juxtaposed.  We currently have "private" filters: the hit log is publicly viewable with abusefilter-view, this just says "User X tripped filter Y (ShortDescriptionOfFilterSetByFilterEditors), doing something somewhere".  Users with abusefilter-view-details can then see the exact parameters of the edit, *unless* the filter has been set to "private", in which case they need an additional permission (abusefilter-modify?).  It would be possible to create another class of filters, either implicitly or explicitly, for which you need the abusefilter-private permission to see anything more than the basic "X tripped Y" log.
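To make the tiers described above concrete, here is a small Python sketch of one possible reading of them. It is an illustration under assumptions, not the extension's logic: the visibility labels ("public", "private", "restricted") and the `visible_log_detail` function are hypothetical, the use of abusefilter-modify for private filters is the guess flagged with "?" above, and abusefilter-private is the proposed new permission.

```python
from typing import Dict, FrozenSet, Set

# Rights needed to see the full details of a log entry, per filter class.
# This mapping is an assumed reading of the comment above.
DETAIL_RIGHTS: Dict[str, FrozenSet[str]] = {
    "public": frozenset({"abusefilter-view-details"}),
    "private": frozenset({"abusefilter-view-details", "abusefilter-modify"}),
    "restricted": frozenset({"abusefilter-private"}),  # proposed new class
}


def visible_log_detail(rights: Set[str], visibility: str) -> str:
    """Return "none", "basic" ("X tripped Y") or "full" for a log entry."""
    if "abusefilter-view" not in rights:
        return "none"
    if DETAIL_RIGHTS[visibility] <= rights:
        return "full"
    return "basic"


if __name__ == "__main__":
    admin = {"abusefilter-view", "abusefilter-view-details"}
    for kind in ("public", "private", "restricted"):
        print(kind, visible_log_detail(admin, kind))
```

Under this reading every hit stays in the public log at the "X tripped Y" level, and only the detail view is gated by the filter's class.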
Comment 10 Andrew Garrett 2009-04-24 03:46:41 UTC
Can we form an on-wiki consensus one way or the other for this, please?
Comment 11 Andrew Garrett 2009-07-03 13:01:02 UTC
Resolving as LATER in the absence of any community consultation.
