Last modified: 2014-11-03 16:01:22 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T68450, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 66450 - Set $wgTitleBlacklistLogHits = true on WMF wikis
Status: PATCH_TO_REVIEW
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Depends on: 21206
Blocks: SWMT
Reported: 2014-06-10 20:41 UTC by Kunal Mehta (Legoktm)
Modified: 2014-11-03 16:01 UTC (History)

Description Kunal Mehta (Legoktm) 2014-06-10 20:41:08 UTC
+++ This bug was initially created as a clone of Bug #21206 +++

Change merged in extension, needs WMF config update.
Comment 1 Gerrit Notification Bot 2014-06-10 20:41:32 UTC
Change 138684 had a related patch set uploaded by Legoktm:
Set $wgTitleBlacklistLogHits = true on all wikis

https://gerrit.wikimedia.org/r/138684
Comment 2 PiRSquared17 2014-06-10 21:19:22 UTC
It should be restricted to oversighters, per the privacy policy.
Comment 3 Kunal Mehta (Legoktm) 2014-06-10 21:22:20 UTC
OS or CU?
Comment 4 Ajraddatz 2014-06-10 21:28:30 UTC
CU, if it does contain private info. Might be best to consult the legal team.
Comment 5 John F. Lewis 2014-06-10 21:30:35 UTC
CU is more relevant than OS. So CU.

CC'd James in case LCA wants any comments on this.
Comment 6 James Alexander 2014-06-10 21:37:12 UTC
Roping Luis in, -1'd the patch turning it on just for now so that we know what it does. Is there a mediawiki page or something else that gives examples (or can someone help explain/point me in the right direction?) I see some suggestion of the IP of newly created users being in the log? Anything else?
Comment 7 Ajraddatz 2014-06-10 21:50:13 UTC
The log would show hits of users attempting to create accounts which trigger the TBL (global or local). An example log entry:  20:28, 9 June 2014 99.99.99.99 (talk | block) attempted to create "Account name" (rule: whatever rule prevents it). The formatting would be slightly different but that is the info that would be included. I've requested Pir2 to set up a test instance to confirm.

Unfortunately, this log is needed for the TBL to be usable. Currently there is no way to see what impact the TBL is having, and this would allow CheckUsers at least to confirm the impact of it.

An alternative would be to somehow take out the initiator from the log, listing only the name and which rule blocked it.
Comment 8 John F. Lewis 2014-06-10 21:52:27 UTC
In its current form, CheckUser seems to be the only group which this would be usable for.
Comment 9 PiRSquared17 2014-06-10 21:57:57 UTC
Copying my gerrit comment here, with some annotations, for the record:
"Yeah, I'm actually a bit [read: very] surprised it [read: original patch for title blacklist log] was merged (I was about to abandon it, frankly, since pagemoves and editing cannot be logged), but I trust Legoktm reviewed it sufficiently. Perhaps it would be possible to only log the account name and not the IP. Another problem might be that when the log is viewed, there is no record (unlike CU log [for when people check others with Special:CheckUser]). Admins definitely should not be allowed to view it, as they can just add .* to the title blacklist and collect list of all IPs and usernames [that are blocked by the blacklist]."
Comment 10 James Alexander 2014-06-10 22:00:09 UTC
(In reply to Ajraddatz from comment #7)
> The log would show hits of users attempting to create accounts which trigger
> the TBL (global or local). An example log entry:  20:28, 9 June 2014
> 99.99.99.99 (talk | block) attempted to create "Account name" (rule:
> whatever rule prevents it). The formatting would be slightly different but
> that is the info that would be included. I've requested Pir2 to set up a
> test instance to confirm.
> 
> Unfortunately, this log is needed for the TBL to be usable. Currently there
> is no way to see what impact the TBL is having, and this would allow
> CheckUsers at least to confirm the impact of it.
> 
> An alternative would be to somehow take out the initiator from the log,
> listing only the name and which rule blocked it.

I think without the IP completely eliminates the concern, with the IP however is concerning and would want to be limited to just Checkusers (and related, stewards obviously count as well). 

The major use case that I've seen listed in the request is to see whether the filter is successful/useful (and potentially I guess if people are trying, and being blocked, from making legitimate accounts). Is that the main use case? For that use case having the IP/requester doesn't actually seem horribly useful and we should only reveal an IP if we have a strong use case for it.

Do the stewards or others believe that the IP/requester would be useful itself? (and what would be the primary use case for it).
Comment 11 PiRSquared17 2014-06-10 22:03:12 UTC
(In reply to James Alexander from comment #10)
> I think without the IP completely eliminates the concern, with the IP
> however is concerning and would want to be limited to just Checkusers (and
> related, stewards obviously count as well). 
> 
> The major use case that I've seen listed in the request is to see whether
> the filter is successful/useful (and potentially I guess if people are
> trying, and being blocked, from making legitimate accounts). Is that the
> main use case? For that use case having the IP/requester doesn't actually
> seem horribly useful and we should only reveal an IP if we have a strong use
> case for it.
> 
> Do the stewards or others believe that the IP/requester would be useful
> itself? (and what would be the primary use case for it).

It would be easy to remove the IP from the log. That would mean CUs could not find it if they need it, however. It may be best to remove IPs from the log even if it prevents CUs from finding it, as they could CheckUser any successfully created accounts.
Comment 12 Ajraddatz 2014-06-10 22:15:03 UTC
We can already CheckUser any successfully created accounts, so no change from the status quo there. 

I would argue it is needed. If I globally blacklist a common string found in usernames being created by an LTA, being able to see the IPs he is using to try and create new accounts could allow for a proactive response - being able to block before they have even made an account to vandalize with. This is especially important if they are creating attack names. 

Even modified to not include the IP, the log would be useful to ensure that entries aren't blocking obvious good-faith names. This would also mean that sysops could view it. I'd certainly be fine with either option.
Comment 13 James Alexander 2014-06-10 22:16:58 UTC
(In reply to PiRSquared17 from comment #11)
> It would be easy to remove the IP from the log. That would mean CUs could
> not find it if they need it, however. It may be best to remove IPs from the
> log even if it prevents CUs from finding it, as they could CheckUser any
> successfully created accounts.

In the long run, if it's useful for checkusers, it is probably best going into the checkuser log database anyway (so if they CU'd the IP the attempts to create an account would come up). Without it going into the CU log table it is likely to be less useful just because it's "another" place that would need to be checked.
Comment 14 PiRSquared17 2014-06-11 00:06:30 UTC
If I85717770c9885b48f128474aad77833994714778 is merged, then the community and the legal team can choose whether to include IPs in the log.
Comment 15 PiRSquared17 2014-06-11 00:06:47 UTC
oops
Comment 16 billinghurst 2014-06-11 11:34:39 UTC
I am not certain why the privacy policy is coming into play as the priority, when the terms of use sit at an equal level. At this point in time there is nothing to indicate which person is making the edits to identify against an IP, so this could be considered no different from any normal IP edit, especially as we are not recording a username of the person editing, just the IP address and the TBL hit.

Here we are talking about accounts that are hitting the TitleBlacklist, which is bigger than account usernames, and also includes numbers of keywords. So let us then explore a little more ...

Do we chase someone down who uses a(n absolute)? vulgarity in a user name? Generally not, though we can. If we act, it is usually to block and to maybe block the IP for autoblock time.

Do we chase down someone who successfully finds a new variation of a TBL keyword? Between sometimes and probably, and we update the TBL.

Do we chase down an IP address of someone who is creating TBL-like pages with their IP address? No, we just block the IP address, and update the TBL.

So we have the situations of

Scenario A) — Limiting output to stewards and checkusers (both by default) so that they can see the IP address of someone who is 1) accidentally looking to circumvent the TBL, or 2) purposefully looking to circumvent the TBL.

In situation 1) I doubt that we will know who the person is, nor care, nor would take any action. We will not know what other accounts exist unless the IP address is reverse checked, and there is no reason to do so, and for dynamic IP addresses would be pointless.

In situation 2), as with a revealed IP address, we won't necessarily know who they are, any more than with any other set of LTAs, and it won't really matter; we are more interested in terminating the abuse. If it is a known IP address for a vandal, CUs are normally familiar with it, so nothing new anyway.

Exceptions to this may be where a person's name has been added to the GLOBAL TBL where it has been abused xwiki and added by agreement. To the point at a new wiki that a person creating a new account would have their IP exposed. This alone may be reason to limit an IP address, though such 

Scenario B) — We have no IP addresses, and limited to Stewards and Checkusers. We have nothing of value for situations 1 or 2, beyond "gee look someone is maybe abusing ... sit. 1) silly beggars; sit. 2) I hope they don't find a way around ... what a PITA. Heads up in case they do.

Scenario C) — We have no IP addresses and make it more visible to the advanced rights holders (admins +). Sit 1) Gee look! Sit 2) Gee look! Alert! (plus).  So we have more vigilance, and probably a lot more people watching the page for not a lot of reason beyond reaction time. Not necessarily a lot of value.

Scenario D) — We have IP addresses and make it more visible to the advanced rights holders (admin+). This has been addressed above as being bad as it could easily be (mis|ab)used by a poor addition. => not reasonable to have.

So to me, it would seem that if we are going to judge this by effectiveness it would be along the lines of
 A >> C > B  (not D)

So how do I see that the privacy policy does come into play?  That would be if we chose Scenario A) then we should only record that IP address temporarily, which would be up to three months (noting that the effectiveness of the IP address is probably only really good for a week to a month anyway).

Though maybe as has been indicated above, that situation C is the case that we then have the ability to have recorded and discoverable through a CU search, though I am not sure what we would have to be able to search in that space. If there is no username created, or no edit, for what are we searching? At this stage the only other means to find something is through a global filter, and at this stage I am unaware of any filter that is limited to checkusers.
Comment 17 Umherirrender 2014-06-11 21:12:09 UTC
AbuseFilter uses the new account name to show the hits on account creations in its own log. That means clicking the user name shows a user page with the hint that no user exists, but there is no problem with IPs. That sounds like the best solution to also allow sysops to see the log.

For checkuser, the generated log entry should be passed to CheckUser and then saved to that table with the IP used. Then TBL hits for logged-in users can also be found with the IP (for page creation or account creation from an existing account without bypassing the titleblacklist).
Comment 18 Luis Villa (WMF Legal) 2014-06-12 02:22:44 UTC
The privacy policy comes into play because the privacy policy always comes into play when recording IP addresses. That is non-negotiable.

This isn't to say IPs can never be recorded, but they should only be recorded when (1) there is a clearly stated and described reason for the recording and (2) the code is already written to delete or otherwise handle them after a reasonable retention period (likely 90 days but ideally shorter).

I'm not really clear if #1 is the case here; I didn't raise #2 initially because we're normally pretty good about that, but comment 16 implies otherwise?
Comment 19 Jackmcbarn 2014-06-12 02:30:54 UTC
Note that it's impossible for this to ever result in an account and IP being linked, since any creation attempts recorded in this log necessarily did not result in an account being created.
Comment 20 PiRSquared17 2014-06-12 02:42:11 UTC
I personally think just not recording the IPs at all [see comment 14] may be the best way to go, since having to remove old (90 days) entries in a log seems like something to avoid. Besides, the way the log works, one could just copy all the IPs/usernames at once before it expires. Not very good for privacy.

Another solution would be to just not enable it at all on Wikimedia (i.e. WONTFIX the bug).

(In reply to Jackmcbarn from comment #19)
> Note that it's impossible for this to ever result in an account and IP being
> linked, since any creation attempts recorded in this log necessarily did not
> result in an account being created.

That may be true in theory, but in practice it would most likely be possible to correlate the IPs with creations of accounts around the same time or with similar names.
Comment 21 billinghurst 2014-06-12 02:45:58 UTC
(In reply to Jackmcbarn from comment #19)
> Note that it's impossible for this to ever result in an account and IP being
> linked, since any creation attempts recorded in this log necessarily did not
> result in an account being created.

If you are indicating that without a reverse CU, then it would not be the case for never, just unusual and rarely.

The [[m:Titleblacklist]] has entries that have user names within them, so if one of those users went to create an account at a new wiki, they would not be able to do so, and it would log in the TBL log.

Even with some small wikis, they copy the Mediawiki:... files from enWP and utilise them at their wiki, and if they copy that TBL, and the user looks to create an account at that new wiki, we are in the same situation.

While such may possibly be remedied by oversight of the logs, I am not sure that there is an oversight capacity of the TBL log.


If you mean that with the IP address we could not identify a user, it may be anywhere between certain and impossible with checkuser, and that is due to the nature of the tool and IP addresses.
Comment 22 Jackmcbarn 2014-06-12 02:47:42 UTC
(In reply to billinghurst from comment #21)
> If you are indicating that without a reverse CU, then it would not be the
> case for never, just unusual and rarely.
> 
> The [[m:Titleblacklist]] has entries that have user names within them, so if
> one of those users went to create an account at a new wiki, they would not
> be able to do so, and it would log in the TBL log.
> 
> Even with some small wikis, they copy the Mediawiki:... files from enWP and
> utilise them at their wiki, and if they copy that TBL, and the user looks to
> create an account at that new wiki, we are in the same situation.
> 
> While such may possibly be remedied by oversight of the logs, I am not sure
> that there is an oversight capacity of the TBL log.
> 
> 
> If you mean that with the IP address we could not identify a user, it may be
> anywhere between certain and impossible with checkuser, and that is due to
> the nature of the tool and IP addresses.

Attempts to autocreate an account aren't logged, so getting an IP that way isn't possible.
Comment 23 Ajraddatz 2014-06-12 02:55:35 UTC
I think that the most useful application of this log would be with IPs, thus accessible by CheckUser only (keeping in mind Jackmcbarn's valid observation with which I agree). If we were to disallow any action which could infer IPs to usernames then we'd need to disable anonymous editing entirely. 

I don't think that it would be more useful to hide the IPs from the log but make it CheckUser-able. Far better to be able to easily see which IP is trying to create those usernames so that it can be blocked to prevent abuse. Otherwise it would be impossible to check the IP until after an account had been made, and thus we'd be getting back into reactive rather than proactive territory.

Nonpublic information-granting tools are defined as "tool[s] that permits them to view nonpublic information about other users". Jackmcbarn's point here that this would never occur, due to any accounts listed in the log not being created, is important IMO. 

That said, if this is too much of a stretch, it would still be good to be able to see the log without IPs so I think that implementing it in some form would be a positive. The TBL is largely useless with no way to see its impact.
Comment 24 billinghurst 2014-06-12 03:00:32 UTC
(In reply to PiRSquared17 from comment #20)
> I personally think just not recording the IPs at all [see comment 14] may be
> the best way to go, since having to remove old (90 days) entries in a log
> seems like something to avoid. Besides, the way the log works, one could
> just copy all the IPs/usernames at once before it expires. Not very good for
> privacy.
> 
> Another solution would be to just not enable it at all on Wikimedia (i.e.
> WONTFIX the bug).
> 
> (In reply to Jackmcbarn from comment #19)
> > Note that it's impossible for this to ever result in an account and IP being
> > linked, since any creation attempts recorded in this log necessarily did not
> > result in an account being created.
> 
> That may be true in theory, but in practice it would most likely be possible
> to correlate the IPs with creations of accounts around the same time or with
> similar names.

Going back to where this was originally lodged at https://bugzilla.wikimedia.org/show_bug.cgi?id=1542#c4

As an effective tool to _prevent_ abuse, the tool itself, the logging and the IP addresses give the value for the prevention.

Without the IP address, the modified log should be considered a reflective tool that allows you to assess the validity of the use of the tool.
Comment 25 Liangent 2014-06-12 03:02:58 UTC
(In reply to Jackmcbarn from comment #19)
> Note that it's impossible for this to ever result in an account and IP being
> linked, since any creation attempts recorded in this log necessarily did not
> result in an account being created.

There are aggressive TBL entries which block innocent user names. They request sysops to create those names afterwards. In this case users get linked to IPs.
Comment 26 billinghurst 2014-06-12 03:04:39 UTC
(In reply to Jackmcbarn from comment #22)
> Attempts to autocreate an account aren't logged, so getting an IP that way
> isn't possible.

Not sure that I fully follow, and I don't know the backend code, however, ...

Autocreates are logged, so I am not certain why someone attempting to login to zzWX would not show on the TBL. Such autocreate account creations definitely show through RC and pop in IRC feeds.
Comment 27 Jackmcbarn 2014-06-12 03:05:34 UTC
(In reply to billinghurst from comment #26)
> Not sure that I fully follow, and I don't know the backend code, however, ...
> 
> Autocreates are logged, so I am not certain why someone attempting to login
> to zzWX would not show on the TBL. Such autocreate account creations
> definitely show through RC and pop in IRC feeds.

I meant that autocreate attempts that fail due to the titleblacklist won't be logged.
Comment 28 PiRSquared17 2014-06-12 03:35:38 UTC
(In reply to Jackmcbarn from comment #27)
> I meant that autocreate attempts that fail due to the titleblacklist won't
> be logged.

I can confirm that this is correct as the author of the original code. It specifically does not log in the auto-creation hook. As far as I know, that should prevent these from being logged, but perhaps someone should test it to be sure.

(In reply to Liangent from comment #25)
> There are aggressive TBL entries which block innocent user names. They
> request sysops to create those names afterwards. In this case users get
> linked to IPs.

+1, this is a valid concern, which is another reason admins should not have this data (besides the fact that it violates the privacy policy)
Comment 29 Kunal Mehta (Legoktm) 2014-06-12 03:52:51 UTC
(In reply to PiRSquared17 from comment #28)
> (In reply to Jackmcbarn from comment #27)
> > I meant that autocreate attempts that fail due to the titleblacklist won't
> > be logged.
> 
> I can confirm that this is correct as the author of the original code. It
> specifically does not log in the auto-creation hook. As far as I know, that
> should prevent these from being logged, but perhaps someone should test it
> to be sure.

I did test that (before merging), and it worked as expected.

----

So I think the general consensus is that we *should* log the IP address and restrict to CU only? (Please correct me if I'm wrong...)

If so, are we going to need a script to remove the IP addresses after 90 days? Should it delete the entire log entry or just the IP address?
Comment 30 Luis Villa (WMF Legal) 2014-06-12 19:05:08 UTC
(In reply to PiRSquared17 from comment #28)
> +1, this is a valid concern, which is another reason admins should not have
> this data (besides the fact that it violates the privacy policy)

To be clear, we can log IP addresses under the privacy policy, and even share them publicly, if there is a good reason for it and we do appropriate clean up. I mean, for better or for worse, what we currently do with "anonymous" edits is permitted under the policy! We're just trying not to create new problems. :) 

(In reply to Kunal Mehta (Legoktm) from comment #29)
> So I think the general consensus is that we *should* log the IP address and
> restrict to CU only? (Please correct me if I'm wrong...)

I'm still not clear that I've seen a persuasive rationale that the addresses should be logged, and at least one person has said this won't get used if it isn't integrated into existing workflows. But I'm not super-familiar with the workflows here so I probably shouldn't be the final decision-maker here.

> If so, are we going to need a script to remove the IP addresses after 90
> days? Should it delete the entire log entry or just the IP address?

Addresses only is fine - those are the potentially identifying information here.
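
The "addresses only" clean-up discussed above could look roughly like the following Python sketch. It is illustrative only: the `LogEntry` shape, the `purge_old_ips` helper and the sample data are hypothetical and are not the extension's code or schema, and the 90-day period is the assumed retention figure mentioned in comment 18.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta, timezone
from typing import List, Optional

RETENTION = timedelta(days=90)  # assumed retention period from the discussion


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical shape of one blacklist-hit log row."""
    timestamp: datetime
    attempted_name: str
    rule: str
    ip: Optional[str]


def purge_old_ips(entries: List[LogEntry], now: datetime) -> List[LogEntry]:
    """Blank the IP on entries older than RETENTION; keep the rest of each entry."""
    cutoff = now - RETENTION
    return [
        replace(e, ip=None) if e.timestamp < cutoff and e.ip is not None else e
        for e in entries
    ]


# Sample data using documentation-range IP addresses.
now = datetime(2014, 11, 3, tzinfo=timezone.utc)
entries = [
    LogEntry(datetime(2014, 6, 9, 20, 28, tzinfo=timezone.utc),
             "Account name", "example rule", "192.0.2.1"),
    LogEntry(datetime(2014, 10, 1, 12, 0, tzinfo=timezone.utc),
             "Other name", "example rule", "198.51.100.7"),
]
purged = purge_old_ips(entries, now)
```

In this sketch the older entry keeps its timestamp, attempted name and rule but loses its address, while the recent entry is left untouched.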
Comment 31 Ajraddatz 2014-06-12 19:22:12 UTC
The only person who has said this wouldn't be used unless integrated with existing workflows doesn't currently serve in a role which would make use of this. As I said in a post above, there would be no benefit to integrating it with CU - we'd be back to reactive, rather than proactive action since we'd only be able to find the IP if they successfully create an account. Integrating it with CU in addition to providing the IP would be beneficial.

To make perfectly clear the rationale for this: It gives us a unique opportunity to take proactive action while allowing us to confirm that we aren't generating ridiculous false positives using the TBL. It turns the TBL from an unusable extension to something which we can use alongside the abusefilter and spamblacklist to actively prevent abuse. With this, we could stop abuse before it has even happened - especially important when dealing with attack names. Is that a sufficiently-convincing rationale?
Comment 32 James Alexander 2014-06-12 20:31:33 UTC
(In reply to Ajraddatz from comment #31)
> The only person who has said this wouldn't be used unless integrated with
> existing workflows doesn't currently serve in a role which would make use of
> this. As I said in a post above, there would be no benefit to integrating it
> with CU - we'd be back to reactive, rather than proactive action since we'd
> only be able to find the IP if they successfully create an account.
> Integrating it with CU in addition to providing the IP would be beneficial.
>

I would question your statement that I would not make use of it as someone who does quite a lot of log reading and checkusering in my job ;). 

That said there is no doubt it would be most useful to stewards and other active global users which is why I was interested in their thoughts. If they/you think it would be more actively used as a separate log that is a mark in favor of having that separate log. My concern was mostly that relatively few people would be actively using it and so having it in the CU tables would be more beneficial (having it in the CU tables also allows it to take advantage of the CU automatic self destruct). I agree it would be beneficial integrating it with CU, and I can see some usefulness for the separate log if it would be used.   
 
> To make perfectly clear the rationale for this: It gives us a unique
> opportunity to take proactive action while allowing us to confirm that we
> aren't generating ridiculous false positives using the TBL. It turns the TBL
> from an unusable extension to something which we can use alongside the
> abusefilter and spamblacklist to actively prevent abuse. With this, we could
> stop abuse before it has even happened - especially important when dealing
> with attack names. Is that a sufficiently-convincing rationale?

I think it is. To be clear (I know you didn't say this, but I think some did), this current change is ONLY for accounts; edits that trigger the TBL are not logged.

I am not yet completely convinced that there is sufficient need to display the IP to non advanced users though (I would currently think CU/Steward) though I don't have a real issue with the log itself (without IP) being shown to Sysops or others who interact with the TBL/Abusefilter. My biggest worry is the 'ridiculous false positives' part, I can see a lot of cases where I'd be able to piece together IP information on regular users in those cases (though I can certainly see a high desire to want to KNOW when those cases are happening).
Comment 33 Ajraddatz 2014-06-12 21:14:47 UTC
(In reply to James Alexander from comment #32)

> I would question your statement that I would not make use of it as someone
> who does quite a lot of log reading and checkusering in my job ;). 

I thought about that after I posted my comment. Certainly no offence meant, and sorry if it was overly confrontational or dismissive. What I know is that I would regularly use it, and I've been trying to drive that point home here :).

My comment was also a bit confusing there. I think the most benefit would come from both having the log display IPs and being integrated with CU. Simply being integrated with CU would be little change from what we have now. In the interim, I think that the log displaying IPs would be best, and then integrating with CU when possible.

> I am not yet completely convinced that there is sufficient need to display
> the IP to non advanced users though (I would currently think CU/Steward)
> though I don't have a real issue with the log itself (without IP) being
> shown to Sysops or others who interact with the TBL/Abusefilter. My biggest
> worry is the 'ridiculous false positives' part, I can see a lot of cases
> where I'd be able to piece together IP information on regular users in those
> cases (though I can certainly see a high desire to want to KNOW when those
> cases are happening).

I'd agree that CU/steward access would make the most sense. There is definitely some potential for inferencing IPs --> accounts, but no more than what already exists. Most LTAs/spambots use mobile ranges it seems these days, so any check can have the potential for false positives. Same with the ACC interface, or even just looking at the histories of articles that new users have edited. I'd argue that the potential benefits here outweigh the potential risks.
Comment 34 PiRSquared17 2014-06-13 02:11:02 UTC
I wonder if it would be possible to somehow give admins access to a redacted version of the logs, and give checkusers the ability to see the IPs.
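
One way to picture that split is the Python sketch below. It is a rough illustration under stated assumptions: the `Hit` record, the `CAN_SEE_IP` group set and `render_hit` are hypothetical names, not part of MediaWiki or the TitleBlacklist extension, and the line format only loosely follows the example entry in comment 7.

```python
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """Hypothetical record of one blocked account-creation attempt."""
    when: str
    ip: str
    attempted_name: str
    rule: str


# Assumed set of groups allowed to see the initiating IP.
CAN_SEE_IP = {"checkuser", "steward"}


def render_hit(hit: Hit, viewer_groups: Iterable[str]) -> str:
    """Render a log line, hiding the IP unless the viewer is in a privileged group."""
    actor = hit.ip if CAN_SEE_IP & set(viewer_groups) else "(IP hidden)"
    return f'{hit.when} {actor} attempted to create "{hit.attempted_name}" (rule: {hit.rule})'


hit = Hit("20:28, 9 June 2014", "192.0.2.1", "Account name", "example rule")
admin_view = render_hit(hit, ["sysop"])
cu_view = render_hit(hit, ["sysop", "checkuser"])
```

Here `admin_view` shows the attempted name and the rule without the address, and `cu_view` includes it.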
Comment 35 Umherirrender 2014-11-03 15:42:11 UTC
*** Bug 72905 has been marked as a duplicate of this bug. ***
