Last modified: 2014-09-23 19:45:21 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T26411, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 24411 - List recent User-Agents for a user or IP
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: CheckUser
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: patch, patch-reviewed
Reported: 2010-07-17 04:44 UTC by Jyothis Edathoot
Modified: 2014-09-23 19:45 UTC (History)
CC List: 13 users



Attachments
Patch fixing the bug and cleaning up the code (15.24 KB, patch)
2010-11-12 05:51 UTC, Huji
Updated patch, in accordance to recent changes in the user interface (10.12 KB, patch)
2010-11-18 18:34 UTC, Huji

Description Jyothis Edathoot 2010-07-17 04:44:42 UTC
Similar to the other options in the current CU interface, it would be good to have an option to list all User Agents that a user or IP has used in the past 90 days. This would save some time and help checkusers perform better.

At times, we end up checking through many shared IPs a user has used to find a different UA that can connect the two IDs in question. It is possible that, at a weak moment, we fail to see the connection because we did not notice both sharing that unique UA.
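
The request above amounts to a distinct-value lookup over the check log, limited to a 90-day window. A minimal sketch in Python, assuming a hypothetical in-memory list of log rows; the `LogRow` type, its field names, and `recent_agents` are invented for illustration and are not the CheckUser schema or API:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogRow:
    """Hypothetical log entry; field names are illustrative only."""
    user: str
    ip: str
    agent: str
    timestamp: datetime


def recent_agents(rows, target, now, days=90):
    """Return the distinct user agents seen for a user name or IP
    within the last `days` days, in order of first appearance."""
    cutoff = now - timedelta(days=days)
    seen = []
    for row in sorted(rows, key=lambda r: r.timestamp):
        if row.timestamp < cutoff:
            continue  # older than the retention window
        if target in (row.user, row.ip) and row.agent not in seen:
            seen.append(row.agent)
    return seen
```

Calling `recent_agents(rows, "SomeUser", datetime.utcnow())` would return each agent string once, however many shared IPs the account edited from.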
Comment 1 Huji 2010-11-12 05:51:01 UTC
Created attachment 7813 [details]
Patch fixing the bug and cleaning up the code
Comment 2 Huji 2010-11-12 05:52:38 UTC
In the patch I attached to the previous comment, I cleaned up the code (used more XML methods, etc), and fixed this bug.

The patch should be reviewed by someone with more knowledge on the indexes used in CU tables. I think the part that handles the index use is a little messy now.
Comment 3 Aaron Schulz 2010-11-12 06:53:13 UTC
The HTML stuff should be done alone as a patch against /trunk. Simple things like <td> tags should be left using raw string operations, but things with attributes, like <table>, should use HTML functions.

After that's applied, the UA patch can be put up.
Comment 4 Gurch 2010-11-12 09:49:45 UTC
More privacy invasion, lovely. Did you ever consider actually mentioning in the Foundation privacy policy that you track everyone's user-agent?
Comment 5 p858snake 2010-11-12 09:58:06 UTC
(In reply to comment #4)
> More privacy invasion, lovely. Did you ever consider actually mentioning in the
> Foundation privacy policy that you track everyone's user-agent?

We already scan them to get access to the site.
Comment 6 Gurch 2010-11-12 16:11:06 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > More privacy invasion, lovely. Did you ever consider actually mentioning in the
> > Foundation privacy policy that you track everyone's user-agent?
> 
> We already scan them to get access to the site.

Yep. Absolutely zero need to store them in order to do that, though -- storage of UAs is done solely to let checkusers loose on them. According to the privacy policy, you only store IP addresses (and you only use them to deal with abuse, rather than say to find out what editing tool someone is using, but hey at least *admitting* you store UAs would be a start).
Comment 7 Jyothis Edathoot 2010-11-12 17:14:48 UTC
Please read the CheckUser extension documentation in MediaWiki. I am not sure why you sound surprised by this. All checkusers should be using the tool per the checkuser policy. If you have specific complaints about someone misusing it, raise them on that wiki. This is probably not the best place to argue about it.

(In reply to comment #6)
> (In reply to comment #5)
> > (In reply to comment #4)
> > > More privacy invasion, lovely. Did you ever consider actually mentioning in the
> > > Foundation privacy policy that you track everyone's user-agent?
> > 
> > We already scan them to get access to the site.
> 
> Yep. Absolutely zero need to store them in order to do that, though -- storage
> of UAs is done solely to let checkusers loose on them. According to the privacy
> policy, you only store IP addresses (and you only use them to deal with abuse,
> rather than say to find out what editing tool someone is using, but hey at
> least *admitting* you store UAs would be a start).
Comment 8 Huji 2010-11-12 21:02:44 UTC
Aaron,

I will open a new bug for the HTML stuff and fix it, then I'll create a new patch against the new revision, which only deals with the current bug specifically.

Thanks for the advice,

Huji
Comment 9 Gurch 2010-11-12 21:08:36 UTC
(In reply to comment #7)
> Please read the checkuser extn documentation in mediawiki. I am not sure why
> you sound surprised with this. All checkusers should be using the tool per
> checkuser policy. if you have specific complaints about some one misusing it,
> raise it in that wiki. This is probably not the best place to argue about it. 

I'm not surprised about it. I do however disagree with the subversion of the privacy policy that implementation of this bug would contribute to.

Like the privacy policy, the CheckUser policy page on Meta also makes no mention of the fact that user-agent strings are stored.

The extension documentation does not mention it directly, but user-agent data is visible in the example screenshots.

I'm pretty sure having the information buried in a technical documentation page on another website doesn't fulfill the requirement of the privacy policy to state what data is being retained and why.
Comment 10 Aaron Schulz 2010-11-12 21:39:42 UTC
(In reply to comment #8)
> Aaron,
> 
> I will open a new bug for the HTML stuff and fix it, then I'll create a new
> patch against the new revision, which only deals with the current bug
> specifically.
> 
> Thanks for the advice,
> 
> Huji

Note that I recently changed the UI slightly. Update SVN first :)
Comment 11 Antoine "hashar" Musso (WMF) 2010-11-12 23:02:34 UTC
Gurch > please discuss the policy issue on meta or on wikimedia-l. This bug report is just about implementing the feature in the MediaWiki software.

Aaron > once reviewed, can you commit the patch so it gets a larger audience? Thanks :)
Comment 12 Huji 2010-11-18 17:49:00 UTC
r76949 uses XML instead of hard-coded HTML in the user interface.

I will send a separate patch as soon as I can, which adds the UA feature.
Comment 13 Huji 2010-11-18 18:34:23 UTC
Created attachment 7829 [details]
Updated patch, in accordance to recent changes in the user interface

The new patch is based on the most recent version of the repository (and is therefore compatible with the recent UI changes). It follows the newly introduced UI modification of not having two separate radio buttons for IPs and Users when the action is basically the same.

Before getting committed, the patch should be reviewed in terms of its use of DB indexes.
Comment 14 Aaron Schulz 2010-11-18 19:27:22 UTC
Why do we need user-agents for an IP or IP range? "get users" already does that to an extent (and grouped by user).
Comment 15 Huji 2010-11-18 23:18:38 UTC
The whole point of adding this feature is to facilitate finding a UA without having to go through a long list of possibly duplicated UAs. Assume an IP range includes a very large number of users, and many of them have overlapping user-agents. When someone is only looking for specific user-agents, the tedious task of manually going through all these records could change into a rapid check of the list of unique user-agents used by the whole range.
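
The range case described in this comment can be sketched as a set of distinct agents over every address inside a CIDR block. This is only an illustration in Python using the standard `ipaddress` module; the `(ip, agent)` row shape and the function name `agents_in_range` are assumptions, not part of the patch:

```python
import ipaddress


def agents_in_range(rows, cidr):
    """Return the sorted list of distinct agents used from any address
    inside `cidr`. `rows` is an iterable of (ip, agent) string pairs."""
    network = ipaddress.ip_network(cidr)
    return sorted({agent for ip, agent in rows
                   if ipaddress.ip_address(ip) in network})
```

A checkuser looking for one specific agent would then scan this short list instead of every log row for the range.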
Comment 16 Aaron Schulz 2010-11-18 23:19:55 UTC
Yes, but why are they looking at all agents for an IP range?
Comment 17 Huji 2010-11-19 05:24:35 UTC
To "rule out" the possibility that someone with the user-agent of interest is actually editing from that range.
Comment 18 Gurch 2010-11-20 21:11:35 UTC
(In reply to comment #16)
> Yes, but why are they looking at all agents for an IP range?

because privacy invasion is fun!
Comment 19 Huji 2010-11-21 20:34:41 UTC
Dear Gurch,

You really need to find a better place to express your thoughts. Right here, we are only talking about the technical aspects of a software extension. How this extension is used on specific websites (an example is English Wikipedia) is not relevant here.

Wikimedia admins might decide not to allow using specific features of this extension on their website, as others might want to. Wikimedia owners might also want to explain the usage of user information in their privacy disclaimers in the way they prefer. The same applies to other wikis on the web. If you have a problem with them, talk with them! Don't continue to use this software bug tracker to bug the developers instead; I personally find that very insulting.

Wish you luck,

Huji
Comment 20 Huji 2010-11-25 22:01:22 UTC
Aaron,

Any comments? I'd rather be finished with one bug before proceeding to another one.
Comment 21 Aaron Schulz 2010-11-25 23:01:20 UTC
I have limited time to look at this. But, for a patch:
(i) All agent results should deal with duplicate agent strings and the time bounds for each (of the 5000 checked).
(ii) I would remove the "get agents for IP" or at least display the results per each account (or IP for non-logged in edits). Doing it on a range gives hard-to-use results.
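
One possible reading of points (i) and (ii), sketched in Python: cap the scan at 5000 rows, group by account (or by IP for logged-out edits), and keep a count plus first-seen and last-seen bounds for each agent string. The row shape and the name `summarize_agents` are hypothetical, and timestamps are assumed to be sortable strings or numbers:

```python
from itertools import islice


def summarize_agents(rows, limit=5000):
    """Group at most `limit` rows by (account, agent).

    `rows` is an iterable of (account, agent, timestamp) tuples, where
    `account` is a user name or, for logged-out edits, the IP.
    Returns {account: {agent: (count, first_seen, last_seen)}}.
    """
    summary = {}
    for account, agent, ts in islice(rows, limit):
        per_account = summary.setdefault(account, {})
        if agent in per_account:
            count, first, last = per_account[agent]
            per_account[agent] = (count + 1, min(first, ts), max(last, ts))
        else:
            per_account[agent] = (1, ts, ts)
    return summary
```

Each duplicate agent string then appears once per account, with the time span over which it was seen.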
Comment 22 MZMcBride 2010-11-25 23:51:34 UTC
(In reply to comment #19)
> You really need to find a better place to express your thoughts. Right here, we
> are only talking about the technical aspects of a software extension. How this
> extension is used on specific websites (an example is English Wikipedia) is not
> relevant here.
> 
> Wikimedia admins might decide not to allow using specific features of this
> extension on their website, as other might want to. Wikimedia owners might also
> want to explain the usage of user information in their privacy disclaimers in
> the way they prefer. Same applies to other wikis on the web. If you have a
> problem with them, talk with them! Don't continue to use this software bug
> tracker to bug the developers instead; I personally find that very insulting.

Is this new feature going to be wrapped in a configuration variable? If so, what will the default be? There are legitimate questions to be raised about the Wikimedia privacy policy when development is being done to an extension already installed on Wikimedia's wikis.
Comment 23 Huji 2010-11-26 06:16:37 UTC
(In reply to comment #21)
> I have limited time to look at this. But, for a patch:
> (i) All agent results should deal with duplicate agent strings and the time
> bounds for each (of the 5000 checked).
> (ii) I would remove the "get agents for IP" or at least display the results per
> each account (or IP for non-logged in edits). Doing it on a range gives
> hard-to-use results.

I'm not sure I can understand the first part; I agree with the second part.

(In reply to comment #22)
> Is this new feature going to be wrapped in a configuration variable? If so,
> what will the default be? There are legitimate questions to be raised about the
> Wikimedia privacy policy when development is being done to an extension already
> installed on Wikimedia's wikis.

I think it's going to end up as a "turned-on-by-default" feature, but Wikimedia people are going to be informed about this, so they could make sure it complies with the privacy policy (or otherwise turn this feature off).
Comment 24 Huji 2011-01-26 21:27:16 UTC
Any progress here? Has anybody had the time to check the patch?
Comment 25 Aaron Schulz 2011-01-27 00:27:10 UTC
(In reply to comment #23)
> (In reply to comment #21)
> > I have limited time to look at this. But, for a patch:
> > (i) All agent results should deal with duplicate agent strings and the time
> > bounds for each (of the 5000 checked).
> > (ii) I would remove the "get agents for IP" or at least display the results per
> > each account (or IP for non-logged in edits). Doing it on a range gives
> > hard-to-use results.
> 
> I'm not sure I can understand the first part; I agree with the second part.
Identical agent strings should be grouped together or consecutive ones collapsed or something.
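
The "consecutive ones collapsed" alternative mentioned here could look like the following Python sketch, which assumes the agent strings arrive as a list already ordered by time; `collapse_consecutive` is a name made up for this example:

```python
from itertools import groupby


def collapse_consecutive(agents):
    """Collapse runs of identical adjacent agent strings into
    (agent, run_length) pairs, preserving the original order."""
    return [(agent, sum(1 for _ in run)) for agent, run in groupby(agents)]
```

Unlike full grouping, the same agent string can still appear more than once in the output if other agents were used in between, which keeps the chronology visible.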
Comment 26 Sumana Harihareswara 2012-02-16 23:47:36 UTC
Huji, thanks for the patch.  I'm marking it "reviewed" -- if you have time, please revise it to group together or collapse identical agent strings, as Aaron requested.  Thanks!
