Last modified: 2012-08-23 19:07:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from pages that display bug reports and their history, links might be broken. See T9413, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 7413 - Robot to scan links for session strings
Status: RESOLVED WORKSFORME
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2006-09-26 02:54 UTC by Amos Shapira
Modified: 2012-08-23 19:07 UTC

Description Amos Shapira 2006-09-26 02:54:12 UTC
I've just noticed that the link from
http://en.wikipedia.org/w/index.php?title=Service-oriented_architecture&oldid=77758121#SOA_definitions
to TechEncyclopedia contains a ";jsessionid=..." part. The link works perfectly
without this random session-specific addition.

I manually removed it (and verified that the link still works), but I was
wondering whether it would be worthwhile to have a robot look for links that
contain such parts and either remove them automatically or mark them to be
checked by humans.
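
A rough sketch of what such a robot's core check might look like, in Python. This is only an illustration, not an existing bot: the names SESSION_RE, has_session_string, strip_session_string and scan_links are hypothetical, the example.com URLs are placeholders, and the pattern assumes the session string always appears in the ";jsessionid=..." form described above.

```python
import re

# Assumption: the session string looks like ";jsessionid=<token>" and the
# token ends at the next "?", "#", ";", "/" or whitespace character.
SESSION_RE = re.compile(r";jsessionid=[^?#;/\s]*", re.IGNORECASE)


def has_session_string(url: str) -> bool:
    """Return True if the URL contains a ;jsessionid=... part."""
    return SESSION_RE.search(url) is not None


def strip_session_string(url: str) -> str:
    """Return the URL with every ;jsessionid=... part removed."""
    return SESSION_RE.sub("", url)


def scan_links(urls):
    """Return (original, cleaned) pairs for links carrying a session string.

    The pairs are meant to be handed to a human for review, since the
    cleaned link still has to be checked to see that it works.
    """
    return [(u, strip_session_string(u)) for u in urls if has_session_string(u)]


if __name__ == "__main__":
    links = [
        "http://example.com/term.jsp;jsessionid=ABC123?term=SOA",
        "http://example.com/plain.html",
    ]
    for original, cleaned in scan_links(links):
        print(original, "->", cleaned)
```

Returning the pairs instead of rewriting the links directly keeps both options from the request open: automatic removal, or marking the link to be checked by humans.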
Comment 1 Chad H. 2009-12-04 18:33:02 UTC
Resolving WORKSFORME. These sorts of things could easily be added to the URL blacklist or the AbuseFilter.
Comment 2 Amos Shapira 2009-12-04 21:28:50 UTC
The point is not to block such links as malicious but to detect that the sessionid was included by mistake and remove it, since I expect this is the usual case.
Comment 3 Chad H. 2009-12-04 22:22:15 UTC
The AbuseFilter can be set to warn instead of prevent the action.

Also, MediaWiki shouldn't be deciding that session IDs aren't supposed to be a part of URIs. They are a perfectly valid part of a URI and should be parsed as such. If a community wants to write a bot to do that, they can.

Re-suggest WFM.
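
For illustration, Python's standard urllib.parse does treat such a string as an ordinary path parameter, which fits the point that it is a valid part of a URI. The sketch below only reports the parameter so that a community bot could warn about it. The helper name session_param and the example.com URL are hypothetical.

```python
from urllib.parse import urlparse


def session_param(url: str):
    """Return the value of a jsessionid path parameter, or None if absent."""
    # urlparse() exposes the text after ";" in the last path segment as .params
    params = urlparse(url).params
    for part in params.split(";"):
        name, _, value = part.partition("=")
        if name.lower() == "jsessionid":
            return value
    return None


if __name__ == "__main__":
    url = "http://example.com/term.jsp;jsessionid=ABC123?term=SOA"
    found = session_param(url)
    if found is not None:
        print("warning: link carries a session ID:", found)
```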
Comment 4 Nemo 2012-08-23 19:07:27 UTC
Yes, this is a typical use case for AbuseFilter or bot fixes.
