Last modified: 2008-05-13 22:43:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are now handled in Wikimedia Phabricator.
This static website is read-only and preserved for historical purposes. It is not possible to log in, and beyond displaying bug reports and their history, links may be broken. See T15398, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 13398 - Add bot generated spam reports on enwiki to robots.txt
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown (Other open bugs)
Version: unspecified
Hardware: All
OS: All
Importance: Normal enhancement (vote)
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/robots.txt
Keywords: shell
Duplicates: 13529 (view as bug list)
Depends on:
Blocks:
Reported: 2008-03-16 23:46 UTC by Alex Z.
Modified: 2008-05-13 22:43 UTC (History)
CC: 3 users
See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Alex Z. 2008-03-16 23:46:15 UTC
There have been numerous complaints to OTRS about enwiki's bot-generated spam reports[1] showing up high in search results and associating the site with spam, even though that isn't always the case. Many things will result in a report being generated: an account name that is "similar" to a domain, an IP address "close" to the domain adding the link, and there's little differentiation between someone who adds one link and someone who adds 100.

Adding the following to robots.txt, in the section that disallows all user-agents, should fix this:
Disallow: /wiki/Wikipedia:WikiProject_Spam/LinkReports/
Disallow: /wiki/Wikipedia%3AWikiProject_Spam/LinkReports/

[1] http://en.wikipedia.org/w/index.php?title=Special%3APrefixIndex&from=WikiProject_Spam%2FLinkReports&namespace=4
Comment 1 Alex Z. 2008-03-27 01:55:24 UTC
*** Bug 13529 has been marked as a duplicate of this bug. ***
Comment 2 Ral315 2008-03-27 02:23:40 UTC
I would expand it to instead include the following:

Disallow: /wiki/Wikipedia:WikiProject_Spam/
Disallow: /wiki/Wikipedia%3AWikiProject_Spam/

That way, it includes everything under the main page, including a few pages that I think wouldn't otherwise be covered by the robots.txt file, but might need to be.
Comment 3 Betacommand 2008-04-14 20:03:32 UTC
Basic subpages should not be ignored. The pages that cause the problems all start with Wikipedia:WikiProject Spam/Link, so Wikipedia:WikiProject Spam/Link* and Wikipedia talk:WikiProject Spam/Link* should be added.
Comment 4 Brion Vibber 2008-05-13 22:43:20 UTC
Added:

# https://bugzilla.wikimedia.org/show_bug.cgi?id=13398
Disallow: /wiki/Wikipedia:WikiProject_Spam/
Disallow: /wiki/Wikipedia%3AWikiProject_Spam/
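The effect of the added directives can be sanity-checked with Python's standard urllib.robotparser (a sketch, not part of the original fix; the test URLs are illustrative):

```python
from urllib.robotparser import RobotFileParser

# The robots.txt fragment added in this fix, placed under a
# hypothetical catch-all user-agent section for testing.
rules = """\
User-agent: *
Disallow: /wiki/Wikipedia:WikiProject_Spam/
Disallow: /wiki/Wikipedia%3AWikiProject_Spam/
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Spam link report subpages are excluded from crawling...
print(rp.can_fetch("*", "http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Spam/LinkReports/example.com"))  # False
# ...while ordinary pages remain crawlable.
print(rp.can_fetch("*", "http://en.wikipedia.org/wiki/Main_Page"))  # True
```

Note that robotparser percent-encodes both the rule paths and the checked URL before comparing, which is why the `:` and `%3A` spellings of the prefix end up matching the same pages.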


