Last modified: 2011-03-13 18:05:56 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in, and links other than those displaying bug reports and their history may be broken. See T14841, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 12841 - Add Special:Whatlinkshere and Special:Recentchangeslinked to robots.txt
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All
OS: All
Importance: Lowest enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Depends on:
Blocks:
Reported: 2008-01-30 14:16 UTC by Thomas Bleher
Modified: 2011-03-13 18:05 UTC
CC List: 2 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Thomas Bleher 2008-01-30 14:16:24 UTC
Looking through the access logs of my local wiki, I noticed that Special:Whatlinkshere and Special:Recentchangeslinked (including all their subpages) were downloaded by spiders. The pages already carry the "noindex" attribute, so the spiders don't store the information, but the server still has to generate them.

I propose adding these pages as disallowed entries to the robots.txt file, to reduce server load and bandwidth.
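
A minimal sketch of such entries, assuming the wiki serves pages under an article path of /wiki/ (as Wikipedia does) and that the capitalization matches the form the wiki emits in its links:

# Block the two special pages and all of their subpages for every crawler.
# The /wiki/ prefix is an assumption; adjust it to the wiki's actual article path.
User-agent: *
Disallow: /wiki/Special:WhatLinksHere
Disallow: /wiki/Special:RecentChangesLinked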
Comment 1 Brion Vibber 2008-02-02 00:41:35 UTC
That'd be a matter for your own robots.txt.
Comment 2 Thomas Bleher 2008-02-02 09:54:17 UTC
I know that I can add these entries to my own robots.txt, and have in fact already done so. What prompted me to file this bug was noticing the issue and then checking de.wikipedia.org/robots.txt to see whether these pages were already excluded there. As far as I can see they aren't, so spiders will needlessly download these pages from Wikipedia, causing unnecessary server load (the spiders fetch them just to throw them away immediately afterwards).

So this bug was a request specifically for Wikipedia, not for the MediaWiki software.
Of course, I may have missed something essential, in which case I'm sorry and you can re-close the bug.
Comment 3 JeLuF 2008-02-18 18:55:42 UTC
Adding hundreds of localized entries would bloat robots.txt enormously, eating up the savings.

Additionally, most spiders have faster ways to index Wikipedia than accessing our special pages.
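
To illustrate why the list grows: every language wiki exposes localized aliases for these special pages, so blocking them site-wide would mean one pair of entries per language. A hedged sketch for de.wikipedia.org, assuming Spezial:Linkliste is the local alias of Special:Whatlinkshere (the alias name here is an assumption):

# Sketch only: alias names would need to be verified for each wiki;
# repeating such pairs for every language is what inflates robots.txt.
User-agent: *
Disallow: /wiki/Spezial:Linkliste
# ...plus a matching entry for the local alias of Special:Recentchangeslinked.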
