Wikimedia Bugzilla is closed!

Wikimedia has migrated from Bugzilla to Phabricator. Bug reports should be created and updated in Wikimedia Phabricator instead. Please create an account in Phabricator and add your Bugzilla email address to it.
Wikimedia Bugzilla is read-only. If you try to edit or create any bug report in Bugzilla you will be shown an intentional error message.
In order to access the Phabricator task corresponding to a Bugzilla report, just remove "static-" from its URL.
You can still run searches in Bugzilla or access your list of votes, but bug reports in Bugzilla will no longer be kept up to date.
Bug 12841 - Add Special:Whatlinkshere and Special:Recentchangeslinked to robots.txt
Status: RESOLVED WONTFIX
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All
OS: All
Importance: Lowest enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Depends on:
Blocks:
Reported: 2008-01-30 14:16 UTC by Thomas Bleher
Modified: 2011-03-13 18:05 UTC
CC: 2 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Thomas Bleher 2008-01-30 14:16:24 UTC
Looking through the access logs of my local wiki, I noticed that Special:Whatlinkshere and Special:Recentchangeslinked (including all their subpages) were being downloaded by spiders. The pages already carry the "noindex" directive, so the spiders don't keep the content, but the server still has to generate them.

I propose adding these pages as disallowed entries to the robots.txt file, to reduce server load and bandwidth.
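A minimal sketch of what such entries could look like, assuming the default /wiki/ article path and /w/index.php script path, and the page titles as written above (actual links may use other capitalizations or localized aliases):

    # Sketch only: paths and titles are assumptions based on a default setup.
    # Prefix matching also covers subpages such as Special:Whatlinkshere/Some_page.
    User-agent: *
    Disallow: /wiki/Special:Whatlinkshere
    Disallow: /wiki/Special:Recentchangeslinked
    Disallow: /w/index.php?title=Special:Whatlinkshere
    Disallow: /w/index.php?title=Special:Recentchangeslinked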
Comment 1 Brion Vibber 2008-02-02 00:41:35 UTC
That'd be a matter for your own robots.txt.
Comment 2 Thomas Bleher 2008-02-02 09:54:17 UTC
I know that I can add these to my robots.txt, and have in fact already done so. What prompted me to write this bug was that I noticed this issue and looked through de.wikipedia.org/robots.txt to see if these pages were already excluded there. As far as I can see, they aren't, so spiders will needlessly download these files from Wikipedia, causing unnecessary server load (the spiders download them just to throw them away immediately afterwards).

So this bug was a request specifically for Wikipedia, not for the MediaWiki software.
Of course, I may have missed something essential, in which case I'm sorry and you can re-close the bug.
Comment 3 JeLuF 2008-02-18 18:55:42 UTC
Adding hundreds of localized robots.txt entries would make the file enormous, eating up the savings.

Additionally, most spiders have faster ways to index Wikipedia than accessing our special pages.
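To illustrate the concern: each language edition links to these pages under its own localized title, so a complete set of rules would need one line per alias, per page, per language, along these lines (the localized titles below are assumptions for illustration, not taken from any actual robots.txt):

    # Illustrative only; localized aliases differ per wiki.
    Disallow: /wiki/Special:Whatlinkshere
    Disallow: /wiki/Spezial:Linkliste
    Disallow: /wiki/Sp%C3%A9cial:Pages_li%C3%A9es
    # ... and so on, for both special pages, across all supported languages.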
