Last modified: 2011-08-19 20:45:47 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and preserved for historical purposes. Logging in is no longer possible, and apart from displaying bug reports and their history, links may be broken. See T13432, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 11432 - robots.txt for no.wikipedia.org
Status: RESOLVED INVALID
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
unspecified
All All
: Lowest enhancement (vote)
: ---
Assigned To: Nobody - You can work on this!
http://no.wikipedia.org/robots.txt
: shell
Depends on:
Blocks: robots.txt
Reported: 2007-09-23 21:42 UTC by Bård Dahlmo
Modified: 2011-08-19 20:45 UTC (History)
3 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Bård Dahlmo 2007-09-23 21:42:16 UTC
Please add the following to robots.txt for nowiki:

Disallow: /wiki/Bruker
Disallow: /wiki/Brukerdiskusjon
Disallow: /wiki/Wikipedia:Administratorer
Disallow: /wiki/Wikipedia-diskusjon:Administratorer
Disallow: /wiki/Wikipedia:Sletting
Disallow: /wiki/Wikipedia-diskusjon:Sletting
Disallow: /wiki/Spesial

See http://no.wikipedia.org/wiki/Wikipedia:Tinget#S.C3.B8kemotorer_beh.C3.B8ver_ikke_se_alle_sider for discussion.
Comment 1 JeLuF 2007-09-24 17:56:55 UTC
Added:

# 11432
Disallow: /wiki/Bruker:
Disallow: /wiki/Bruker%3A
Disallow: /wiki/Brukerdiskusjon
Disallow: /wiki/Wikipedia:Administratorer
Disallow: /wiki/Wikipedia%3AAdministratorer
Disallow: /wiki/Wikipedia-diskusjon:Administratorer
Disallow: /wiki/Wikipedia-diskusjon%3AAdministratorer
Disallow: /wiki/Wikipedia:Sletting
Disallow: /wiki/Wikipedia%3ASletting
Disallow: /wiki/Wikipedia-diskusjon:Sletting
Disallow: /wiki/Wikipedia-diskusjon%3ASletting
Disallow: /wiki/Spesial:
Disallow: /wiki/Spesial%3A
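The effect of prefix rules like these can be checked with Python's standard urllib.robotparser. A minimal sketch, using a trimmed subset of the rules above (the article title "Norge" is just an illustrative example):

```python
# Sketch: verify which no.wikipedia.org paths rules like the above would
# block, using Python's standard robots.txt parser.
import urllib.robotparser

ROBOTS_TXT = """\
User-agent: *
Disallow: /wiki/Bruker:
Disallow: /wiki/Bruker%3A
Disallow: /wiki/Spesial:
Disallow: /wiki/Spesial%3A
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

base = "http://no.wikipedia.org"
# A user page is blocked, in both literal and percent-encoded form;
# an ordinary article is not.
print(rp.can_fetch("*", base + "/wiki/Bruker:Example"))     # False
print(rp.can_fetch("*", base + "/wiki/Bruker%3AExample"))   # False
print(rp.can_fetch("*", base + "/wiki/Norge"))              # True
```

Note that urllib.robotparser percent-encodes both the rule paths and the checked URL before comparing, which is why the colon and %3A variants behave identically here; real crawlers differ in how strictly they normalize, which is presumably why both spellings were added.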

Comment 2 John Erling Blad 2008-06-08 23:42:26 UTC
The previous rules will not always work, and the snippet below should be somewhat better. It will not completely solve the problem of search engines failing to find articles without parents, but it adds a workaround for some of the special pages.

Disallow: /wiki/Spesial:S%C3%B8k
Disallow: /wiki/Spesial%3AS%C3%B8k
Disallow: /wiki/Special:S%C3%B8k
Disallow: /wiki/Special%3AS%C3%B8k
Disallow: /wiki/Spesial:Tilfeldig_side
Disallow: /wiki/Spesial%3ATilfeldig_side
Disallow: /wiki/Special:Tilfeldig_side
Disallow: /wiki/Special%3ATilfeldig_side
Disallow: /wiki/Bruker:
Disallow: /wiki/Bruker%3A
Disallow: /wiki/Brukerdiskusjon:
Disallow: /wiki/Brukerdiskusjon%3A
Disallow: /wiki/User:
Disallow: /wiki/User%3A
Disallow: /wiki/User_talk:
Disallow: /wiki/User_talk%3A
Disallow: /wiki/WP:A
Disallow: /wiki/WP%3AA
Disallow: /wiki/Wikipedia:Administratorer
Disallow: /wiki/Wikipedia%3AAdministratorer
Disallow: /wiki/Wikipedia-diskusjon:Administratorer
Disallow: /wiki/Wikipedia-diskusjon%3AAdministratorer
Disallow: /wiki/Wikipedia_talk:Administratorer
Disallow: /wiki/Wikipedia_talk%3AAdministratorer
Disallow: /wiki/WP:S
Disallow: /wiki/WP%3AS
Disallow: /wiki/Wikipedia:Sletting
Disallow: /wiki/Wikipedia%3ASletting
Disallow: /wiki/Wikipedia-diskusjon:Sletting
Disallow: /wiki/Wikipedia-diskusjon%3ASletting
Disallow: /wiki/Wikipedia_talk:Sletting
Disallow: /wiki/Wikipedia_talk%3ASletting

This adds English namespace names in combination with the Norwegian ones, covers shortcuts that otherwise bypass the robots exclusion rules, and gives explicit names for the search page. A few special pages that currently use $wgOut->setRobotPolicy( 'noindex,nofollow' ) should instead use $wgOut->setRobotPolicy( 'noindex' ). This should be done at least for Special:Newpages, and that page should be set up to list a longer run of articles when it is hit by a search engine; alternatively, a special page optimized as a search-engine index could be created. For smaller projects it is probably sufficient to keep the pages as they are, while medium-sized projects could modify Special:Newpages. Very large projects might create so many new pages between crawler visits that a dedicated solution becomes necessary.

It should be fairly safe to let crawlers follow links from these pages while not allowing indexers to index the pages themselves. Many crawlers will then mark such pages as especially interesting and check back regularly.
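The distinction drawn above can be sketched as follows. This is an illustrative toy, not MediaWiki's actual API (MediaWiki itself sets the policy via $wgOut->setRobotPolicy()); the helper name and the page list are hypothetical, and "Spesial:Nye_sider" is assumed to be the Norwegian title of Special:Newpages:

```python
# Illustrative sketch of the two robots meta policies discussed above.
# The helper and page set are hypothetical, not MediaWiki's real API.

# Pages whose content should stay out of search results but whose links
# crawlers may still follow, so orphaned articles can be discovered.
FOLLOW_ONLY_PAGES = {"Spesial:Nye_sider"}

def robot_policy(page_title: str) -> str:
    """Return the robots meta 'content' value for a special page."""
    if page_title in FOLLOW_ONLY_PAGES:
        # 'noindex' alone: do not index this page, but do follow its links.
        return "noindex"
    # 'noindex,nofollow': neither index the page nor follow its links.
    return "noindex,nofollow"

print(robot_policy("Spesial:Nye_sider"))  # noindex
print(robot_policy("Spesial:Eksport"))    # noindex,nofollow
```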
Comment 3 John Erling Blad 2008-06-08 23:57:28 UTC
The segment

Disallow: /wiki/User:
Disallow: /wiki/User%3A
Disallow: /wiki/User_talk:
Disallow: /wiki/User_talk%3A

should be skipped, as the robots.txt file is common to all projects.
It should nevertheless be checked whether this can be fixed somehow.
Comment 4 Bård Dahlmo 2008-06-09 05:34:17 UTC
Please hold any changes till proper community support is reached.
Comment 5 JeLuF 2008-07-04 19:30:24 UTC
Please reopen when community support is reached.
Comment 6 Max Semenik 2011-08-19 20:45:47 UTC
Changing robots.txt is now possible locally, by editing [[MediaWiki:Robots.txt]].
