Last modified: 2011-01-25 00:19:07 UTC
Please add the following lines to robots.txt for pl.wiktionary:

Disallow: /wiki/Wikisłownik:Strony_do_skasowania
Disallow: /wiki/Wikis%C5%82ownik:Strony_do_skasowania
Disallow: /wiki/Wikis%C5%82ownik%3AStrony_do_skasowania
Disallow: /wiki/Wikisłownik:Bar
Disallow: /wiki/Wikis%C5%82ownik:Bar
Disallow: /wiki/Wikis%C5%82ownik%3ABar
Disallow: /wiki/Wikisłownik:Bar/
Disallow: /wiki/Wikis%C5%82ownik:Bar/
Disallow: /wiki/Wikis%C5%82ownik%3ABar/
Disallow: /wiki/Wikisłownik:Tablica ogłoszeń
Disallow: /wiki/Wikis%C5%82ownik:Tablica_og%C5%82osze%C5%84
Disallow: /wiki/Wikis%C5%82ownik%3ATablica_og%C5%82osze%C5%84
Disallow: /wiki/Wikisłownik:Tablica ogłoszeń/
Disallow: /wiki/Wikis%C5%82ownik:Tablica_og%C5%82osze%C5%84/
Disallow: /wiki/Wikis%C5%82ownik%3ATablica_og%C5%82osze%C5%84/
As far as I know, you should configure this in your local MediaWiki:Robots.txt per bug 15601. If that is true, this should be closed as INVALID.
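If the merge works as intended, whatever is saved on the local [[MediaWiki:robots.txt]] page should simply be appended to the end of the file served at /robots.txt, roughly like this (a sketch; the separator comment is illustrative, not part of the actual output):

  # ... global Wikimedia rules above ...

  # Appended from this wiki's [[MediaWiki:robots.txt]] page:
  Disallow: /wiki/Wikisłownik:Strony_do_skasowania
  Disallow: /wiki/Wikisłownik:Bar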
I have modified http://pl.wiktionary.org/wiki/MediaWiki:Robots.txt, but since it had no effect, I am reporting it here.
Changed the summary to match the problem description. The issue is not adding the pl.wiktionary pages to the generic robots.txt; rather, this is a bug report about the robots.txt merge functionality.
(In reply to comment #2)
> I have modified http://pl.wiktionary.org/wiki/MediaWiki:Robots.txt, but since
> it had no effect, I am reporting it here.

I just checked for Meta, and it seems to have no effect there either. I doubt this should still be in Site requests, since there seems to be a real bug in this feature (though adding things to the global robots.txt until it is fixed would be a workaround, I think).
Case matters. You have to edit MediaWiki:robots.txt, not MediaWiki:Robots.txt.
So why do, for example, MediaWiki:Common.js and MediaWiki:Deletereason-dropdown (not MediaWiki:common.js or MediaWiki:deletereason-dropdown) work? MediaWiki:robots.txt does not appear in Special:AllMessages, so how would you find out the correct spelling?
robots.txt handling is not a core MediaWiki feature, so there is no default message for it, and thus the page is not listed in Special:AllMessages before it has been created.

http://pl.wiktionary.org/w/index.php?title=MediaWiki:Common.js
http://pl.wiktionary.org/w/index.php?title=MediaWiki:common.js

are different pages. One works, the other doesn't. Wiktionaries are case sensitive.
(In reply to comment #7)
> robots.txt handling is not a core MediaWiki feature, so there is no default
> message for it, and thus the page is not listed in Special:AllMessages before
> it has been created.

robots.txt is defined in the WMF-specific extension WikimediaMessages, and therefore it is shown in Special:AllMessages. But that is not the point.

Regardless of this, it is impossible to create a message [[MediaWiki:robots.txt]] on wikis with $wgCapitalLinks = true;, which is the default for almost all WMF wikis, with the exception of the Wiktionaries.

Please try to create [[de:MediaWiki:robots.txt]]. It switches immediately to [[de:MediaWiki:Robots.txt]].
(In reply to comment #8)
> robots.txt is defined in the WMF-specific extension WikimediaMessages, and
> therefore it is shown in Special:AllMessages. But that is not the point.
>
> Regardless of this, it is impossible to create a message
> [[MediaWiki:robots.txt]] on wikis with $wgCapitalLinks = true;, which is the
> default for almost all WMF wikis, with the exception of the Wiktionaries.
>
> Please try to create [[de:MediaWiki:robots.txt]]. It switches immediately to
> [[de:MediaWiki:Robots.txt]].

The point here is that the issue was reported on plwiktionary, which has $wgCapitalLinks = false;. On wikis with $wgCapitalLinks = true;, editing [[MediaWiki:Robots.txt]] will work.
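For reference, this is the setting in question as it would appear in a plain MediaWiki install's LocalSettings.php (a sketch; Wikimedia sets this centrally per wiki, not like this):

  <?php
  // With $wgCapitalLinks = true (the MediaWiki default), the first
  // letter of every page title is forced to upper case, so
  // "MediaWiki:robots.txt" is silently normalized to
  // "MediaWiki:Robots.txt". The Wiktionaries disable it, which is
  // why the lower-case title exists on plwiktionary:
  $wgCapitalLinks = false;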
(In reply to comment #9)
> The point here is that the issue was reported on plwiktionary, which has
> $wgCapitalLinks = false;. On wikis with $wgCapitalLinks = true;, editing
> [[MediaWiki:Robots.txt]] will work.

Meta, which has $wgCapitalLinks = true;, uses MediaWiki:Robots.txt, yet it doesn't seem to work there. Pages which use __NOINDEX__ do get <meta name="robots" content="noindex,follow" />, but pages which should have it because of MediaWiki:Robots.txt do not.
That's not how it works. There are two ways of blocking spider access to pages:

When a spider first visits a site, it looks for a file called "robots.txt" in the root of the site and follows the rules there to exclude certain tranches of pages. When it visits each individual page, it looks for the "robots" meta tag and, if one is present and tells it to go away, it does so and 'forgets' that it was ever on the page.

Modifying [[MediaWiki:Robots.txt]] appends entries to the site /robots.txt file (or is supposed to, anyway); it doesn't affect meta tags on pages.
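For concreteness, a minimal sketch of the two mechanisms (the Disallow path below is an illustrative example, not one of the entries requested in this bug). Site-wide exclusion lives in the single file fetched from the site root:

  User-agent: *
  Disallow: /wiki/Special:

Per-page exclusion is a tag in each page's HTML head, which is what __NOINDEX__ emits:

  <meta name="robots" content="noindex,follow" />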
(In reply to comment #4)
> I just checked for Meta, and it seems to have no effect there either. I doubt
> this should still be in Site requests, since there seems to be a real bug in
> this feature (though adding things to the global robots.txt until it is fixed
> would be a workaround, I think).

http://es.wikipedia.org/robots.txt does work (look at the bottom). Is it only working for Wikipedias?
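A quick way to check any wiki is to fetch its served /robots.txt and look for a known local entry. A throwaway sketch (the hostnames and search strings are just examples, not part of any official tooling):

  <?php
  // Report whether a given entry (e.g. one added via the wiki's
  // [[MediaWiki:robots.txt]] page) appears in the served robots.txt.
  // Requires allow_url_fopen to be enabled.
  function hasRobotsEntry( $host, $needle ) {
      $body = file_get_contents( "http://$host/robots.txt" );
      return $body !== false && strpos( $body, $needle ) !== false;
  }

  var_dump( hasRobotsEntry( 'es.wikipedia.org', 'Disallow:' ) );
  var_dump( hasRobotsEntry( 'pl.wiktionary.org', 'Strony_do_skasowania' ) );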
At this moment I can see the content of http://pl.wiktionary.org/wiki/MediaWiki:robots.txt at the end of http://pl.wiktionary.org/robots.txt, so I am marking this as INVALID and changing the summary to reflect the solution.