Last modified: 2010-05-15 15:41:15 UTC

Bug 8409 - add nofollow to links TO $wgNamespaceRobotPolicies items, not just ON
Status: RESOLVED INVALID
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.7.x
Hardware: PC Linux
Importance: Low minor
Assigned To: Nobody - You can work on this!
Reported: 2006-12-28 19:06 UTC by Dan Jacobson
Modified: 2010-05-15 15:41 UTC

Description Dan Jacobson 2006-12-28 19:06:26 UTC
Go ahead, in LocalSettings.php put
$wgNamespaceRobotPolicies = array(NS_WHATEVER => 'noindex,nofollow');

And then edit some [[Test page]] where you put a link to
[[Whatever:Something]]. And make sure Whatever:Something exists.

Now look at the rendering of [[Test page]].

Does the link to Whatever:Something have a nofollow added? No.
Should it? Yes.
Does Whatever:Something have
   <meta name="robots" content="noindex,nofollow" />
in its header? Yes.
Is that good enough? Perhaps, but why cause the search engine an extra
GET in the first place, only to laugh in its face, "Ha ha ha, fooled
ya, no food here!"?
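
A rough sketch of the behavior being asked for, in plain PHP with no
MediaWiki calls. The helper name relForLinkTarget and the value 100 for
NS_WHATEVER are made up for illustration, and treating "policy contains
nofollow" as the trigger is an assumption about the request; this is not
how the parser actually builds links.

```php
<?php
// Hypothetical stand-in for a custom namespace constant; 100 is an assumed value.
define( 'NS_WHATEVER', 100 );

// Same shape as the LocalSettings.php setting quoted above.
$wgNamespaceRobotPolicies = array( NS_WHATEVER => 'noindex,nofollow' );

/**
 * Hypothetical helper (not a MediaWiki function): return 'nofollow' when the
 * robot policy of the link target's namespace contains "nofollow",
 * otherwise return an empty string.
 */
function relForLinkTarget( $targetNamespace, array $policies ) {
	if ( !isset( $policies[$targetNamespace] ) ) {
		return '';
	}
	// Split "noindex,nofollow" into individual lowercase directives.
	$directives = array_map( 'trim', explode( ',', strtolower( $policies[$targetNamespace] ) ) );
	return in_array( 'nofollow', $directives, true ) ? 'nofollow' : '';
}

// Render a link from [[Test page]] to [[Whatever:Something]].
$rel = relForLinkTarget( NS_WHATEVER, $wgNamespaceRobotPolicies );
$attr = $rel === '' ? '' : ' rel="' . $rel . '"';
echo '<a href="/wiki/Whatever:Something"' . $attr . '>Whatever:Something</a>', "\n";
```

With the setting above, the link to Whatever:Something would carry
rel="nofollow", while links into namespaces with no policy entry would be
left untouched.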

Please don't merge the several related bugs I filed today into one:
given the current state of the software, they will certainly be fixed
faster piecemeal.

What about http://meta.wikimedia.org/wiki/Robots.txt saying:

  The only way to keep a URL out of Google's index is to let Google
  crawl the page and see a meta tag specifying robots="noindex".
  Although this meta tag is already present on the edit page HTML
  template, Google does not spider the edit pages (because they are
  forbidden by robots.txt) and therefore does not see the meta tag.

So hhmmmm, not sure.

P.S. I was going to use
$wgNamespaceRobotPolicies = array(
	NS_SPECIAL          => 'noindex,nofollow',
but as my site depends on search engines indexing Special:Allpages to
get its pages indexed, I will back off to just
	NS_SPECIAL          => 'noindex',
I suppose!

But wait, SpecialPage.php already has
		$wgOut->setRobotPolicy( "noindex,nofollow" );

Indeed, I was expecting Special:Allpages to be the main way for all
search engines to index my site. Yes, I also have lots of categories,
but they are empty pages. I suppose I must write zero bytes into those
category pages so that links to them are no longer "edit" links and can
be followed...

OK, that's enough thinking for one day.
