Last modified: 2010-05-15 15:41:15 UTC
Go ahead: in LocalSettings.php, put
$wgNamespaceRobotPolicies = array( NS_WHATEVER => 'noindex,nofollow' );
And then edit some [[Test page]] where you put a link to
[[Whatever:Something]]. And make sure Whatever:Something exists.
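(For a self-contained test, the custom namespace has to be declared
before the policy line. A minimal LocalSettings.php sketch — the
namespace index 100 and the name "Whatever" are arbitrary choices of
mine, not anything the bug depends on:)

```php
// Declare a custom "Whatever" namespace plus its talk namespace.
// 100 is the conventional first index for site-specific namespaces.
define( 'NS_WHATEVER', 100 );
define( 'NS_WHATEVER_TALK', 101 );
$wgExtraNamespaces[NS_WHATEVER]      = 'Whatever';
$wgExtraNamespaces[NS_WHATEVER_TALK] = 'Whatever_talk';

// Ask robots to neither index pages in it nor follow links from them.
$wgNamespaceRobotPolicies = array( NS_WHATEVER => 'noindex,nofollow' );
```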
Now look at the rendering of [[Test page]].
Does the link to Whatever:Something have rel="nofollow" added? No.
Should it? Yes.
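In other words, the parser currently emits a plain internal link,
while honoring the namespace policy would mean marking it the way
external links already are. Roughly (href paths and attributes will
vary with the wiki's URL setup):

```html
<!-- what MediaWiki emits today -->
<a href="/wiki/Whatever:Something" title="Whatever:Something">Whatever:Something</a>

<!-- what one would expect, given the noindex,nofollow namespace policy -->
<a href="/wiki/Whatever:Something" title="Whatever:Something"
   rel="nofollow">Whatever:Something</a>
```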
Does Whatever:Something have
<meta name="robots" content="noindex,nofollow" />
in its header? Yes.
Is that good enough? Perhaps, but why cause the search engine an extra
GET in the first place, only to laugh in its face: "Ha ha ha, fooled
ya, no food here!"?
Please don't combine the several related bugs I sent today into one;
in the current state of the software they will certainly be fixed
faster piecemeal.
What about http://meta.wikimedia.org/wiki/Robots.txt saying:
The only way to keep a URL out of Google's index is to let Google
crawl the page and see a meta tag specifying robots="noindex".
Although this meta tag is already present on the edit page HTML
template, Google does not spider the edit pages (because they are
forbidden by robots.txt) and therefore does not see the meta tag.
So, hmmm, not sure.
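For context, the situation that page describes: a stock
Wikimedia-style robots.txt blocks the script path outright, so the
crawler never even fetches the edit URLs that carry the meta tag.
Something like this (exact paths differ per install; these lines are
illustrative, not copied from any real site):

```
# robots.txt excerpt (illustrative)
User-agent: *
Disallow: /w/        # index.php, action=edit URLs, etc.
# /wiki/ article views remain crawlable
```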
P.S. I was going to use
$wgNamespaceRobotPolicies = array( NS_SPECIAL => 'noindex,nofollow' );
but as my site depends on search engines indexing Special:Allpages to
get its pages indexed, I will back off to just
$wgNamespaceRobotPolicies = array( NS_SPECIAL => 'noindex' );
But wait, SpecialPage.php already has
$wgOut->setRobotPolicy( "noindex,nofollow" );
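So any namespace-level setting for NS_SPECIAL presumably loses to that
hardcoded call. If per-title overrides are consulted for special pages
at all — I have not verified that $wgArticleRobotPolicies, which does
exist for ordinary pages, applies here — a narrower escape hatch might
look like:

```php
// Hypothetical: relax the policy only for the one special page I need
// crawled, leaving every other special page at noindex,nofollow.
// NOTE: whether this is honored for NS_SPECIAL pages is unverified.
$wgArticleRobotPolicies = array(
    'Special:Allpages' => 'noindex,follow',
);
```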
Indeed, I was expecting Special:Allpages to be the main way for all
search engines to index my site. Yes, I also have lots of categories,
but they are empty pages. I suppose I must write zero bytes into those
category pages so they stop being "edit" links and can be crawled.
OK, that's enough thinking for one day.