Last modified: 2011-03-13 18:05:23 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T7707, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 5707 - tell spiders not to index Wikipedia footer and navigation


Summary:	tell spiders not to index Wikipedia footer and navigation

Status:	RESOLVED WONTFIX

Product:	MediaWiki
Classification:	Unclassified
Component:	Parser
Version:	unspecified
Hardware:	All All

Importance:	Lowest normal with 1 vote
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:	http://www.mediawiki.org/wiki/How_bes...
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2006-04-24 21:49 UTC by S Page
Modified:	2011-03-13 18:05 UTC (History)
CC List:	0 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Description S Page 2006-04-24 21:49:24 UTC

1.  In Google, search en.wikipedia.org for 'privacy':
http://www.google.com/search?hl=en&q=site%3Aen.wikipedia.org%20privacy

Results:
153,000,000 results!

This is because "Privacy" is in the footer, so Google matches every page.

Expected:
Turn off indexing of common areas.  I added some notes on how to do this to
http://www.mediawiki.org/wiki/How_best_to_search_or_spider_mediawiki_systems and
http://en.wikipedia.org/wiki/Robots_Exclusion_Standard#Directives_within_a_page .
For Google the key is <!--googleoff: index--> ... <!--googleon: index-->, and
old spiders use <NOINDEX>.

You could counter-argue that if a word appears on a page and the user pastes it
into a search engine, then the engine MUST find that page.  But I think the
value of eliminating all those search results outweighs this.
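
The googleoff/googleon markers mentioned in the description can be illustrated with a minimal sketch. It assumes, as the description states, that a crawler skips text between the two comments; the function names `wrap_noindex` and `indexable_text` and the sample page are hypothetical and are not part of MediaWiki.

```python
import re

# Marker comments quoted in the bug description.
GOOGLE_OFF = "<!--googleoff: index-->"
GOOGLE_ON = "<!--googleon: index-->"

# Matches everything from an "off" marker to the next "on" marker.
_EXCLUDED = re.compile(
    re.escape(GOOGLE_OFF) + r".*?" + re.escape(GOOGLE_ON),
    re.DOTALL,
)


def wrap_noindex(fragment: str) -> str:
    """Surround an HTML fragment (e.g. the footer) with the marker comments."""
    return f"{GOOGLE_OFF}{fragment}{GOOGLE_ON}"


def indexable_text(html: str) -> str:
    """Simulate a crawler that honours the markers by dropping marked regions."""
    return _EXCLUDED.sub("", html)


# Hypothetical page: article body plus a footer containing the word "Privacy".
page = "<p>Article body about cats.</p>" + wrap_noindex(
    '<div id="footer">Privacy policy</div>'
)
visible = indexable_text(page)
print("Privacy" in visible)
print("cats" in visible)
```

With the footer wrapped this way, a search for "privacy" would no longer match the page through its footer alone, while the article body stays indexable. As comment 1 points out, this would only affect crawlers that recognise these particular comments.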

Comment 1 Antoine "hashar" Musso (WMF) 2006-05-01 20:41:39 UTC

* NOINDEX is not valid in XHTML.
* The Google-only comment is ... only for Google. That would not really fix the
issue.

Comment 2 Mark Clements (HappyDog) 2006-05-02 02:24:06 UTC

Hmmm.... I think there are several million people who use Google for their
searches (call me naive...)

Comment 3 Rob Church 2006-05-14 04:06:08 UTC

(In reply to comment #2)
> Hmmm.... I think there are several million people who use Google for their
> searches (call me naive...)

The point as raised was that Google isn't the be-all and end-all; people can and
do use alternative search engines, which that particular special case wouldn't
affect. So yes, it would fix the problem for a lot of cases, but not all.

Comment 4 Antoine "hashar" Musso (WMF) 2007-05-01 20:50:43 UTC

Marking as WONTFIX; it looks like Google is smart enough to give
us back pages related to "privacy".
