Last modified: 2011-03-13 18:04:27 UTC
When I use the search engine (mnoGoSearch) to index my site, I get lots of duplicate pages caused by redirects. What I propose is that redirect pages have their robots meta tag changed to noindex. This might also help search engines that index Wikipedia build a better index of the site.

Current, on all pages:

    <meta name="robots" content="index,follow" />

Proposed, for redirect pages:

    <meta name="robots" content="noindex,follow" />
A quick example of this is http://en.wikipedia.org/wiki/Firefox, which gives

    Mozilla Firefox
    From Wikipedia, the free encyclopedia.
    (Redirected from Firefox)

but the robots meta is still <meta name="robots" content="index,follow" />.
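Something like this minimal sketch is what I have in mind (Python illustration only; the is_redirect flag and robots_meta helper are hypothetical names, not MediaWiki's actual code):

    def robots_meta(is_redirect):
        # Keep "follow" either way so crawlers still reach the target
        # article through the redirect; only the indexing flag changes.
        content = "noindex,follow" if is_redirect else "index,follow"
        return '<meta name="robots" content="%s" />' % content

    print(robots_meta(False))  # <meta name="robots" content="index,follow" />
    print(robots_meta(True))   # <meta name="robots" content="noindex,follow" />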
Changing summary to reflect the problem and upping severity to normal
Wouldn't this hurt rankings for alternate spellings of the title?
In order to allow alternate spellings to be indexed by search engines, the names of redirect articles could be mentioned in a meta tag of the target article, i.e. <meta name="keywords" content="...">. I don't know whether meta tags are honoured by search engines anymore, though.
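For instance, the Mozilla Firefox article might then carry <meta name="keywords" content="Firefox, Phoenix"> if those titles redirect to it (titles picked purely for illustration).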
I think that maybe, if the User-Agent is that of a major search engine (e.g. Google), then when requesting a redirect page, instead of the standard redirect, the bot should receive an HTTP 301, so that it will consider it to be the same page as the target, reducing duplicates in the search results.
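Roughly this, as a sketch (the CRAWLER_AGENTS list, Response type, and handle_redirect function are all made up for illustration; this is not MediaWiki's real request handling):

    from dataclasses import dataclass

    # Hypothetical list of major crawler User-Agent substrings.
    CRAWLER_AGENTS = ("Googlebot", "Slurp", "msnbot")

    @dataclass
    class Response:
        status: int
        headers: dict

    def handle_redirect(user_agent, target_url):
        if any(bot in user_agent for bot in CRAWLER_AGENTS):
            # Crawlers get a real HTTP 301, so the target URL
            # replaces the redirect page in their index.
            return Response(301, {"Location": target_url})
        # Ordinary readers still get the normal rendered page
        # with the "(Redirected from ...)" note.
        return Response(200, {"Content-Type": "text/html"})

    print(handle_redirect("Googlebot/2.1", "/wiki/Mozilla_Firefox").status)  # 301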
(In reply to comment #4)
> In order to allow alternate spellings to be indexed by search engines, the
> names of redirect articles could be mentioned in a meta tag of the target
> article, i.e. <meta name="keywords" content="...">.
>
> I don't know whether meta tags are honoured by search engines anymore, though.

This is covered by bug 846: feature request: control of meta name="KEYWORDS" content="..."
(In reply to comment #5)
> I think that maybe, if the User-Agent is that of a major search engine (e.g.
> Google), then when requesting a redirect page, instead of the standard
> redirect, the bot should receive an HTTP 301

I may be wrong about this, but I remember hearing that Google makes spot checks of some kind with a user-agent of something like IE, to penalise sites that send completely different "optimised" content to the main crawler.
Restored bug from flood attack.
Closing this WONTFIX; this behavior is deliberate, so titles can be searched on.