Last modified: 2008-05-22 18:18:00 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T16209, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 14209 - Addition of __NOSPIDER__ token for pages
Status: RESOLVED DUPLICATE of bug 8068
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2008-05-21 13:49 UTC by FT2
Modified: 2008-05-22 18:18 UTC

Description FT2 2008-05-21 13:49:31 UTC
It sometimes happens that a user is discussed unfavorably due to disruption, sockpuppetry, edit warring, or vandalism. 

When they leave (or are banned), the pages on which they were discussed remain findable through Google, which is a major dilemma. In extreme cases we have had to rewrite all their signatures, and often we have to courtesy-blank the page purely to prevent it from being spidered.

In a number of cases these users then return (with or without permission), which leads to a further problem: administrators seeking to understand the history cannot easily do so.

Would a __NOSPIDER__ token be possible? If the viewer was a robot or spider, a page containing this token would render as an error page or a blank page (I don't know the best way to do this technically).

(Perhaps one easy way might be: if the viewer is an anonymous IP, it goes to a page that says "This page is blocked from spiders. If you are a human, please enter this CAPTCHA to view." Most spiders/robots aren't logged in.)
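
The decision described above could look roughly like the following. This is only a sketch of the proposal, not MediaWiki code: the function names, the three outcome strings, and the User-Agent pattern are all hypothetical, and real crawler detection would need a far more complete list.

```python
import re

NOSPIDER_TOKEN = "__NOSPIDER__"  # the token proposed in this report

# Hypothetical and deliberately incomplete set of crawler User-Agent markers.
CRAWLER_PATTERN = re.compile(r"bot|crawler|spider|slurp", re.IGNORECASE)


def looks_like_crawler(user_agent: str) -> bool:
    """Guess from the User-Agent header whether the viewer is a crawler."""
    return bool(CRAWLER_PATTERN.search(user_agent or ""))


def decide_view(wikitext: str, user_agent: str, logged_in: bool) -> str:
    """Return 'page', 'error', or 'captcha' for a single page view."""
    if NOSPIDER_TOKEN not in wikitext:
        return "page"      # untagged pages behave exactly as before
    if logged_in:
        return "page"      # most spiders/robots aren't logged in
    if looks_like_crawler(user_agent):
        return "error"     # self-identified robots get an error page
    return "captcha"       # anonymous viewers must pass a CAPTCHA first
```

For example, `decide_view("__NOSPIDER__ ...", "Mozilla/5.0 (compatible; Googlebot/2.1)", False)` returns `"error"`, while the same page requested by a logged-in user returns `"page"`.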

This would be a useful tool for reducing the conflict between our need for edit histories and pages to remain useful to administrators in future and the fair need of a user not to be googled that way for the rest of their life. Rather than having to edit whole swathes of the wiki, we could tag certain pages as __NOSPIDER__; they would then rapidly drop out of search engine caches (in the best interest of the party concerned) and could more often be left intact (for us). It would also have the advantage that, because the token is invisible in the rendered page for most users and very easy to apply, we could use it more widely when this problem comes up.
Comment 1 FT2 2008-05-21 13:52:15 UTC
Defaulting to an error page if it's a spider would be best (if technically possible), since this means the page title completely drops out of search engine caches. However it's done, it's important that the page title isn't shown to a spider, since the title often links to the name involved.
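
As an illustration of an error page that never shows the title, here is a minimal sketch. The `Response` type and `spider_error_response` function are hypothetical and not part of MediaWiki; the only real-world piece assumed is the `X-Robots-Tag` HTTP header, which major search engines read as an indexing instruction.

```python
from typing import NamedTuple


class Response(NamedTuple):
    status: int
    headers: dict
    body: str


def spider_error_response() -> Response:
    """Build an error response for crawlers that never mentions the page title."""
    body = (
        "<html><head><title>Page not available</title></head>"
        "<body><p>This page is not available to automated clients.</p></body></html>"
    )
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        # Ask crawlers not to index or keep a cached copy of this response.
        "X-Robots-Tag": "noindex, noarchive",
    }
    # A 404 status signals that there is nothing here to index.
    return Response(status=404, headers=headers, body=body)
```

Note that the requested URL itself still contains the page title, so this only helps to the extent that search engines drop URLs that return an error.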
Comment 2 FT2 2008-05-21 14:12:34 UTC
Sadly after writing this, problems arise:

1/ potentially a huge burden if added, unless "only used in specific narrow cases"
2/ would kill search engine access to many pages if widely used, and right now search engines are the only effective way to find things even a few weeks old in project space, and
3/ we now already advise people on the signup page not to use a readily searchable name anyway

Ah well, a nice idea.
Comment 3 Brion Vibber 2008-05-22 18:18:00 UTC

*** This bug has been marked as a duplicate of bug 8068 ***
