Last modified: 2010-06-27 08:21:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T11983, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 9983 - Google caches serious vandalism for a relatively long time.
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Search
Version: unspecified
Hardware: All All
Importance: Normal major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org
Reported: 2007-05-20 18:42 UTC by Jonathan Hochman
Modified: 2010-06-27 08:21 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Screen shot of Jim Carrey's search listing (193.51 KB, image/jpeg)
2007-05-20 18:42 UTC, Jonathan Hochman

Description Jonathan Hochman 2007-05-20 18:42:31 UTC
Created attachment 3661
Screen shot of Jim Carrey's search listing

This week, in two separate incidents, high-profile Wikipedia articles were vandalized and then reverted within minutes. During the short window of vandalism, Googlebot cached the slanderous material, and Google displayed it at the top of the search results for a full day. Because Wikipedia has a large number of articles and high search rankings, this is a growing problem with the potential to damage the lives of article subjects and embarrass Wikipedia.

See attached screen shot, and these discussions:
* http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Another_unfortunate_Google_grab
* http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard#Google_search_reveals_what_happens_if_vandalism_isn.27t_reverted_quickly...
* http://searchengineland.com/070516-164154.php

One resolution strategy is to use an allowable form of cloaking, called "content delivery." We could apply the semi-protection criteria (not semi-protection itself) to the article history to determine the last version that was saved by a "good" user. This version could be accessed with an additional URL parameter, such as ?version=lastgood. When a search engine bot, such as Googlebot, shows up and identifies itself (through the User-Agent field in the HTTP request header), a conditional redirect programmed via .htaccess appends "?version=lastgood" to the URL, thus serving a slightly older, but more reliable, copy of the page. This would avoid further embarrassment to Wikipedia and help prevent harm to the subjects of articles.
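
The proposal above could be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code and not an actual .htaccess rule. The `Revision` fields, the `BOT_MARKERS` list, the four-day account-age threshold standing in for the semi-protection criteria, and all function names are assumptions made for the example; only the `version=lastgood` parameter comes from the description.

```python
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: substrings that identify some crawlers in the User-Agent header.
BOT_MARKERS = ("googlebot", "slurp", "msnbot")


@dataclass
class Revision:
    """Hypothetical, simplified view of one entry in an article's history."""
    rev_id: int
    author_is_registered: bool
    author_account_age_days: int


def is_search_bot(user_agent: str) -> bool:
    """Return True if the User-Agent string contains a known crawler marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


def is_good_author(rev: Revision, min_age_days: int = 4) -> bool:
    """Stand-in for the semi-protection criteria: registered and not brand new.

    The four-day default is an assumption for this sketch.
    """
    return rev.author_is_registered and rev.author_account_age_days >= min_age_days


def last_good_revision(history: List[Revision]) -> Optional[Revision]:
    """Return the newest revision saved by a "good" author, or None.

    `history` is expected to be ordered from oldest to newest.
    """
    for rev in reversed(history):
        if is_good_author(rev):
            return rev
    return None


def url_for_request(url: str, user_agent: str) -> str:
    """Append version=lastgood for crawlers; leave other requests untouched."""
    if not is_search_bot(user_agent):
        return url
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["version"] = "lastgood"
    return urlunsplit(parts._replace(query=urlencode(query)))
```

In this sketch a request from Googlebot for http://en.wikipedia.org/wiki/Jim_Carrey would be rewritten to carry ?version=lastgood, while an ordinary browser request would keep its URL and see the current revision.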

Over at the Wikipedia Administrators' noticeboard, it was suggested that I file a bug report. If you need further help with this, feel free to contact me. I am a professional SEO and web developer who can donate services.
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-05-20 18:44:16 UTC
This will be fixed by stable versions, which appear to be moving ahead full steam.  Unless someone has a really quick and easy fix, there's probably not much point putting in effort to fix this immediately when it should be fixed soon enough anyway.
Comment 2 Jonathan Hochman 2007-05-20 18:50:14 UTC
Thanks.  Where can I follow the news and status of stable versions?  Bear with me.  I am new here.
Comment 3 Max Semenik 2010-06-27 08:21:53 UTC
FlaggedRevs are now live on en. Whether the community wants to retain them and, if so, to what extent to use them is a non-technical problem.
