Last modified: 2005-12-07 15:06:52 UTC
IMPORTANT!!! DO *NOT* READ THIS BUGREPORT IF YOU HAVEN'T READ THE BOOK "Harry Potter and the Half-Blood Prince" AND YOU PLAN TO READ IT!

-see the bugreport below-

... ... ... ... ... ... ... ...

How to reproduce:
1. Query Google for "half blood prince"
2. See the fourth result - it reads:

    Harry Potter and the Half-Blood Prince - Wikipedia, the free ...
    For information on the character, see Half-Blood Prince (character). ... Harry pursues Snape, who identifies himself as the Half-Blood Prince before fleeing ...
    en.wikipedia.org/wiki/Harry_Potter_and_the_Half-Blood_Prince - 47k - Cached - Similar pages

As you can see, anybody who searches Google for information about the book will be spoiled instantly. Internally, Wikipedia has a solution - the {{spoiler}} template. However, the warning does not display in Google search results. Therefore another approach is needed.

Expected behavior:
1. On article access, check the user-agent string sent.
2. If it's Googlebot's, return the page with all spoilers replaced by "---SPOILER---" or something similar.

Spoilers would be written in articles like <spoiler>dumbledore dies</spoiler> (or other markup).
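To make the proposal concrete, the two "Expected behavior" steps could be sketched roughly as below. This is only an illustration in Python, not MediaWiki code: the `<spoiler>` markup does not exist in MediaWiki, and the names `is_googlebot`, `render_for`, `SPOILER_RE` and `PLACEHOLDER` are hypothetical. It also assumes the crawler can be recognized from the user-agent string alone, which any client can spoof.

```python
import re

# Hypothetical markup from this report; MediaWiki has no <spoiler> tag.
SPOILER_RE = re.compile(r"<spoiler>.*?</spoiler>", re.DOTALL | re.IGNORECASE)
SPOILER_TAG_RE = re.compile(r"</?spoiler>", re.IGNORECASE)
PLACEHOLDER = "---SPOILER---"


def is_googlebot(user_agent):
    """Step 1: naive check of the user-agent string (assumption: substring match)."""
    return "googlebot" in (user_agent or "").lower()


def render_for(user_agent, text):
    """Step 2: hide spoilers from the crawler, show them to everyone else."""
    if is_googlebot(user_agent):
        # Replace each marked span, including its tags, with the placeholder.
        return SPOILER_RE.sub(PLACEHOLDER, text)
    # Regular readers get the full text with only the tags stripped.
    return SPOILER_TAG_RE.sub("", text)
```

With this sketch, a request whose user-agent contains "Googlebot" would receive "---SPOILER---" in place of every marked span, while a normal browser would receive the unmodified sentence without the tags.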
INVALID, google has a "Dissatisfied? Help us improve" link on every search result, I suggest you use it.
Snape kills dumbledore.
>INVALID, google has a "Dissatisfied? Help us improve" link on every search result, I suggest you use it.

How's a computer algorithm supposed to recognize something as abstract as plot spoilers? No, users should cope with it. Please, before replying, think whether it's really possible a satiable antispoil algorithm in Google to work.

>Snape kills dumbledore.

No comment...
(In reply to comment #3)
> >INVALID, google has a "Dissatisfied? Help us improve" link on every search result, I suggest you use it.
>
> How's a computer algorithm supposed to recognize something as abstract as plot spoilers?
> No, users should cope with it. Please, before replying, think whether it's really
> possible a satiable antispoil algorithm in Google to work.

Well that's something for the google people to work out, not us.
I don't agree, but I have nothing else to say. I used the method you recommended.
It is not technically possible for us to prevent Google from indexing spoilers - how is the software supposed to know what is a spoiler and what isn't?
>It is not technically possible for us to prevent Google from indexing spoilers - how is the software supposed to know what is a spoiler and what isn't?

I wrote that above. For example, we surround spoiling parts with <spoiler></spoiler> manually. As I said, IMO it's humans' job to tell spoilers apart from regular text.
See also: http://lists.w3.org/Archives/Public/www-html/2005Dec/0009.html
(In reply to comment #7)
> >It is not technically possible for us to prevent Google from indexing spoilers -
> how is the software supposed to know what is a spoiler and what isn't?
>
> I wrote that above. For example, we surround spoiling parts with <spoiler></spoiler> manually. As I
> said, IMO it's humans' job to tell spoilers apart from regular text.

So? Surround something in those tags either at the wikitext or XHTML level, and you'll find it's ignored at the former and rejected as invalid XHTML at the latter. And how does that stop GoogleBot seeing it?
(In reply to comment #9)
> (In reply to comment #7)
> > >It is not technically possible for us to prevent Google from indexing spoilers -
> > how is the software supposed to know what is a spoiler and what isn't?
> >
> > I wrote that above. For example, we surround spoiling parts with <spoiler></spoiler> manually. As I
> > said, IMO it's humans' job to tell spoilers apart from regular text.
>
> So? Surround something in those tags either at the wikitext or XHTML level, and
> you'll find it's ignored at the former and rejected as invalid XHTML at the
> latter. And how does that stop GoogleBot seeing it?

I'm sorry, apparently I didn't make myself clear. What I meant was that some markup analogous to <spoiler></spoiler> has to be added to the wikicode specs, meaning that MediaWiki itself should be changed.

(In reply to comment #8)
> See also: http://lists.w3.org/Archives/Public/www-html/2005Dec/0009.html

Now that I read this, I realized that indeed such a thing would be much better off in general XHTML. However, as seen at http://lists.w3.org/Archives/Public/www-html/2005Dec/0021.html the proposal seems to have been declined. I'm probably going to argue with them, because I don't agree with some of their points. For now I am convinced that the INVALID resolution fits this bug.