Last modified: 2014-05-23 01:16:36 UTC
According to the Atom specification (http://www.atomenabled.org/developers/syndication/atom-format-spec.php#element.id), entries with the same id should represent the same entry. Because an entry in the user contributions feed (e.g. http://en.wikipedia.org/w/index.php?title=Special%3AContributions/Svick&feed=atom&limit=50&target=Svick&year=&month=) represents an edit, each edit should have its own id. Currently, however, the id is the URL of the changed page, so several edits to the same page share one id. I think this causes Google Reader to show the same edit repeatedly.
(In reply to comment #0) I validated the contributions Atom feed with http://feedvalidator.org/check.cgi . The validator says: ": column 81: Two entries with the same id". When I change "feed=atom" to "feed=rss" in the URL, the validator says: ": column 84: guid values must not be duplicated within a feed http://....." and ": column 1: Missing atom:link with rel="self"". So I think the feed code has a bug in how it assigns ids, and the RSS feed template is not correct either.
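The duplicate-id check that the validator reports can be reproduced locally. The following is only a minimal sketch in Python, not the validator's actual implementation: the function name duplicate_entry_ids and the sample feed are made up for illustration, and the sample merely imitates the situation described above (two edits to the same page sharing the page URL as id).

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import List

# Namespace of Atom 1.0 elements, in ElementTree's "{uri}tag" notation.
ATOM_NS = "{http://www.w3.org/2005/Atom}"


def duplicate_entry_ids(atom_xml: str) -> List[str]:
    """Return the <id> values used by more than one <entry> in the feed."""
    root = ET.fromstring(atom_xml)
    ids = [
        (entry.findtext(f"{ATOM_NS}id") or "").strip()
        for entry in root.iter(f"{ATOM_NS}entry")
    ]
    counts = Counter(ids)
    return sorted(id_ for id_, n in counts.items() if id_ and n > 1)


# Hypothetical sample: two edits to the same page reuse the page URL as id.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>User contributions (sample)</title>
  <entry><id>http://en.wikipedia.org/wiki/Example</id><title>Edit 1</title></entry>
  <entry><id>http://en.wikipedia.org/wiki/Example</id><title>Edit 2</title></entry>
  <entry><id>http://en.wikipedia.org/wiki/Other</id><title>Edit 3</title></entry>
</feed>"""

print(duplicate_entry_ids(SAMPLE_FEED))
# prints ['http://en.wikipedia.org/wiki/Example']
```

An empty list would mean every entry in the feed has its own id.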
Currently, the Atom feed's id is made from the article name only, so ids (or guids in the RSS feed) overlap. I think that if the id generator used a combination of the article name and the revision number, this bug would be fixed.
It's exactly as KATO Takayuki says. Currently, feeds are built using something like "http://xx.wikipedia.org/wiki/article_name" for the <guid> field in RSS and for <id> in Atom. The solution doesn't seem too complicated: instead of "http://xx.wikipedia.org/wiki/article_name", use something like "http://xx.wikipedia.org/w/index.php?title=article_name&oldid=xxxxxxxx". This way, every entry would have a unique identifier. This change should be implemented as soon as possible, because the issue is already causing trouble. See: http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#To_Google_Reader_users:_You_may_be_missing_items_from_your_watchlist_feed. and: http://www.google.com/support/forum/p/reader/thread?tid=4e28dcb545efabb3&hl=en
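To make the proposal concrete, here is a rough sketch of building a per-revision identifier from the URL pattern suggested above. It is written in Python for illustration only and is not MediaWiki code: the function revision_entry_id, its parameters, and the example revision numbers are all assumptions.

```python
from urllib.parse import urlencode


def revision_entry_id(host: str, title: str, oldid: int) -> str:
    """Build an id of the form http://<host>/w/index.php?title=...&oldid=...

    `host`, `title` and `oldid` are illustrative parameters; `oldid` stands
    for the revision number of the edit the feed entry describes.
    """
    # Page titles in URLs use underscores instead of spaces.
    query = urlencode({"title": title.replace(" ", "_"), "oldid": oldid})
    return f"http://{host}/w/index.php?{query}"


# Two edits to the same page now get different ids (made-up revision numbers).
first = revision_entry_id("en.wikipedia.org", "Main Page", 123)
second = revision_entry_id("en.wikipedia.org", "Main Page", 124)
print(first)   # prints http://en.wikipedia.org/w/index.php?title=Main_Page&oldid=123
print(first != second)  # prints True
```

Because the revision number is part of the id, two entries collide only if they describe the same revision, which is what the Atom specification expects of entries sharing an id.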
The first link in my previous comment has moved. Instead of http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#To_Google_Reader_users:_You_may_be_missing_items_from_your_watchlist_feed. it is now http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)/Archive_82#To_Google_Reader_users:_You_may_be_missing_items_from_your_watchlist_feed.
I think this is pretty much the same issue as the old bug 3998; technically what we're doing is valid (considering each page as an item, and we're including multiple versions of them) but it is indeed probably not matching up well with what receiving entities will be expecting. Probably best to change the feeds to go ahead and use ids that are specific to the revision and the way it's being displayed, ensuring that feed-processing systems do keep them separate in their caches. (My old arguments on bug 3998 are in the other direction, but I'm pretty convinced now that I was wrong in 2005. ;)
As for whether what is being done is valid, all I can say is that my RSS watchlist feed doesn't pass W3C validation: http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fen.wikipedia.org%2Fw%2Fapi.php%3Faction%3Dfeedwatchlist%26allrev%3Dallrev%26wlowner%3DCanyq%26wltoken%3D080630b3f4931ff5964fa7e69e6ee5a19871d1dc%26feedformat%3Drss nor the Feed Validator test: http://www.feedvalidator.org/check.cgi?url=http%3A%2F%2Fen.wikipedia.org%2Fw%2Fapi.php%3Faction%3Dfeedwatchlist%26allrev%3Dallrev%26wlowner%3DCanyq%26wltoken%3D080630b3f4931ff5964fa7e69e6ee5a19871d1dc%26feedformat%3Drss My Atom watchlist feed passes both tests, but with recommendations related to this non-unique id issue: http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fen.wikipedia.org%2Fw%2Fapi.php%3Faction%3Dfeedwatchlist%26allrev%3Dallrev%26wlowner%3DCanyq%26wltoken%3D080630b3f4931ff5964fa7e69e6ee5a19871d1dc%26feedformat%3Datom http://www.feedvalidator.org/check.cgi?url=http%3A%2F%2Fen.wikipedia.org%2Fw%2Fapi.php%3Faction%3Dfeedwatchlist%26allrev%3Dallrev%26wlowner%3DCanyq%26wltoken%3D080630b3f4931ff5964fa7e69e6ee5a19871d1dc%26feedformat%3Datom Finally, it must be remembered that, as I have shown, a feed reader as popular as Google Reader very often misses items from Wikipedia watchlists because of this problem. I was using it to follow changes in the Wikipedia articles I track, but since many of these articles are being watched for vandalism, I can't accept these losses. Therefore, until this issue is fixed, I am following my watchlist manually and ignoring the feeds. Obviously, I don't know how many people use Google Reader to monitor their watchlists, but for me this is a serious problem with (I think) an easy solution.
In the last few days, I've noticed that there has been a change, after which entries now use a unique identifier following the pattern: //es.wikipedia.org/w/index.php?title=[Article_title]&diff=[Edition_id] where [Article_title] is, of course, the article title, and [Edition_id] is the edit (revision) number, which, as far as I know, is a unique identifier. Therefore, the issue described on this page should no longer be a problem.