Last modified: 2011-10-25 23:55:11 UTC
Created attachment 7885
query results

There are 654 revisions in the en.wp database where rev_page = 0. The most recent is revision 399262554, made at 20101128034211. I've attached a full listing of the affected revisions for en.wp. I suspect this is a larger-scale problem affecting multiple wikis.
http://toolserver.org/~betacommand/reports/dberrors/ is a full report, excluding s7. A total of 101 projects are affected by this issue.
I had a look at these. I checked en wp carefully, and all of the incidents that aren't pretty old, i.e. everything from rev 242099935 (sometime in 2008) onward, which is most of them, are moves. This turns out to be true for most of the revs I spot-checked on the other projects as well.

So what happens with these moves? Two revisions for the move are recorded in the log and the history; however, three make it into the revision table. Here's a sample from en wp:

  rev_id    | rev_page | rev_text_id | rev_comment
  398410443 | 11005908 | 399836293   | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing/sandbox]]...
  398410417 | 11005908 | 399836293   | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing)/sandbox]]...

Those show up in the history, and they are the "good" ones, as they have a page id attached. The "bad" one is:

  398410444 | 0        | 399843549   | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing/sandbox]]...

I looked at a number of these and they all display the same characteristics: the third rev is the bad one, it has the same timestamp as the previous one, and its text content is the redirect left behind by the move. I.e. the revision length of the bad one above is 48 and its text content is

  #REDIRECT [[Wikipedia:Tutorial/Editing/sandbox]]

whereas the rev length of the other two revisions is 2806 and they contain the actual page content.

This move issue is an outstanding issue; that is, it is not due to the master/slave issue we had recently or any of that. That's clear from the timestamps, which in the above example well predate that outage.

In case someone might think that the revision once had the page id and that the corruption occurred later: I checked the history dumps from July and Sept of last year for a couple of these revisions with earlier timestamps, and in each case the two good ones appeared in the file but the bad one did not.
That makes me pretty sure this is a failure at the time of the move, and probably still a bug in the code running now. I hope that's enough information for someone who knows the innards of the move/delete stuff to hazard a guess at the problem.
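For anyone who wants to check another wiki for the same pattern, here is a minimal sketch of the lookup, not the query used for the attachment. It assumes a cut-down, hypothetical revision table holding only the columns quoted above, loaded into an in-memory SQLite database with the three sample rows from en wp; the function name find_orphaned_revisions is made up for illustration.

```python
import sqlite3

# Hypothetical, cut-down revision table: only the columns quoted in this report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_text_id INTEGER)"
)

# The three sample rows from en wp: two "good" revisions and the "bad" one.
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (398410417, 11005908, 399836293),
        (398410443, 11005908, 399836293),
        (398410444, 0, 399843549),
    ],
)


def find_orphaned_revisions(conn):
    """Return (rev_id, rev_text_id) for revisions not attached to any page."""
    return conn.execute(
        "SELECT rev_id, rev_text_id FROM revision "
        "WHERE rev_page = 0 ORDER BY rev_id"
    ).fetchall()


orphans = find_orphaned_revisions(conn)
print(orphans)
```

On a real database the same WHERE rev_page = 0 condition would presumably be run against the full revision table; the text ids it returns are the ones to inspect for leftover redirect text.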
There's no check in Title::moveToInternal() that Article::insertOn() really succeeded, so if that failed, the new revision would be created linking to a page with $newid = false, which would be converted to 0.

Article::insertOn() fails if there's already a page with that title. We just renamed the page, so there should be no page with that title, and there's no trace of anyone recreating it behind us.

It is interesting that the page was first misrenamed, but I don't see any trace of what it did after moving to [[Wikipedia:Tutorial/Editing)/sandbox]]. The moves were:

  [[Wikipedia:Tutorial (Editing)/sandbox]] -> [[Wikipedia:Tutorial/Editing)/sandbox]]
  [[Wikipedia:Tutorial (Editing)/sandbox]] -> [[Wikipedia:Tutorial/Editing/sandbox]]

How is this possible? Consider this: Fuhghettaboutit clicked to move the page, but noticed the typo immediately, stopped the load, fixed the ')' and resubmitted. As Special:Movepage doesn't create a transaction, at that point *both requests were running at the same time* on the master. The second request fetched the old Article values, so it moved the real article, not the redirect (maybe also because Title::moveto() does not call getArticleID() with GAID_FOR_UPDATE). But by the time it came to create the redirect to the new entry, the first request had already created it. The insert ignore fails, but the revision is nonetheless inserted, leaking that entry.

I have been able to reproduce it locally:

  touch lock
  (while [ -f lock ]; do :; done; wget /index.php/Special:MovePage/Bug-26223 --post-data="action=submit&wpOldTitle=Bug-26223A&wpNewTitle=Bug-26223_$RANDOM&wpMove=yes&wpEditToken=%2B\\" )&
  (while [ -f lock ]; do :; done; wget /index.php/Special:MovePage/Bug-26223 --post-data="action=submit&wpOldTitle=Bug-26223A&wpNewTitle=Bug-26223_$RANDOM&wpMove=yes&wpEditToken=%2B\\" )&
  rm lock
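To make the interleaving concrete, here is a rough model of the race in Python with an in-memory SQLite database. It is only a sketch under stated assumptions, not MediaWiki code: the two-table schema, make_db(), insert_page() and move_page() are all hypothetical stand-ins, the "insert ignore" is modelled by catching the uniqueness error, the null revision that a move adds to the moved page itself is left out, and the two "concurrent" requests are simply run one after the other with the same stale page id. The check_insert flag shows what a missing success check would change; it is not claimed to be the actual fix.

```python
import sqlite3


def make_db():
    """Create a hypothetical two-table schema with one page, 'Bug-26223A'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT UNIQUE)"
    )
    conn.execute(
        "CREATE TABLE revision ("
        "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_comment TEXT)"
    )
    conn.execute("INSERT INTO page (page_title) VALUES ('Bug-26223A')")
    return conn


def insert_page(conn, title):
    """Stand-in for Article::insertOn(): new page id, or False if the title exists."""
    try:
        cur = conn.execute("INSERT INTO page (page_title) VALUES (?)", (title,))
    except sqlite3.IntegrityError:
        return False  # models the ignored insert on a duplicate title
    return cur.lastrowid


def move_page(conn, page_id, old_title, new_title, check_insert=False):
    """Rename the page, then create the redirect page and its revision."""
    # The rename goes by page id, so it works even with a stale old title.
    conn.execute(
        "UPDATE page SET page_title = ? WHERE page_id = ?", (new_title, page_id)
    )
    redirect_id = insert_page(conn, old_title)
    if check_insert and redirect_id is False:
        # The check the unfixed flow lacks; a real fix would also undo the rename.
        return False
    # Without the check, False silently becomes 0 here.
    conn.execute(
        "INSERT INTO revision (rev_page, rev_comment) VALUES (?, ?)",
        (int(redirect_id), f"moved {old_title} to {new_title}"),
    )
    return True


# Both requests looked up page id 1 for 'Bug-26223A' before either one wrote.
conn = make_db()
move_page(conn, 1, "Bug-26223A", "Bug-26223_1")
move_page(conn, 1, "Bug-26223A", "Bug-26223_2")
leaked = conn.execute("SELECT rev_page FROM revision ORDER BY rev_id").fetchall()
print(leaked)
```

The second call renames page 1 again, fails to insert the redirect page because the first call already created it, and still writes a revision, which ends up with rev_page = 0. Calling move_page() with check_insert=True for the second request skips that revision instead.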
For the record, the above bug (cause of revision leaking) was fixed in r84459.
(In reply to comment #4)
> For the record, the above bug (cause of revision leaking) was fixed in r84459.

Closing.