Last modified: 2011-10-25 23:55:11 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T28223, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 26223 - Errors in the revision table
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Hardware: All All
Importance: Low major
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Depends on:
Blocks: 16660
Reported: 2010-12-03 21:05 UTC by Betacommand
Modified: 2011-10-25 23:55 UTC
CC List: 5 users


query results (136.30 KB, text/plain)
2010-12-03 21:05 UTC, Betacommand

Description Betacommand 2010-12-03 21:05:59 UTC
Created attachment 7885
query results

There are 654 revisions in the en.wp database where rev_page = 0. The most recent is revision 399262554, made at 20101128034211. I've attached a full listing of the affected revisions for en.wp. I suspect this is a larger-scale problem affecting multiple wikis.
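
For anyone who wants to run the same kind of check, here is a minimal sketch of the lookup. It uses an in-memory SQLite table as a stand-in for the real MySQL revision table, so the reduced schema, the rows and the helper name orphaned_revisions are all assumptions made for illustration; only the rev_page = 0 condition comes from this report.

```python
import sqlite3

# Toy stand-in for MediaWiki's revision table. The real table lives in MySQL
# and has more columns; these rows are made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE revision ("
    "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_timestamp TEXT)"
)
conn.executemany(
    "INSERT INTO revision VALUES (?, ?, ?)",
    [
        (1, 10, "20101128034200"),
        (2, 0, "20101128034211"),  # orphan: no page attached
        (3, 10, "20101128034300"),
        (4, 0, "20101127000000"),  # orphan: no page attached
    ],
)


def orphaned_revisions(db):
    """Return (rev_id, rev_timestamp) for rows with rev_page = 0, newest first."""
    return db.execute(
        "SELECT rev_id, rev_timestamp FROM revision "
        "WHERE rev_page = 0 ORDER BY rev_timestamp DESC"
    ).fetchall()


orphans = orphaned_revisions(conn)
print(len(orphans), orphans[0])
```

The first row returned is the most recent affected revision, which is how a "most recent revision" figure like the one above can be read off.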
Comment 1 Betacommand 2011-01-21 18:56:29 UTC
is a full report excluding s7. In total, 101 projects are affected by this issue.
Comment 2 Ariel T. Glenn 2011-01-24 18:21:50 UTC
I had a look at these. I checked en wp carefully, and all of the incidents that aren't pretty old (that is, everything from rev 242099935 on, a certain date in 2008, which is most of them) are moves. This turns out to be true for most of the revs that I spot-checked on the other projects as well.

So what happens with these moves? Two revisions for the same move are recorded in the log and the history; however, three make it into the revision table.

Here's a sample from en wp:

rev_id    | rev_page | rev_text_id | rev_comment 
398410443 | 11005908 |   399836293 | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing/sandbox]]...
398410417 | 11005908 |   399836293 | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing)/sandbox]]...

Those show up in the history, and they are the "good" ones, as they have a page id attached. The "bad" one is:

398410444 |        0 |   399843549 | moved [[Wikipedia:Tutorial (Editing)/sandbox]] to [[Wikipedia:Tutorial/Editing/sandbox]]...

I looked at a number of these and they all display the same characteristics:

The third rev is the bad one. It has the same timestamp as the previous one, and its text content is the redirect left behind by the move.

I.e. the revision length of the bad one in the above is 48 and its text content is #REDIRECT [[Wikipedia:Tutorial/Editing/sandbox]], whereas the rev length of the other two revisions is 2806 and they contain the actual page content.

This move issue is an outstanding one; that is, it is not due to the master/slave issue we had recently or any of that. That's clear from the timestamps, which in the above example well predate that outage. In case someone might think that the revision used to have the page id once upon a time and that the corruption occurred later: I checked the history dumps from July and Sept of last year for a couple of these revisions with earlier timestamps, and in each case the two good ones appeared in the file but not the bad one. That makes me pretty sure this is a failure at the time of the move, and probably still a bug in the code running now.

I hope that's enough information for someone who knows the innards of the move/delete stuff to hazard a guess at the problem.
Comment 3 Platonides 2011-03-20 01:11:26 UTC
There's no check in Title::moveToInternal() that Article::insertOn() really succeeded, so if it failed, the new revision would be created linking to a page with $newid = false, which would be converted to 0. Article::insertOn() fails if there's already a page with that title. But we just renamed the page, so there should be no page with that title, and there's no trace of anyone recreating it behind us.
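
To illustrate the false-to-0 conversion, here is a rough Python sketch. It is not the MediaWiki code: insert_on is a hypothetical stand-in for Article::insertOn(), the two create_redirect_* helpers are invented for this example, and the checked variant only shows one possible guard, not necessarily what any actual fix does.

```python
def insert_on(existing_titles, title, next_id):
    """Hypothetical stand-in for Article::insertOn(): returns the new page id,
    or False when a page with that title already exists."""
    if title in existing_titles:
        return False
    existing_titles[title] = next_id
    return next_id


def create_redirect_unchecked(existing_titles, title, next_id):
    new_id = insert_on(existing_titles, title, next_id)
    # No success check: False is silently coerced to 0 when stored as an integer.
    return {"rev_page": int(new_id)}


def create_redirect_checked(existing_titles, title, next_id):
    new_id = insert_on(existing_titles, title, next_id)
    if new_id is False:
        return None  # refuse to write a revision that has no page
    return {"rev_page": new_id}


pages = {"Old title": 7}  # a page already sits at the redirect's title
leaked = create_redirect_unchecked(pages, "Old title", 8)
guarded = create_redirect_checked(pages, "Old title", 8)
fresh = create_redirect_checked(pages, "Another title", 8)
print(leaked, guarded, fresh)
```

The unchecked path produces a revision record with rev_page 0, which is the shape of the bad rows in the table.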

It is interesting that the page was first misrenamed, but I don't see any trace of what it did after moving to [[Wikipedia:Tutorial/Editing)/sandbox]]

This move was:
[[Wikipedia:Tutorial (Editing)/sandbox]]->[[Wikipedia:Tutorial/Editing)/sandbox]]
[[Wikipedia:Tutorial (Editing)/sandbox]]->[[Wikipedia:Tutorial/Editing/sandbox]]

How is this possible? Consider this: Fuhghettaboutit clicked to move the page, but noticed the typo immediately, stopped the load, fixed the ')' and resubmitted. As Special:Movepage doesn't create a transaction, at that point *both requests were running at the same time* on the master. The second request fetched the old Article values, so it moved the real article, not the redirect (maybe also because Title::moveto() does not call getArticleID() with GAID_FOR_UPDATE). But by the time it came to create the redirect to the new entry, the first request had already created one. The INSERT IGNORE fails, but the revision is nonetheless inserted, leaking that entry.

I have been able to reproduce it locally.

touch lock
(while [ -f lock ]; do :; done; wget /index.php/Special:MovePage/Bug-26223 --post-data="action=submit&wpOldTitle=Bug-26223A&wpNewTitle=Bug-26223_$RANDOM&wpMove=yes&wpEditToken=%2B\\" )&
(while [ -f lock ]; do :; done; wget /index.php/Special:MovePage/Bug-26223 --post-data="action=submit&wpOldTitle=Bug-26223A&wpNewTitle=Bug-26223_$RANDOM&wpMove=yes&wpEditToken=%2B\\" )&
rm lock
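
The same interleaving can also be modelled deterministically, without a wiki. The following is only a sketch under assumptions: it uses an in-memory SQLite database with a cut-down page/revision schema, SQLite's INSERT OR IGNORE in place of MySQL's INSERT IGNORE, and the helper names read_page_id and finish_move plus the target titles Bug-26223_1 and Bug-26223_2 (standing in for the $RANDOM suffixes) are made up.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT UNIQUE);
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER,
                           rev_comment TEXT);
    INSERT INTO page VALUES (1, 'Bug-26223A');
""")


def read_page_id(title):
    """First half of a move request: look up the id of the page to move."""
    row = db.execute(
        "SELECT page_id FROM page WHERE page_title = ?", (title,)
    ).fetchone()
    return row[0] if row else 0


def finish_move(page_id, old_title, new_title):
    """Second half of a move request, using the (possibly stale) page id."""
    db.execute("UPDATE page SET page_title = ? WHERE page_id = ?",
               (new_title, page_id))
    db.execute("INSERT INTO revision (rev_page, rev_comment) VALUES (?, ?)",
               (page_id, "moved to " + new_title))
    # Create the redirect page at the old title; ignored if one already exists.
    cur = db.execute("INSERT OR IGNORE INTO page (page_title) VALUES (?)",
                     (old_title,))
    new_id = cur.lastrowid if cur.rowcount == 1 else False
    # The result is not checked, so False ends up stored as rev_page = 0.
    db.execute("INSERT INTO revision (rev_page, rev_comment) VALUES (?, ?)",
               (int(new_id), "redirect to " + new_title))


# Both requests read the page id before either one finishes its move.
id_a = read_page_id("Bug-26223A")
id_b = read_page_id("Bug-26223A")
finish_move(id_a, "Bug-26223A", "Bug-26223_1")
finish_move(id_b, "Bug-26223A", "Bug-26223_2")

rev_pages = [r[0] for r in
             db.execute("SELECT rev_page FROM revision ORDER BY rev_id")]
print(rev_pages)
```

In this model the second request's redirect insert is ignored because the first request already created the redirect page, yet its redirect revision is still written, giving two move revisions on the real page and one leaked revision with rev_page 0.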
Comment 4 Platonides 2011-10-25 21:14:40 UTC
For the record, the above bug (cause of revision leaking) was fixed in r84459.
Comment 5 Mark A. Hershberger 2011-10-25 23:55:11 UTC
(In reply to comment #4)
> For the record, the above bug (cause of revision leaking) was fixed in r84459.

