Last modified: 2011-10-05 05:04:25 UTC
Created attachment 9104: screenshot of "there is no page" for page that should exist
I was browsing today on MediaWiki.org and got a "There is currently no text in this page. You can search for this page title in other pages, search the related logs, or edit this page." message for a page that definitely existed and had content (you could see it in the history view). I purged it, and the page's content reappeared. I then hit Special:Random a bunch of times to see if this was a one-time issue or if more than one page was affected. Within about 10 Special:Random hits, I got another "missing" page (screenshot attached), which would seem to indicate that the problem is affecting quite a few pages. I see ES mentioned a lot in the sidebar bug (bug 31100), so this is possibly related(?)
Sidebar bug 31179 (went ahead and split it out) is possibly related to External Storage intermittent read failures, though we have not determined that for certain. I don't think the same thing would cause this though; it looks like the [[MediaWiki:noarticletext]] message only comes out for non-existent pages (no page table record); if you had an ES fail grabbing the text it would still think the page exists, and would come back showing you empty page contents.
Actually.... it would display that way. Down in Article::view() there's a check for empty content on pass 2 (after checking the parser cache) which looks for false (bogus/errored) text and then calls Article::showMissingArticle() which whips that message back out. So this could be plausible. :D
*** Bug 31288 has been marked as a duplicate of this bug. ***
These pages come with page_latest=0. This seems to be the same issue reported on wikitech-l as breaking pywikipediabot. Looks like a 1.18 regression.
Could be; sounds like an editing/saving bug...?
I can't seem to find any rows in the DB with page_latest = 0 for MW.org.
*** Bug 31312 has been marked as a duplicate of this bug. ***
The page [[mw:Extension:Firefox_toolbar/fr/UI]] originally cited in this bug report doesn't appear to have been edited since 2007 and doesn't _appear_ to have page_latest=0 (at least from what I can see outside). It might not be the same problem as the ones that are reporting complete breakage, or it might have been 'incomplete' along the way.
Data points on nl.wikipedia.org:
[[nl:Ben_Tiggelaar]] -- apparently had page_latest=0, has been fixed manually?
[[nl:User_talk:RM21/Overleg]] - page record with no live revisions (partial deletion in 2007?). Was deleted/restored a couple times in the past.
[[nl:Blankenbach]] - apparently has/had page_latest=0 but lots of other activity. Has been deleted and undeleted twice in last few days.
On simple.wikipedia.org: [[simple:Deal_or_No_Deal_UK]] - a redirect stub created 27 September, no other history
Recording pairs of page_ids and titles here, so we can look at them later. These are all page_latest=0 and page_is_new=1.
ptwiki:
2212847 201.86.189.213
2212848 189.4.181.187
2212849 189.10.252.194
2213433 201.35.181.75
all these are namespace 3, all with timestamps 20090415... or 20090416...
ruwiki:
1936451 Гинько,_Елена_Валерьевна
2475302 Kalashist
eswiki:
2343377 190.134.174.70
2355993 200.64.55.191
2438562 200.50.8.74
3079560 186.9.18.135
all namespace 3
enwiki:
19113987 1r3gr37n0n
19399178 Vd437
20513897 Heelo1
21622712 88.107.34.0
22427099 98.108.121.19
22427147 72.227.225.75
22427162 68.59.212.62
22427179 146.186.59.74
22427194 HOTPOCKETSG
22427196 70.181.94.68
22427202 Jaeh0317
22441341 24.78.158.115
23391967 24.143.15.213
23398229 96.250.7.85
23400278 173.79.110.150
23994255 68.91.91.22
24090446 Mightym53821
all ns 3 as well. 8 of these have the timestamp 20090415 or 20090416. The rest are scattered about, including one from 20010902.
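For anyone wanting to repeat this kind of search, a minimal sketch of the lookup follows. It is only an illustration: it uses an in-memory SQLite table as a stand-in for the real `page` table, keeps just the columns mentioned in this bug, and the healthy `Main_Page` row is made up for contrast. The two broken rows are the first two ptwiki entries listed above.

```python
import sqlite3

# In-memory stand-in for the wiki's `page` table (assumption: only the
# columns relevant to this bug are modelled here).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE page (
           page_id        INTEGER PRIMARY KEY,
           page_namespace INTEGER,
           page_title     TEXT,
           page_latest    INTEGER,
           page_is_new    INTEGER
       )"""
)
conn.executemany(
    "INSERT INTO page VALUES (?, ?, ?, ?, ?)",
    [
        (2212847, 3, "201.86.189.213", 0, 1),  # broken row from the ptwiki list
        (2212848, 3, "189.4.181.187", 0, 1),   # broken row from the ptwiki list
        (1, 0, "Main_Page", 100, 0),           # hypothetical healthy row
    ],
)

# Rows stuck in the state described in this bug: inserted, never updated.
broken = conn.execute(
    """SELECT page_id, page_namespace, page_title
         FROM page
        WHERE page_latest = 0 AND page_is_new = 1
        ORDER BY page_id"""
).fetchall()

for page_id, namespace, title in broken:
    print(page_id, namespace, title)
```

On a real wiki the same WHERE clause would be run against the actual database, per wiki.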
(In reply to comment #9)
> Data points on nl.wikipedia.org:
>
> [[nl:Ben_Tiggelaar]] -- apparently had page_latest=0, has been fixed manually?
>
> [[nl:User_talk:RM21/Overleg]] - page record with no live revisions (partial
> deletion in 2007?). Was deleted/restored a couple times in the past.
>
> [[nl:Blankenbach]] - apparently has/had page_latest=0 but lots of other
> activity. Has been deleted and undeleted twice in last few days.
Yeah, I fixed the first one.
Here is a problematic scenario:
Situation: page restored when no live page already exists at that title
1) SpecialUndelete::undeleteRevisions()
2) SpecialUndelete::undeleteRevisions() does:
   $newid = $article->insertOn( $dbw );
   A new page row is inserted with page_latest=0, page_len=0, page_is_new=1
   This also sets mTitle in the WikiPage to have correct ID.
3) SpecialUndelete::undeleteRevisions() does:
   $oldcountable = $article->isCountable();
4) isCountable() calls isRedirect(), which causes loadPageData() to be done on a slave. The data isn't there yet on the slave, so the page is loaded as not existing. loadPageData() does:
   $this->mTitle->loadFromRow( false );
   ...which overrides mTitle in the WikiPage to 0
5) SpecialUndelete::undeleteRevisions() does:
   $article->updateIfNewerOn( $dbw, $revision, $previousRevId ),
   called with $previousRevId = 0 since no page existed at the title before restoring
6) WikiPage::updateIfNewerOn() calls WikiPage::updateRevisionOn() since the page doesn't exist
7) WikiPage::updateRevisionOn() does:
   $conditions = array( 'page_id' => $this->getId() );
   $this->getId() uses mTitle->getArticleID(), which was corrupted as 0. Thus, the UPDATE fails to update the row in (2), and it is stuck with those values
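The sequence above can be reproduced in a small simulation. This is a rough Python model, not MediaWiki code: the class and method names only mirror the PHP ones, two plain dictionaries stand in for the master and the lagging slave, and the revision ID and length passed in at the end are made-up values.

```python
class Title:
    def __init__(self, text):
        self.text = text
        self.article_id = 0

    def load_from_row(self, row):
        # Models loadFromRow( false ): a missing row resets the ID to 0.
        self.article_id = row["page_id"] if row else 0


class WikiPage:
    def __init__(self, title, master, replica):
        self.title = title
        self.master = master    # page_id -> row, always current
        self.replica = replica  # page_id -> row, lagging behind the master
        self.data_loaded = False

    def get_id(self):
        return self.title.article_id

    def insert_on(self):
        # Step 2: insert the placeholder row and remember its ID on the title.
        new_id = len(self.master) + 1
        self.master[new_id] = {"page_id": new_id, "title": self.title.text,
                               "page_latest": 0, "page_len": 0, "page_is_new": 1}
        self.title.article_id = new_id
        return new_id

    def load_page_data(self):
        # Step 4: read from the replica, which has not seen the insert yet.
        if self.data_loaded:
            return
        row = next((r for r in self.replica.values()
                    if r["title"] == self.title.text), None)
        self.title.load_from_row(row)
        self.data_loaded = True

    def is_countable(self):
        # Step 3: only the side effect (the lazy load) matters here.
        self.load_page_data()
        return False

    def update_revision_on(self, rev_id, rev_len):
        # Step 7: UPDATE ... WHERE page_id = get_id(); an ID of 0 matches nothing.
        row = self.master.get(self.get_id())
        if row is None:
            return False
        row.update(page_latest=rev_id, page_len=rev_len, page_is_new=0)
        return True


master, replica = {}, {}
page = WikiPage(Title("Blankenbach"), master, replica)
new_id = page.insert_on()
page.is_countable()
updated = page.update_revision_on(rev_id=42, rev_len=1234)
print(updated, master[new_id])
```

In this model the final update matches no row, so the row created by `insert_on()` keeps page_latest=0 and page_is_new=1, which is the state seen in the affected pages.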
Good catch -- this looks like it could plausibly have been breaking pages off and on for a while (hence some of the older cases that aren't new). Probably Article::insertOn (or rather WikiPage::insertOn) should save the updated ID and whatnot and mark itself as having loaded state (set $this->mDataLoaded). This'll bypass trying to reload the data, without having to explicitly say "btw load this from master".
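A sketch of that idea, again as a Python model rather than the real PHP: the only change relative to the failing sequence is that the insert step also sets a "data loaded" flag, so the later lazy load never consults the lagging slave. The dictionary-based page object, the function names and the revision values are all illustrative assumptions, and this is not necessarily what the eventual commit does.

```python
def insert_on(page, master):
    # Insert the placeholder row, remember the new ID, and (the suggested
    # change) mark the page object as already loaded.
    new_id = len(master) + 1
    master[new_id] = {"page_latest": 0, "page_len": 0, "page_is_new": 1}
    page["id"] = new_id
    page["data_loaded"] = True
    return new_id


def load_page_data(page, replica):
    # Lazy load from the replica (keyed by title here); skipped once loaded.
    if page["data_loaded"]:
        return
    row = replica.get(page["title"])
    page["id"] = row["page_id"] if row else 0
    page["data_loaded"] = True


def update_revision_on(page, master, rev_id, rev_len):
    # Models UPDATE page SET ... WHERE page_id = <page id>.
    row = master.get(page["id"])
    if row is None:
        return False
    row.update(page_latest=rev_id, page_len=rev_len, page_is_new=0)
    return True


master, replica = {}, {}  # the replica is empty: it has not caught up yet
page = {"title": "Ben_Tiggelaar", "id": 0, "data_loaded": False}

new_id = insert_on(page, master)
load_page_data(page, replica)  # no-op, so the ID set by insert_on survives
ok = update_revision_on(page, master, rev_id=42, rev_len=1234)
print(ok, master[new_id])
```

With the flag set, the update finds the row by its real ID and page_latest is filled in even though the replica is still empty.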
Fixed in r98927.
Re-opening as some instances have come up still.
http://en.wikipedia.org/wiki/User_talk:72.27.85.119 and http://en.wikipedia.org/wiki/User:Anomie/Sandbox12 are a couple of current examples of this behavior.
Closing again. This bug originally referred to pages with several edits and broken page_latest values involving deletion/restoration. The "new instances" were actually an issue with new pages rather than existing ones. They were caused by... a bug in a logging hack intended to confirm the absence of *this* bug, somewhat ironically. That code was removed, and a script was run to clean up all affected pages.