Last modified: 2008-08-01 02:00:02 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T16933, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 14933 - New revisions occasionally created with wrong text (but correct rev_len)
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Highest critical
Target Milestone: ---
Assigned To: Nobody
URL: http://en.wikipedia.org/wiki/Wikipedi...

Reported: 2008-07-26 10:57 UTC by Ilmari Karonen
Modified: 2008-08-01 02:00 UTC (History)
CC List: 9 users

Description Ilmari Karonen 2008-07-26 10:57:18 UTC
It appears that recently some edits on the English Wikipedia (possibly elsewhere too?) have resulted in revisions that are blank or contain text from other, unrelated pages.  Oddly, the byte count reported in the page history (based on the rev_len field), as well as the corresponding information in the recentchanges table, matches the content that _should've_ been there.

For example, the revision http://en.wikipedia.org/w/index.php?title=Talk:Pikachu&oldid=227969847 is blank, even though the page history reports its length as 22,396 bytes.  See also discussion at:

http://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28technical%29#Bug:_revisions.2Fpagesizes.2Fpagerendering.2Fwikisource_not_matching_up.2C_resulting_in_blanking_or_page_replacements
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents#SYSTEM_BUG:_rollback_replaced_a_page_by_an_irrelevant_page_instead_of_reverting

I'm marking this as critical in case this is a symptom of more serious database corruption.  Feel free to downgrade if it turns out to be something more benign.
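
A minimal sketch of the symptom described above, assuming rev_len holds the UTF-8 byte length of the revision text. The function name `check_revision` and its return labels are hypothetical and not part of MediaWiki; the 22,396-byte figure is the one reported for the Talk:Pikachu revision.

```python
def check_revision(rev_len: int, text: str) -> str:
    """Compare the recorded length with the byte length of the stored text."""
    actual = len(text.encode("utf-8"))  # assumption: rev_len counts UTF-8 bytes
    if actual == rev_len:
        return "ok"
    if actual == 0:
        return "blank"      # text is empty although rev_len says otherwise
    return "mismatch"       # some text is there, but not the expected amount


# The Talk:Pikachu revision: history reports 22,396 bytes, the page is blank.
print(check_revision(22396, ""))
```

A check like this would miss a replacement text that happens to have the same byte length, so it can only flag suspects, not clear a revision.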
Comment 1 Splarka 2008-07-26 11:11:23 UTC
from http://toolserver.org/~amidaniel/chanlogs/%23mediawiki/20080726.txt ->

 [09:35:22] <Sadik_Khalid>	Hi, when I tried to edit this page (http://ml.wikipedia.org/wiki/%E0%B4%B2%E0%B5%82%E0%B4%AF%E0%B4%BF_%E0%B4%AA%E0%B4%BE%E0%B4%B8%E0%B5%8D%E0%B4%9A%E0%B4%B0%E0%B5%8D%E2%80%8D)  I am getting Egypt page (http://ml.wikipedia.org/wiki/Egypt)
 [09:37:45] <Sadik_Khalid>	History page don't mach with the content of the article

Comment 2 Ilmari Karonen 2008-07-26 11:16:36 UTC
Changing title since this occurs outside enwiki.
Comment 3 Aaron Schulz 2008-07-26 11:20:50 UTC
Possibly related to bug 14930
Comment 4 Aaron Schulz 2008-07-26 11:28:21 UTC
Also may be related to the recent ext. storage problems on one cluster (https://wikitech.leuksman.com/view/Server_admin_log)
Comment 5 Aaron Schulz 2008-07-26 11:36:04 UTC
OK, I can't find any relevant software changes. I'm almost sure this is due to the above issue. As things stand now, no *new* edits should be recorded wrongly anymore.
Comment 6 JeLuF 2008-07-26 17:15:57 UTC
This happened due to a master switch on the external storage cluster.

Apparently, the new master didn't have an up-to-date replica of the old master; a few records were missing. Because of this, the same text IDs were used twice. The edits saved on the old master that were not replicated to the new master are lost, and there is no way to get them back.

I have to close this bug as "FIXED" because there's no "CANTFIX".
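
The ID reuse described above can be illustrated with a toy model. This is only a sketch under assumptions: `BlobStore`, `core_db` and the edit names are hypothetical stand-ins, and the real external storage schema is not reproduced here.

```python
class BlobStore:
    """Toy stand-in for an external storage master with auto-increment IDs."""

    def __init__(self, rows=None):
        self.rows = dict(rows or {})

    def insert(self, text):
        new_id = max(self.rows, default=0) + 1
        self.rows[new_id] = text
        return new_id


core_db = {}  # hypothetical: edit name -> blob ID recorded by the core DB

old_master = BlobStore()
old_master.insert("rev A")
old_master.insert("rev B")

# The replica copied the first two rows, then fell behind.
new_master = BlobStore(old_master.rows)

# One more edit lands on the old master and is never replicated.
core_db["edit X"] = old_master.insert("text of X")

# After the master switch, the next edit is written to the new master,
# which hands out the same ID again.
core_db["edit Y"] = new_master.insert("text of Y")

print(core_db)
print(new_master.rows[core_db["edit X"]])  # edit X now resolves to Y's text
```

In this model edit X's text still exists on the old master only, which is why reading it through the new master returns unrelated content.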
Comment 7 Tim Starling 2008-07-26 20:30:55 UTC
It wasn't fixed. srv104 still had an old copy of the configuration (because it's not reachable by ssh), and so it was still writing blobs to srv101. I've taken srv104 out of LVS rotation now. Maybe we'll be able to recover the edits from srv101 at some point, but it looks like it might be hanging on I/O now.
Comment 8 Stig Meireles Johansen 2008-07-27 14:56:51 UTC
Occurred here as well: http://no.wikipedia.org/w/index.php?title=Vinterkrigen&diff=next&oldid=4096373
Comment 9 jeroenvrp 2008-07-29 00:29:14 UTC
I can confirm this on nl.wikipedia too.

See e.g. http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&diff=13286529&oldid=13139324

In the recent changes this revision is listed as adding 15 bytes, but the page is empty:
http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&action=edit&oldid=13286529

See also http://nl.wikipedia.org/w/index.php?title=Yang_Yaozu&action=history (2.159 bytes vs. 2.144 bytes).
Comment 10 jeroenvrp 2008-07-29 00:30:05 UTC
OK, I didn't see it was fixed.
Comment 12 Splarka 2008-07-30 02:39:42 UTC
Unsure if related, but these do not show the revision #798283: 
* http://meta.wikimedia.org/w/index.php?title=Help:Magic_words&oldid=798283
* http://meta.wikimedia.org/w/index.php?title=-&oldid=798283 
* http://meta.wikimedia.org/w/index.php?title=Help:Magic_words&diff=prev&oldid=798283

And yet, these do (sort of):
* http://meta.wikimedia.org/w/index.php?title=Help:Magic_words&diff=798284&oldid=798283
* http://meta.wikimedia.org/w/api.php?action=query&prop=revisions&revids=798283&rvprop=size|content

Although, per VP/T, Tim said:
> It looks like the anomalous blank revisions are just cache pollution, and will 
> fix themselves when the cache expires in a week. The revisions that show the 
> wrong article are due to database corruption, and will need to be fixed manually.
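
For checking a suspect revision, the api.php query in the list above can be built programmatically. This is a sketch only: the helper name `revision_query_url` is hypothetical, and `format=json` is an addition not present in the URL quoted in this comment.

```python
from urllib.parse import urlencode


def revision_query_url(api_base: str, rev_id: int) -> str:
    """Build a query for one revision's reported size and stored content."""
    params = {
        "action": "query",
        "prop": "revisions",
        "revids": rev_id,
        "rvprop": "size|content",
        "format": "json",  # assumption: added here for machine-readable output
    }
    return f"{api_base}?{urlencode(params)}"


url = revision_query_url("http://meta.wikimedia.org/w/api.php", 798283)
print(url)
```

Comparing the returned size with the byte length of the returned content would show whether the two disagree for that revision.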
Comment 13 Daniel Schwen 2008-07-31 19:25:13 UTC
This edit is attributed to my bot
http://commons.wikimedia.org/w/index.php?title=Image%3AHyena_pup.jpg&diff=13062289&oldid=12189366

But it is pretty much impossible that the bot performed it (nothing remotely similar to CopyVio tagging is in the source code).

Might be due to the same server issue, although the nature of the glitch seems different from the ones reported.
Comment 14 Platonides 2008-07-31 21:00:14 UTC
(In reply to comment #13)
> But it is pretty much impossible that the bot performed it (nothing remotely
> similar to CopyVio tagging is in the source code).
> 
> Might be due to the same server issue, although the nature of the glitch seems
> different from the ones reported.
 
Also note that the length reported in the history is larger than the edit.
I understand this happens because the write goes to the false master and
then the real one reuses the same revision ID.

We could probably find, among the deleted revisions from around the same time,
another one with that same content.


Another magic blanking:
http://es.wikipedia.org/w/index.php?title=Wikipedia:Vandalismo_en_curso&diff=19107113&oldid=19107017
Comment 15 Tim Starling 2008-08-01 01:19:55 UTC
Should be fixed as of July 30, 03:00 UTC. Initially, ordinary edits processed by srv101/srv104 polluted the revision cache, which has an expiry of one week. This was identified and fixed (without me ever seeing this bug report) on July 27, by removing those servers from HTTP LVS. However, they continued to run the job queue, and refreshLinks jobs would have continued to pollute the revision cache. This was fixed on July 30, by firewalling srv101/104 from all core DB servers.
Comment 16 Tim Starling 2008-08-01 01:57:15 UTC
I'm running a script to fix the revision cache. This will make the old revision view and old revision edit work properly. Any broken diffs will have to be fixed manually by appending &action=purge to the diff URL. 
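
A small sketch of the manual fix mentioned above, appending action=purge to a diff URL. The helper name `add_purge` is hypothetical, and the example URL is one of the diffs reported earlier in this thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_purge(diff_url: str) -> str:
    """Return the URL with action=purge appended, unless already present."""
    parts = urlsplit(diff_url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if ("action", "purge") not in query:
        query.append(("action", "purge"))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_purge(
    "http://nl.wikipedia.org/w/index.php"
    "?title=Yang_Yaozu&diff=13286529&oldid=13139324"
))
```

The helper only rewrites the URL; the purge itself happens when the resulting URL is requested.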
Comment 17 Tim Starling 2008-08-01 02:00:02 UTC
Note that the script only affects page blankings (which are due to cache pollution), not replacement with unrelated text, which is due to corruption of the core DB with incorrect text rows referencing blob_ids on the old cluster17 master, srv101.
