Last modified: 2012-12-19 14:17:55 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T21990, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 19990 - text of revisions in the archive table that were deleted before Wikipedia started using MediaWiki 1.5 is corrupt
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: History/Diffs
Version: unspecified
Hardware: All All
Importance: Lowest major
Target Milestone: ---
Assigned To: Tim Starling
URL: http://en.wikipedia.org/w/index.php?t...
Keywords: shell
Depends on:
Blocks: 16660
Reported: 2009-07-29 15:12 UTC by Graham87
Modified: 2012-12-19 14:17 UTC (History)

Description Graham87 2009-07-29 15:12:48 UTC
I was checking through deleted revisions in the main namespace by Conversion script on the English Wikipedia, to find old deleted edits to history merge:
http://en.wikipedia.org/w/index.php?limit=500&title=Special%3ADeletedContributions&target=Conversion+script&namespace=0

I found that in all pages deleted before Wikipedia was upgraded to MediaWiki 1.5 (late June 2005), all edits besides the latest one are corrupt. An undeleted example of these edits can be found above; the edits were previously at the title "Clearwater River, Idaho", and I history merged them to the existing article "Clearwater River (Idaho)". Another example involves the page about Michael Collins:
http://en.wikipedia.org/w/index.php?title=Michael_Collins&dir=prev&limit=6&action=history

The edits were previously at the title "Michael Collins (disambiguation)".

Even though 99.9% of the text in these old deleted archives is garbage, the other 0.1% is very important page history and it should not be corrupted.
Comment 1 Brion Vibber 2009-07-29 15:20:11 UTC
Possible external storage issue? Looks like something not getting un-gzipped or losing its flags.
Comment 2 Graham87 2009-07-29 15:46:58 UTC
I'm not sure if this is related, but some revisions before June 2005 are completely blank when they shouldn't be, as reported at this discussion on the technical village pump:

http://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)/Archive_62#Revision content disappeared

I didn't think much of it at the time, but both problems seem to involve Wikipedia text added before the upgrade to MediaWiki 1.5.
Comment 4 Graham87 2009-08-08 10:00:24 UTC
These deleted revisions from before June 2005 are fine:
http://en.wikipedia.org/wiki/Special:Undelete/Braille_music

They should stay deleted, since they were obviously nuked to make way for a page move.
Comment 5 Alexandre Emsenhuber [IAlex] 2009-08-28 07:26:06 UTC
This should be fixed in r55626.
Comment 6 Graham87 2009-08-28 08:23:08 UTC
It's fixed in the archive table where the MW 1.4 deleted revisions are.

However, the undeleted edits to "Clearwater River (Idaho)" and "Michael Collins" that I mentioned above are still corrupt. I tried deleting and undeleting them, just in case, and that didn't fix the issue. I highly doubt there are many other revisions with this problem.

I'm not sure of the proper protocol here: whether to re-open this bug, or start a new one ...
Comment 7 Tim Starling 2009-08-28 13:43:11 UTC
(In reply to comment #6)
> It's fixed in the archive table where the MW 1.4 deleted revisions are.
> 
> However, the undeleted edits to "Clearwater River (Idaho)" and "Michael Collins"
> that I mentioned above are still corrupt. I tried deleting and undeleting them,
> just in case, and that didn't fix the issue. I highly doubt there are many
> other revisions with this problem.
> 
> I'm not sure of proper protocol here: whether to re-open this bug, or start a
> new one ...

Anything that was undeleted while the bug was active will now be permanently corrupted and will need to be fixed manually.
Comment 8 Graham87 2009-08-28 15:17:03 UTC
Yikes, I thought as much. So ... what happens with this bug? The underlying issue is resolved but it's still caused damage that's seemingly hard to fix.
Comment 9 Alexandre Emsenhuber [IAlex] 2009-08-28 15:36:04 UTC
The only way to fix it is to update each corrupted row in the database, e.g. by manually adding "gzip" to the old_flags field. The problem is that it would be very difficult to find the affected revisions automatically.
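
As a rough illustration of why a missing flag produces garbage, and why restoring it is enough to repair a row, here is a small Python sketch. It models a text row as a blob plus a comma-separated flags string. The function names are hypothetical, and the assumption that "gzip" rows hold a raw DEFLATE stream is mine; this is not the actual MediaWiki code, and it does not touch a database.

```python
import zlib


def compress_raw(text: str) -> bytes:
    """Compress text as a raw DEFLATE stream (assumed storage format)."""
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    return compressor.compress(text.encode("utf-8")) + compressor.flush()


def read_text(blob: bytes, flags: str) -> str:
    """Decode a stored blob according to its comma-separated flags."""
    flag_set = {f for f in flags.split(",") if f}
    if "gzip" in flag_set:
        blob = zlib.decompress(blob, -15)
    # Without the flag, the compressed bytes are shown as-is, i.e. as garbage.
    return blob.decode("utf-8", errors="replace")


def add_flag(flags: str, flag: str) -> str:
    """Return flags with `flag` added, leaving existing flags untouched."""
    parts = [f for f in flags.split(",") if f]
    if flag not in parts:
        parts.append(flag)
    return ",".join(parts)


original = "The Clearwater River is a river in Idaho. " * 5
blob = compress_raw(original)

corrupt_view = read_text(blob, "")          # flag lost: unreadable text
repaired_flags = add_flag("", "gzip")       # the manual fix described above
repaired_view = read_text(blob, repaired_flags)
```

In this model the blob itself is never damaged; only the metadata that says how to read it is missing, which is why a one-field update per row would be sufficient once the affected rows are known.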
Comment 10 Graham87 2009-08-29 06:35:02 UTC
Then I'd like someone to fix the revisions I mentioned above:
http://en.wikipedia.org/w/index.php?title=Clearwater_River_(Idaho)&dir=prev&limit=16&action=history

and:
http://en.wikipedia.org/w/index.php?title=Michael_Collins&dir=prev&limit=6&action=history

As for finding other cases where it happened on the English Wikipedia, check whether the revision ID is greater than 296,365,718 and the revision date is before July 2005, i.e. when MW 1.4 was in use. I use a revision ID of 296365718 because it's the last uncorrupted revision I know of that was deleted and could have had this problem; see this diff:
http://en.wikipedia.org/w/index.php?title=User:Xaonon&diff=2406956&oldid=296365718

As far as I know, this would work because before MW 1.5 was used, a revision got a new rev_id when it was undeleted.
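
The heuristic above could be sketched roughly as follows in Python. The revision ID threshold and the July 2005 cutoff come from this comment; the function names, the 14-digit YYYYMMDDHHMMSS timestamp format, and the sample rows are assumptions made for illustration, not real data.

```python
from datetime import datetime, timezone

# Values taken from the comment above; both are heuristics, not exact cutoffs.
LAST_KNOWN_GOOD_REV_ID = 296365718
MW15_CUTOFF = datetime(2005, 7, 1, tzinfo=timezone.utc)


def parse_timestamp(ts: str) -> datetime:
    """Parse a 14-digit YYYYMMDDHHMMSS timestamp as UTC (assumed format)."""
    return datetime.strptime(ts, "%Y%m%d%H%M%S").replace(tzinfo=timezone.utc)


def is_candidate(rev_id: int, timestamp: str) -> bool:
    """True if a revision has a recent ID but a date from the MW 1.4 era."""
    return (
        rev_id > LAST_KNOWN_GOOD_REV_ID
        and parse_timestamp(timestamp) < MW15_CUTOFF
    )


# Invented sample rows: (rev_id, timestamp).
revisions = [
    (296365718, "20040105120000"),  # the threshold ID itself: excluded
    (300000001, "20040105120000"),  # new ID, old date: candidate
    (300000002, "20090801000000"),  # new ID, new date: ordinary edit
    (1234567, "20040207000000"),    # old ID, old date: not re-numbered
]
candidates = [rev_id for rev_id, ts in revisions if is_candidate(rev_id, ts)]
```

A filter like this only narrows the search. Each candidate would still need to be inspected by hand, since a recent ID with an old date shows that a revision was re-inserted, not that its text is actually corrupt.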
Comment 11 Diederik van Liere 2011-11-29 22:01:57 UTC
Tim, do you think this is something that still can and should be recovered or just close as WONTFIX?
Comment 12 Andre Klapper 2012-12-19 14:17:55 UTC
Realistically closing this as WONTFIX nowadays.
