Last modified: 2012-10-29 16:39:58 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T23279, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 21279 - Regular deletion of revisions deleted with rev_deleted breaks links in log entries
Regular deletion of revisions deleted with rev_deleted breaks links in log en...
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Revision deletion (Other open bugs)
unspecified
All All
: High critical with 4 votes (vote)
: ---
Assigned To: Brion Vibber
:
Depends on:
Blocks: revdel 28124 28819 28820 28821 SWMT 18780 21165 29068
  Show dependency treegraph
 
Reported: 2009-10-25 15:42 UTC by FT2
Modified: 2012-10-29 16:39 UTC (History)
19 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description FT2 2009-10-25 15:42:36 UTC
With admin level RevisionDelete now viewable and not the same as suppression in practice, we're seeing revisions with a wide range of combinations of delete/restore and RevDelete activity. 

A major problem has come to light. The differences in how deleted and undeleted revisions are referenced, discussed in bug 18104, are causing RevisionDelete to break badly all over the place. Examples:

1. Links in the suppression log often now error out. For example in my suppression 00:21 October 25 2009 (and several following it), "change visibility" now clicks through to non-working links. the links that once worked, no longer work, presumably since the revision(s) have since been deleted/undeleted.

2. Diffs no longer show page history properly. Revisions exist where something's missing in history but it's not clear what. For example the first suppression on the deleted article [[Chris Fournier]], by IP user 124.171.155.209. Clicking "diff" on this revision shows a diff that appears to delete data but not add any new data (nothing on right hand side), and a revision text that includes clear defamatory material that (apparently) was neither introduced in that diff, nor existed in the previous revision.

3. Revisions that were moved from "normal" to "deleted" after some RevDelete actions and before others, change their designation from "revision" to "archive" or something. RevDelete logs and page history becomes very hard to follow as the links fail and as revisions are split between multiple pages and sections within pages.


Sorry to throw this at the developers, but this is a ghastly mess; it's getting hard to track down what happened when something comes up on a wiki. 

I think trying to keep two parallel schemes for "deleted revisions" and "all other revisions" is to blame. 

The solution seems to be to ensure deleted revisions can also be looked up (and DIFFed) by their original revision_id even after deletion, which would mean RevisionDelete and other functionality wouldn't have to account for a possible change of reference id at some future time.


Suggestion for fixing: 

* Ensure that revision_id's will transparently link through to the correct deleted revision if the revision is deleted, preventing breakage of links when a revision is deleted or restored. This allows revisions to be referenced by one fixed identifier in RevisionDelete regardless of later deletion/restoration.

* Modify RevDel and all functions that add page information to the logs, so that all links related to a revision are referenced by the revision's revision_id alone, whether or not the revision is deleted.


A big job but it's got to the point that we need revisions accessible via the same reference id (possibly keeping timestamp as a 2nd identifier for ease and compatibility) whether they are deleted or not.
Comment 1 FT2 2009-10-25 15:46:23 UTC
(2 line summary:  The interaction of RevDel and traditional delete, which involves multiple systems of revision identification, is causing major problems for page referencing, and leading to widespread issues of broken links and "lost" page history, and apparently missing or incorrect diffs.)
Comment 2 Happy-melon 2009-10-25 17:00:13 UTC
Fundamentally, "deleting" stuff by moving it from the revision table to an archive table is not the most durable way of doing it.  Using another bit of the rev_deleted bitfield to indicate "deleted" would allow us to 'delete' just by setting that bit, and duplicate traditional oversight (bug20290) by setting the bit along with the "hide from admins" bit.  And it would facilitate selective deletion, etc etc.  But of course, that would involve rewriting our entire deletion and page management architecture (would have to stop deleting rows from the page table on deletion, but instead mark them with a page_deleted).  Any volunteers?  :D
Comment 3 FT2 2009-10-25 17:26:25 UTC
Making the transparent mapping "usual revision_id/url" -> "deleted revision_id/url" work, either as an additional field or a failover catch in software, might work reasonably well as a short term fix. Revision_id would then at least be guaranteed to locate all edits/links/revisions/diffs whether deleted or not, until a more complete fix was possible.
Comment 4 FT2 2009-10-27 17:14:23 UTC
See also bug 21312 for a possible option.
Comment 5 FT2 2009-12-01 23:17:14 UTC
Confirming for user info how this breaks in a separate case. If RevDelete was used on multiple revisions, and some of those revisions later change deleted/undeleted state, then an attempt to click on the RevDelete action's link in the delete log to view the revisions concerned, will show (eg) 5 revisions in the log, but only 3 revisions will be listed when clicked. Locating the other 2 actions the admin acted upon could be easy or almost impossible, depending on the history and deleted history of the page.
Comment 6 FT2 2009-12-11 17:10:29 UTC
(See also bug 18780 comment #12 for a few extra details)
Comment 7 Aaron Schulz 2010-01-30 09:32:39 UTC
I have a deleted_pages table local branch. It can get around this problem, the O(N) delete/restore problem, and page suppression issues.

Dusted it off today, with a few merge conflict fixes. I don't see anything changing soon (which I why stopped bothering with the branch after a while).
Comment 8 Platonides 2010-03-07 22:08:03 UTC
Aaron, time to merge into trunk?
Comment 9 DerHexer 2010-05-18 17:23:52 UTC
RevisionDelete was enabled now; what will happen with that urgent bug request concerning RevDelete?
Comment 10 Andrew Garrett 2010-05-19 00:12:34 UTC
Assigning to Aaron.

Aaron: The concise version of this bug is that revision deletion log entries contain links to the revisions in question, which fail when those revisions have been deleted using the standard action=delete method.
Comment 11 FT2 2010-05-19 00:58:20 UTC
... also in the much less common situation where RevDelete is applied to a deleted revision which becomes undeleted in future.

(Scenarios can arise where RevDelete would be applied to a deleted revision, but obviously not nearly as common)
Comment 12 Andrew Garrett 2010-05-19 10:17:30 UTC
Steps to reproduce (on a local trunk wiki):

1. Create a page with two revisions.
2. Use the single-revision deletion (rev_deleted column) interface to delete the first revision of the page.
3. Delete the page.
4. Undelete the page.
5. View Special:Log/delete for the page.
6. Click on one of the "more", "diff" or "change" links.

Expected result:
The interface shows the diff, suppression details or visibility editing interface.

Actual result:
An error message is shown.
Comment 13 FT2 2010-05-19 10:20:34 UTC
Trivial example page (4 revisions, 2 revdeleted actions) exists at: http://en.wikipedia.org/wiki/Special:Undelete/User:FT2/test4 showing the errors generated by the 2 different modes of failure of log entries (revdeleted then delete, or revdeleted then undelete).
Comment 14 Platonides 2010-05-19 11:09:44 UTC
(In reply to comment #12)
> Steps to reproduce:

Works for me on http://test.wikipedia.org/wiki/Bug_21279_-_Two_revs
Comment 15 DerHexer 2010-05-28 20:40:13 UTC
The same bug happens with page moves.

Please have a look at http://en.wikipedia.org/w/index.php?title=Special:Log&dir=prev&offset=20100331100647&limit=1&type=delete&user=DerHexer Clicking on "more..." takes you to http://en.wikipedia.org/w/index.php?title=Special:RevisionDelete&target=User%3ADerHexer%2Ftest2&type=revision&ids=364727852 which tells me "No matching items in log." That only happened because I've moved my page: http://en.wikipedia.org/w/index.php?title=User:DerHexer/test3&action=history Nobody can see now who did that revision deletion. That's a transparency issue which should be fixed as fast as possible.
Comment 16 FT2 2010-06-09 10:37:49 UTC
Updated test - fix needed:

1/ revision with obvious deletions is not showing log entries.

Look at the first RevDel action at

http://en.wikipedia.org/w/index.php?title=Special:Log&page=User%3AFT2%2Ftest7

The link for "visible revisions" leads to a RevDel page with 2 log actions (correct). The link for "deleted revisions" leads to a RevDel page with none (error).
Comment 17 FT2 2010-06-09 10:41:20 UTC
At the same time, can the RevDel page indicate clearly if the revisions listed are "Visible revisions" or "Deleted revisions". It would help.

Example page:

http://en.wikipedia.org/w/index.php?title=Special:RevisionDelete&target=User%3AFT2%2Ftest7&type=revision&ids=366963568

Simple fix, if the text at the top states "Selected [visible | deleted] revision(s) of {{PAGENAME}}"

and under "Other/additional reason" a line in bold "These revisions are currently [visible | deleted]".

Thanks!
Comment 18 FT2 2010-06-09 11:03:58 UTC
Visible revisions have checkboxes and a "del/undel selected revisions" for multiple revisions in an action.

Can [[Special:Undelete]] have a similar button added? It already has the checkboxes.

A "del/undel selected revisions" button would be useful (eg to remove a RevDelete redaction on multiple deleted revisions), or to prevent the data being visible if deletion is reversed (eg page deletion is undone)
Comment 19 FT2 2010-06-09 11:07:43 UTC
2nd fail case, see:

http://en.wikipedia.org/w/index.php?title=Special:Log&page=User%3AFT2%2Ftest8

Reproduction:

1. Create page with 3 revisions
2. Delete page
3. RevDelete a field from middle revision (eg edit summary)
4. Undelete page

Page delete log for the RevDelete action fails with "bad links".
Comment 20 FT2 2010-06-09 11:16:32 UTC
Last, when the above fail case is fixed, can you check it also fixes the following (minor) fail case which I believe may be a byproduct of it.


I considered what happens if RevDelete is used on on old deleted revision without a rev_id and the revision is later restored - the issue being no rev_id so would log links then fail.

I theorized that if such a revision were selective undeleted and then redeleted, the selctive undeletion assigns a rev_id, so once redeleted this wouldn't be a problem, providing a quick workaround. In other words:

1. Deleted revision without rev_id is redacted
2. Revision is undeleted
3. Deletion log links for the RevDelete action may fail.

WORKAROUND:

1. Deleted revision without rev_id is undeleted and gains a rev_id, then redeleted
2. Revision is _then_ redacted and undeleted at any future time
3. Deletion log links for the RevDelete action should work correctly as the redaction will be linked to a correct permanent rev_id.


Can you check this at the same time? I suspect the issue at comment #19 may be affecting this so it can't be tested yet. I appreciate it's a pure edge case, but it's possible we may see some revisions moving from deletion to revdelete and it would be nice to check the log links issue wouldn't resurface or be an issue.
Comment 21 Happy-melon 2010-06-20 18:37:41 UTC
(In reply to comment #18)
> Visible revisions have checkboxes and a "del/undel selected revisions" for
> multiple revisions in an action.
> 
> Can [[Special:Undelete]] have a similar button added? It already has the
> checkboxes.
> 
> A "del/undel selected revisions" button would be useful (eg to remove a
> RevDelete redaction on multiple deleted revisions), or to prevent the data
> being visible if deletion is reversed (eg page deletion is undone)

These are not directly related to this bug; please file (or find, I suspect this may have been raised before) separate bugs for them.

(In reply to comment #20)
> I theorized that if such a revision were selective undeleted and then
> redeleted, the selctive undeletion assigns a rev_id, so once redeleted this
> wouldn't be a problem, providing a quick workaround. In other words:

You could check whether this definitely occurs with an old revision of the sandbox or something.  I can imagine it actually making even more of a mess than that if ar_rev_id is null.
Comment 22 FT2 2010-06-20 21:36:43 UTC
(In reply to comment #21)

1/ I'm using this bug as a tracking bug for the few last issues Werdna needs to check, to put the "RevDelete log links" issues to bed at this point.  It sounds like he'll be tackling the last few together so this makes it easier for him to check them off, rather than opening several bugs for matters he will address in common.  of course any tagged as "wontfix" or "later" may need extra bugs opened, but I think on this one, let's see how it goes before creating paperwork that may never be needed.

2/ I've tested the bug at #20 and tested revDelete on old deleted revisions of this kind. It appears to fail persistently. However it also fails on deleted revisions that do have an ar_rev_id. So at the moment it's impossible to tell whether in fact that's the problem and (by itself) old deleted revisions with ar_rev_id = null are actually not a big problem. 

So I've noted this as an outstanding issue, and asked Werdna, when he's fixed RevDelete on deleted revisions that are later undeleted, to go back and recheck this one and see if that fixes this point as well. It's easier that way than waiting, then filing a new bug afterwards.
Comment 23 Happy-melon 2010-06-20 21:43:37 UTC
(In reply to comment #22)
> (In reply to comment #21)
> 
> 1/ I'm using this bug as a tracking bug for the few last issues Werdna needs to
> check, to put the "RevDelete log links" issues to bed at this point.  

The "button to mass-revdel deleted revisions is missing" issue is not related to this "RevDelete log links is borked" bug, nor is it something which will be fixed as a side-effect.  They're totally unconnected, and it just clutters things to lump them together.  

> 2/ I've tested the bug at #20 and tested revDelete on old deleted revisions of
> this kind. It appears to fail persistently. However it also fails on deleted
> revisions that do have an ar_rev_id. So at the moment it's impossible to tell
> whether in fact that's the problem and (by itself) old deleted revisions with
> ar_rev_id = null are actually not a big problem. 

Fair point :D

> So I've noted this as an outstanding issue, and asked Werdna, when he's fixed
> RevDelete on deleted revisions that are later undeleted, to go back and recheck
> this one and see if that fixes this point as well. It's easier that way than
> waiting, then filing a new bug afterwards.

I'm not saying that "log links might also break in *this* way" or "log links also break in *that* way" should be separate bugs; it's certainly reasonable to keep them here.  I'm saying that *totally separate issues* should be kept, well, separate... :D
Comment 24 DerHexer 2010-11-25 20:24:41 UTC
Log entries also break if the page was moved. You can only see what happened if you're looking for the “revision author” in Special:Log/suppress. It can neither be found by “user” nor by “title”.
Comment 25 FT2 2011-03-17 13:15:13 UTC
Anyone working on this - see bug 18780 which contains further discussion that could be relevant, or shed light on it.
Comment 26 Brion Vibber 2011-05-03 22:40:28 UTC
The original case for this bug appears to have pretty much been resolved with a workaround:

1) Adjust visibility for some revision of a page that exists
2) Delete the page (or delete the page, then undelete all but that revision -- in any case, end up with the rev you adjusted deleted)
3) Look at the log entries for the page

The link on the log page automatically changes from &type=revision&ids=<rev_id> to &type=archive&ids=<ar_timestamp> when this happens.

If you had saved the old link, then it won't work. But the link that's on the live page will update correctly, and it'll flip back again if you undelete the revision.

It doesn't seem to work the other way; if you adjusted visibility while deleted, then undelete it, the log entry doesn't update to point to the live revision.

And of course if you rename the page, all bets are off. Archived revisions are referenced by timestamp due to ar_rev_id being a new addition a few years ago, so -krrrrk- can't translate that.


I think the nicest thing to do would be to use &type=revision&ids=<rev_id|ar_rev_id> and save &type=archive for those with no saved ar_rev_id. These will be transportable across page moves and deletion/undeletion state changes, and no need to worry about them updating properly.
Comment 27 FT2 2011-05-04 00:43:45 UTC
(In reply to comment #26)
> If you had saved the old link, then it won't work. But the link that's on the
> live page will update correctly

That's the problem. The wiki is full of discussions of past matters, also active users often have to keep a personal note of any diffs related to possible abuse wherefuture resolution might be needed (this is expected/directed behavior: users are told to keep negative diffs off-wiki unless a case of some kind is imminent and often a case can only be opened after considerable "history" has happened). These diffs are exactly the kind that no longer work.
Comment 28 FT2 2011-05-04 00:51:49 UTC
A fairly straightforward workaround that would solve this (in almost all cases) without schema change is at bug 18104 comment 12.
Comment 29 Brion Vibber 2011-05-04 18:46:17 UTC
Bug 18104 comment 12 seems like it would make it easier to (temporarily?) undelete particular deleted revisions that you have a rev id for but not the title+timestamp. Could be used to undelete something, tweak the link, then re-delete it?

Could help, but I still think it'll be better to just use the revision ID references whenever they're available on this side (the rev deletion / change visibility links).

Will see if I can throw that together...
Comment 30 FT2 2011-05-04 20:49:55 UTC
(In reply to comment #29)
> Could be used to undelete something, tweak the link, then re-delete it?
> 
> Could help, but I still think it'll be better to just use the revision ID
> references whenever they're available on this side (the rev deletion / change
> visibility links).
> 
> Will see if I can throw that together...

The aim is "old links shouldn't break under deletion", rather than "the old linked pages can be traced and then the old pages referring to them all edited to contain updated links". I _think_ that's what you're saying....? :)


Bug 18104 comment 12 is saying that because deleted/undeleted revisions still have their oldid (the few old historical exceptions can be given an id as a once-off task) which is not lost during deletion, _therefore_:

1/ If an oldid is not found in current revisions, Mediawiki can quickly check the _deleted_ revisions table before reporting "revision not found". This would guarantee that links and diffs would always work transparently regardless of deletion/undeletion.

2/ Special pages such as [[Special:Undelete]] and [[Special:DeletedContributions]] could provide links based on oldid rather than page/timestamp, and these oldid based links would also be guaranteed to always work regardless of future deletion/undeletion, unlike page/timestamp based links which could fail.

3/ Because a revision can be found whether current or deleted, diffs can be generated between any two revisions, even if one or both revisions is subsequently deleted/undeleted. At present they cannot.

4/ Traceability despite deletion is made simpler. At present it can be difficult.

5/ Any future migration to using revision id as the sole means to locate a revision would be made easier but no schema change would be needed for the above.
Comment 31 Brion Vibber 2011-05-04 21:17:39 UTC
Speaking specifically to my plan for links in rev delete logs:


For any revision that's either existing in the revision table, or has an ar_rev_id value stored, we will output permanently working links in the form &type=revision&ids=<rev_id|ar_rev_id>.

This will apply to both old and new log entries, though old links that have been _copied_ will obviously remain in their original form.


New and old &type=revision&ids=<rev_id|ar_rev_id> links will always work in the future, by looking up each rev via both the revision & archive tables.

Old &type=archive&ids=<ar_timestamp> links should remain similarly workable for later-undeleted revisions *unless the page is renamed*, in which case we won't be able to reliably find it in the revision table.


Diff view, special:undelete, etc can benefit from similar improvements but they don't rely on each other directly.
Comment 32 FT2 2011-05-04 21:27:57 UTC
Should a separate request (dependant on this one) be set up for modifying the links that are displayed in other pages, where those are page/timestamp based?
Comment 33 Brion Vibber 2011-05-04 21:38:14 UTC
I'd say yes -- they'll be related in the underlying issue/goal, but keeping them separate will keep our eyes from glazing over as we run away from the giant bug report :D

Special:Undelete I think will need more work for a serious overhaul, while it looks like it should more isolated here.
Comment 34 FT2 2011-05-04 23:35:20 UTC
Done.  See bug 28819 ("Deleted revision links should use "rev_id" not
"page/timestamp") and bug 28820 ("Diffs using rev_id should work when one or
both revisions are deleted").
Comment 35 FT2 2011-05-04 23:46:45 UTC
And bug 28821 ("Special:Undelete to use same radio button styling for diffs as
history pages") - forgot that.
Comment 36 Platonides 2011-05-05 20:03:34 UTC
(In reply to comment #33)
> Special:Undelete I think will need more work for a serious overhaul, while it
> looks like it should more isolated here.

If you rework Special:Undelete, try to make it use Title->userCan() instead of User->isAllowed
That's something that has been waiting for months.
Comment 37 Brion Vibber 2011-05-10 01:20:14 UTC
This should cover most of the initial basics on trunk:

r87803 - fix for FakeResultWrapper

r87804 - the main stuff: allow Special:RevisionDelete to accept revision IDs for deleted revs, and keep using those same URLs in log entries.

r87805 - quick fix to Special:Undelete to use the revision IDs on links to Special:RevisionDelete, which should cover most cases generating those links. (Will still use type=archive for old items that have no rev id saved)

r87806 - quick fix for diff view to at least link to Special:Undelete on the error page for missing diff revisions. Needs more polishing (via bug 28820)

Not closing the bug yet as it'll need a little more tweaking to be fully nice.
Comment 38 db [inactive,noenotif] 2011-05-14 19:11:36 UTC
(In reply to comment #37)
> r87805 - quick fix to Special:Undelete to use the revision IDs on links to
> Special:RevisionDelete, which should cover most cases generating those links.
> (Will still use type=archive for old items that have no rev id saved)

What about links from Special:DeletedContributions?
Comment 39 Brion Vibber 2011-05-17 13:50:15 UTC
Good catch -- looks like that needs update as well. Probably a mass-search for revDeleteLink and revDeleteLinkDisabled will turn up more cases to merge -- should make sure they're all treated consistently.
Comment 41 Mark A. Hershberger 2011-07-13 18:38:27 UTC
krinkle, could you finish this up?
Comment 42 Krinkle 2011-07-13 20:58:27 UTC
Assigning back to brion. I have no clue around Revision Delete, I just ran some searches :-)
Comment 43 Aaron Schulz 2011-08-03 22:37:39 UTC
(In reply to comment #38)
> (In reply to comment #37)
> > r87805 - quick fix to Special:Undelete to use the revision IDs on links to
> > Special:RevisionDelete, which should cover most cases generating those links.
> > (Will still use type=archive for old items that have no rev id saved)
> 
> What about links from Special:DeletedContributions?

See r93860.
Comment 44 Mark A. Hershberger 2011-08-03 23:45:32 UTC
> See r93860.

Does that mean this can be closed now?
Comment 45 Mark A. Hershberger 2011-08-03 23:46:41 UTC
oops, re-open if “no” :P

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links