Last modified: 2014-08-29 08:26:42 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T58849, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 56849 - Edit conflict detection suffers a race condition
Edit conflict detection suffers a race condition
Status: ASSIGNED
Product: MediaWiki
Classification: Unclassified
Page editing (Other open bugs)
1.23.0
All All
: High major (vote)
: ---
Assigned To: Adam Wight
:
Depends on:
Blocks: 70163
  Show dependency treegraph
 
Reported: 2013-11-09 21:54 UTC by Adam Wight
Modified: 2014-08-29 08:26 UTC (History)
9 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Adam Wight 2013-11-09 21:54:18 UTC
In theory, we perform a lazy 1-second granularity timestamp check to detect conflicts, then a second test is performed atomically during updateRevisionOn.  However, the atomic test is broken because getLatest is returning the *new* latest id rather than the oldid stored at the time the user opened the article edit page.

This can be easily confirmed by disabling the timestamp test at EditPage.php line 1613.

My initial guess is that we were relying on the hidden "oldid" form parameter, but this always has a value of "0".
Comment 1 Gerrit Notification Bot 2013-11-09 22:30:04 UTC
Change 94584 had a related patch set uploaded by Adamw:
(bug 56849) Pass true oldid for atomic page save

https://gerrit.wikimedia.org/r/94584
Comment 2 Adam Wight 2013-11-10 18:14:12 UTC
The solution should go beyond the above patch.  Ideally, we remove the cheesy timestamp check and rely entirely on an atomic update attempt.  If that fails due to a conflict, perhaps WikiPage::doEditContent falls back to 3-way merge (moving that code down from EditPage), and then we attempt a second, atomic update.
Comment 3 Aaron Schulz 2013-11-13 18:41:45 UTC
Ideally we'd also avoid the SELECT FOR UPDATE (followed by the parsing) and just do CAS style logic at the last second. The advantage would be that if a page takes 10 seconds to render and two people save at the same time, it would take about 10 seconds for each rather than 20 right now (people have complained about it taking forever to even get edit conflict pages).
Comment 4 Jeroen De Dauw 2013-12-02 06:26:02 UTC
Since this has been linked to bug 57022, I tried applying the patch to my 1.21.2 wiki. Did not fix the edit conflict problem on this wiki.
Comment 5 Adam Wight 2014-01-19 07:56:10 UTC
A minor detail to add: we should always include the oldid in GET query parameters to action=edit.
Comment 6 Andre Klapper 2014-03-18 02:09:05 UTC
Patch didn't pass jenkins-bot.
Anybody planning to pick up the work on this again?
Comment 7 Andre Klapper 2014-05-05 11:29:18 UTC
Patch didn't pass jenkins-bot.
Anybody planning to pick up the work on this again?
Comment 8 Andre Klapper 2014-05-05 11:30:38 UTC
Plus wondering if fixing this would influence bug 53646, bug 53446, bug 42163.
Comment 9 Adam Wight 2014-05-12 09:53:12 UTC
@aklapper: Thanks for linking the related bugs!  I'm expanding my earlier patch to clean up oldid handling and implement compare-and-swap as @aschulz suggests.  I expect that will resolve bug 42163, but for the other two I can't say.  It would be very useful to have unit tests to detect improvement or regression of all four bugs.
Comment 10 Adam Wight 2014-05-18 05:53:21 UTC
I came across something disturbing while exploring the oldid fix: apparently, there is a 'redirect' parameter to the edit action that allows a page save to change the contents of the redirected page, rather than the contents of the redirect article passed as the 'title' parameter to edit.

Maybe I'm being willfully blind, but I cannot imagine a case where this is a good idea.

This seems like a nasty hurdle to reimplementing oldid, because a process making a 'redirect' edit will not necessarily have any information about the revision level of the redirect target.  These cases would have to fall back to the badly flawed timestamp check.
Comment 11 Adam Wight 2014-05-18 06:08:03 UTC
Aha: so the 'redirect' thing was introduced to satisfy bug 24330, but along the way the "appendtext" or "prependtext" requirement was lost in implementation, commit 9f025aeb.

IMO this is a misfeature, I'll write a separate patch to... break API backwards-compatibility... as part of this work.
Comment 12 Adam Wight 2014-05-18 08:44:10 UTC
@ASchulz: It looks like the CAS logic you suggested has already been implemented, in WikiPage::updateRevisionOn().  I'm trying to provide the $lastRevision param whenever possible, which should improve many cases.

There is another function updateIfNewerOn which does require two queries, but it's only used in SpecialUndelete and Import, and IMO should be retired.

Was there another query you know of which would explain the long load during a conflict?
Comment 14 Gerrit Notification Bot 2014-05-19 05:37:02 UTC
Change 134049 had a related patch set uploaded by Adamw:
(bug 56849) Deprecate dangerous edittime-based content update functions

https://gerrit.wikimedia.org/r/134049
Comment 15 Gerrit Notification Bot 2014-06-03 21:23:07 UTC
Change 134049 merged by jenkins-bot:
(bug 56849) Deprecate dangerous edittime-based content update functions

https://gerrit.wikimedia.org/r/134049
Comment 16 Andre Klapper 2014-06-18 10:04:15 UTC
(In reply to Adam Wight from comment #13)
> I'm still working on the main patch, but these predecessors are ready for
> review:
> 
> https://gerrit.wikimedia.org/r/#/c/133955 (also tracking under bug 24330)
> https://gerrit.wikimedia.org/r/#/c/134045
> https://gerrit.wikimedia.org/r/#/c/134046
> https://gerrit.wikimedia.org/r/#/c/134047
> https://gerrit.wikimedia.org/r/#/c/134049

For the records, all patches merged or abandoned. 
Not sure about the patch in comment 1 which didn't pass Jenkins.
Comment 17 Andre Klapper 2014-08-05 22:35:42 UTC
(In reply to Andre Klapper from comment #16)
> For the records, all patches merged or abandoned. 
> Not sure about the patch in comment 1 which didn't pass Jenkins.

Adam: What's the status here? 
Is more work needed, or can this be considered fixed?
Comment 18 Adam Wight 2014-08-08 23:38:22 UTC
This patch is mean to fix the issue:
https://gerrit.wikimedia.org/r/#/c/94584/

Everything else has been a preliminary... Sorry!

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links