Last modified: 2012-09-19 20:56:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T36717, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 34717 - [performance regression]: Web server responds very slow to action=delete submissions in the File namespace
[performance regression]: Web server responds very slow to action=delete subm...
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
unspecified
All All
: High major (vote)
: ---
Assigned To: Ben Hartshorne
: code-update-regression
Depends on:
Blocks: 36664
  Show dependency treegraph
 
Reported: 2012-02-25 21:40 UTC by Saibo
Modified: 2012-09-19 20:56 UTC (History)
8 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Graph showing response time for swift container listings (58.62 KB, image/png)
2012-07-10 20:21 UTC, Ben Hartshorne
Details

Description Saibo 2012-02-25 21:40:27 UTC
For some reason apparently logged actions get done twice...  https://commons.wikimedia.org/w/index.php?title=Special:Log&page=File%3AAfrican+Cup+of+Nations+2012.png I just reverted once: two reverts. 
... I think I did not click twice.. ;-)

Also noted: the revert takes ages (~ 30 seconds  to  2:30 minutes) to return (browser waits...). The long revert duration was confirmed three times (with two different browsers (FF10 and Opera11.6) and accounts) by me at https://commons.wikimedia.org/w/index.php?title=File:AaatestSonnepalmenstrand-portrait_new.jpg&action=history

Reproduce: Just do a revert to a previous file version at this test file and see what you get.   Note that the revert is done "fast".  Just it takes ages for the server feedback.

In case it is a Europe-only problem: I am in Germany.
Comment 1 Platonides 2012-02-25 22:14:15 UTC
It does seem a bit slow: [23:06:40.415] POST https://commons.wikimedia.org/wiki/File:AaatestSonnepalmenstrand-portrait_new.jpg [HTTP/1.1 200 OK 22412ms] (and irc entry at 23:07:02)

I can't reproduce the "double actions", though.
Comment 2 Tim Starling 2012-02-27 03:16:35 UTC
In theory, a POST operation should only be attempted once, not retried, regardless of how long it takes. But maybe something in the network path here is getting impatient and sending the request again. It could be on our side, in the ISP or Saibo's browser. But since that's not the subject of the bug, it should be filed as a separate report if it's a problem.
Comment 3 Aaron Schulz 2012-03-01 22:25:10 UTC
https://commons.wikimedia.org/w/index.php?title=File:AaatestSonnepalmenstrand-portrait_new.jpg&action=purge&forceprofile=true

Purge is seems to be fast for now, though I thought it was slow earlier.
Comment 4 Aaron Schulz 2012-03-01 23:25:46 UTC
I just tried upload, reupload, and revert on testwiki and it's fairly fast.
Comment 5 Saibo 2012-03-01 23:30:35 UTC
(In reply to comment #3)
> https://commons.wikimedia.org/w/index.php?title=File:AaatestSonnepalmenstrand-portrait_new.jpg&action=purge&forceprofile=true
> 
> Purge is seems to be fast for now, though I thought it was slow earlier.

Purge is okay - not fast - but about 10 seconds.

(In reply to comment #4)
> I just tried upload, reupload, and revert on testwiki and it's fairly fast.

on testwiki?! Sorry, that is out of relevance for me. ;-) But it seems to have returned to normal at Commons again. We can close this... unresolved...
Comment 6 Aaron Schulz 2012-03-02 00:43:50 UTC
Yeah, it's not really 'fast' so much as 'not terrible' ;)
Comment 7 Mark A. Hershberger 2012-03-02 18:28:17 UTC
From https://en.wikipedia.org/w/index.php?diff=479777598

Currently hitting the delete key on the image does not return anything for over 20-30 seconds, often resulting in "wikimedia foundation error..." The tech details are oft similar to "Request: POST http://en.wikipedia.org/w/index.php?title=File:Logo_FEPLP.jpg&action=delete, from 208.80.154.8 via cp1007.eqiad.wmnet (squid/2.7.STABLE9) to 10.64.0.140 (10.64.0.140) Error: ERR_READ_TIMEOUT, errno [No Error] at Fri, 02 Mar 2012 06:09:30 GMT". The 208.80 is familiar, but the 10.64 isn't (Durham, NC?). A ping from me to 208.80 is consistently 99ms, so it doesn't seem to be a timeout from this end.
Comment 8 Aaron Schulz 2012-03-28 00:47:52 UTC
Is this in faster after the changes for bug 35047?
Comment 9 Aaron Schulz 2012-04-09 22:14:12 UTC
(In reply to comment #8)
> Is this in faster after the changes for bug 35047?

Pinging this bug for more info.
Comment 10 Christian Nagy 2012-04-29 11:19:44 UTC
The problems are still persisting, when deleting a page I'm frequently returned to the '&action=delete' page and have to confirm the action one more time (or I'm just getting the error mentioned by Mark above). Other pages such as my watchlist are loading fairly slow.

I'm experiencing these problems *just* on Commons, please note that I also access from *Germany*.
Comment 11 Krinkle 2012-05-09 19:38:30 UTC
(updating target milestone from 1.20wmf1 to 1.20wmf3, 1.20wmf has been deployed already)
Comment 12 Krinkle 2012-06-11 17:14:41 UTC
I just went through a few copyright violations on Wikimedia Commons. Files with only 1 entry in the file history (the original upload), and about 5 page revisions.

Submitting the delete action took almost 20 seconds for any of them to respond (the POST request to get a html response).

I noticed in the recent changes feed in another tab that the action itself was processed long before the server responded with the rest.
Comment 13 Platonides 2012-06-15 17:21:26 UTC
I just deleted one file. One image revision, two history entries, no usage.
"Served by mw4 in 18.267 secs"

Normal (uncached) page views take less than 1 second...
Comment 14 Rob Lanphier 2012-06-15 18:49:43 UTC
This hasn't fallen off of our radar, it's just that we haven't communicated very well about status.  Ben, Aaron and I discussed this one last week, and we have some ideas to try.  I believe Ben tried something last week that unfortunately didn't help.  We have some hardware on order that we think is the correct fix, but we've also got more ideas on short-term solutions.  Sorry for the inconvenience here.
Comment 15 Platonides 2012-06-15 22:15:28 UTC
Have you profiled it? What step is being slow? swift deletion?
Comment 16 Rob Lanphier 2012-06-29 18:36:25 UTC
The part that's slow is the container listing, which seems to be slow because the listing is getting paged out of memory.  With the SSDs that Ben is installing, the container listings should load much faster off of disk.  Assigning this bug to Ben, since this is mostly out of Aaron's hands now.
Comment 17 Platonides 2012-06-30 10:12:04 UTC
Container listing? You mean listing the contents of a folder?
We don't have so much files... It seems too slow even if it needs to read it from disk.
Comment 18 Ben Hartshorne 2012-07-10 20:21:51 UTC
Created attachment 10838 [details]
Graph showing response time for swift container listings
Comment 19 Ben Hartshorne 2012-07-10 20:24:26 UTC
We have moved container listings to SSDs within the swift backend storage nodes.  Over the past week (moving the containers little by little) 50th percentile response time dropped from ~2s to ~0.12s and 90th % dropped from 30s to 0.9s.  The container listing times were the largest contributor to sporadic long delete times; this change should eliminate that delay.  I've attached the graph showing this change (from graphite).

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links