Last modified: 2009-07-25 18:29:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia has migrated from Bugzilla to Phabricator. Bug reports should be created and updated in Wikimedia Phabricator instead. Please create an account in Phabricator and add your Bugzilla email address to it.
Wikimedia Bugzilla is read-only. If you try to edit or create any bug report in Bugzilla you will be shown an intentional error message.
In order to access the Phabricator task corresponding to a Bugzilla report, just remove "static-" from its URL.
You can still run searches in Bugzilla or access your list of votes, but bug reports in Bugzilla will obviously not be up to date.
Bug 18201 - Upload-by-URL should enforce $wgMaxUploadSize early when Content-Length header provided
Product: MediaWiki
Classification: Unclassified
Uploading (Other open bugs)
Hardware: All / OS: All
Importance: Normal enhancement (vote)
: ---
Assigned To: Michael Dale
Depends on:
Blocks: 18563
Reported: 2009-03-27 17:54 UTC by Brion Vibber
Modified: 2009-07-25 18:29 UTC (History)
CC: 1 user

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description Brion Vibber 2009-03-27 17:54:19 UTC
Currently upload-by-URL enforces $wgMaxUploadSize by counting up bytes as it downloads, then aborting when it reaches the maximum.

This could potentially take a long time. cURL should be able to give us the HTTP header values, including any Content-Length header, long before we reach that point, in which case we could abort immediately.

Ideally this ability should fold into Http class.
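The early check Brion describes can be sketched as follows. This is a Python sketch of the logic only (MediaWiki itself is PHP, and the real check would live in the Http class); the function name and the 100 MB stand-in for $wgMaxUploadSize are hypothetical.

```python
MAX_UPLOAD_SIZE = 100 * 1024 * 1024  # stand-in for $wgMaxUploadSize

def exceeds_max_upload(headers, max_size=MAX_UPLOAD_SIZE):
    """Return True if a Content-Length header proves the file is too big.

    If the header is absent we cannot decide early, so return False and
    fall back to counting bytes during the download.
    """
    length = headers.get("Content-Length")
    if length is None:
        return False
    return int(length) > max_size

# A response advertising 200 MB can be rejected before any download starts:
print(exceeds_max_upload({"Content-Length": str(200 * 1024 * 1024)}))  # True
# No Content-Length header: must fall back to byte counting.
print(exceeds_max_upload({}))  # False
```

Note that the absence of Content-Length only disables the early abort; the byte-counting check during download still enforces the limit.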
Comment 1 Michael Dale 2009-04-14 18:03:51 UTC
Working on this in the new-upload branch. I will try to add a HEAD request using the Http class to do some early detection, but in cases where there is no Content-Length in the HTTP header we will have to count bytes as the file downloads.

The architecture also has to change a bit: we have to spin the action off into a separate command-line PHP process that monitors the cURL copy and updates memcached (or the database if memcached is not installed). The client then makes AJAX requests and gets updates on how far along the transfer is. The spun-off process actually creates the resource page and informs the client it is ready. We will keep things in sync by passing the session key to the process (unless that is a bad way, in which case what would be a good way?).

We have to spin it off into a separate process because our PHP execution times out after 30 seconds.

This also involves rewriting the Special:Upload page for HTTP requests and adding a small AJAX interface for progress. (We can reuse the same AJAX progress-indicator interface that we are using for Firefogg upload progress.)
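The progress-reporting scheme described above (downloader process writes its byte count under the session key; the AJAX endpoint reads it back) can be sketched like this. This is a Python sketch with hypothetical names; a plain dict stands in for memcached (or the database fallback).

```python
progress_store = {}  # stand-in for memcached / DB keyed by session key

def report_progress(session_key, bytes_done, bytes_total):
    """Called periodically by the spun-off downloader process."""
    progress_store[session_key] = {"done": bytes_done, "total": bytes_total}

def poll_progress(session_key):
    """What the AJAX endpoint would return to the polling client:
    a percentage, or None if no transfer is known for that key."""
    info = progress_store.get(session_key)
    if info is None:
        return None
    return round(100.0 * info["done"] / info["total"], 1)

report_progress("sess-123", 25 * 1024, 100 * 1024)
print(poll_progress("sess-123"))  # 25.0
```

Passing the session key to the child process, as the comment suggests, is what lets both sides agree on the store key without any other shared state.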

But that hits on the same theme of getting jQuery into core, which would speed up the interfaces for all of these enhancements.
Comment 2 Michael Dale 2009-04-28 22:37:56 UTC
(Fixed.) We first do a HEAD request, and if the byte size is less than $wgMaxUploadSize we continue. We then use the cURL writeBodyCallBack function to write to a file; if the file grows beyond $wgMaxUploadSize, we break out there as well.
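The size-capped write callback can be sketched as below. This is a Python sketch of the idea, not the MediaWiki PHP code: cURL invokes the write callback for each received chunk, and signalling an error from it (here, raising an exception) aborts the transfer once the running total passes the limit. All names are hypothetical.

```python
import io

MAX_UPLOAD_SIZE = 1024  # small limit for the demo

class UploadTooLarge(Exception):
    pass

def make_write_callback(out_file, max_size=MAX_UPLOAD_SIZE):
    """Build a write callback that tracks cumulative bytes and aborts
    the transfer when the file would grow past max_size."""
    written = 0
    def write_body(chunk):
        nonlocal written
        written += len(chunk)
        if written > max_size:
            raise UploadTooLarge("transfer exceeded max upload size")
        out_file.write(chunk)
        return len(chunk)
    return write_body

buf = io.BytesIO()
cb = make_write_callback(buf, max_size=10)
cb(b"12345")           # fine: 5 bytes so far
try:
    cb(b"1234567")     # would bring the total to 12 bytes: aborted
except UploadTooLarge:
    print("aborted")   # nothing past the limit is written to buf
```

This double check (HEAD first, then the capped callback) is what covers servers that omit or misreport Content-Length.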
Comment 3 Brion Vibber 2009-05-08 19:17:32 UTC
That should work great for most static files. :D

I don't think Content-Length is a required header, though; if it's not present, the current code in the branch will spew a notice error at "if($head['Content-Length'] > $wgMaxUploadSize){".
Comment 4 Michael Dale 2009-05-20 23:17:25 UTC
Fixed the Content-Length handling a while back (just catching up on the backlog).

The pre-download checks also look for redirects, make sure they point to a valid URL, and enforce the $wgMaxRedirects global variable.
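The redirect-limited URL validation mentioned above can be sketched like this. This is a Python sketch under stated assumptions: the helper names are hypothetical, a dict stands in for real HTTP redirect responses, and the limit of 5 is just a stand-in for $wgMaxRedirects.

```python
from urllib.parse import urlparse

MAX_REDIRECTS = 5  # stand-in for $wgMaxRedirects

def is_valid_url(url):
    """Accept only absolute http(s) URLs with a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)

def resolve_redirects(url, redirect_map, max_redirects=MAX_REDIRECTS):
    """Follow redirects (given here as a dict of url -> target) and
    return the final URL, validating every hop along the way.

    Raises ValueError on an invalid URL or too many redirects.
    """
    for _ in range(max_redirects + 1):
        if not is_valid_url(url):
            raise ValueError("invalid URL: %r" % url)
        if url not in redirect_map:
            return url
        url = redirect_map[url]
    raise ValueError("exceeded max redirects")

hops = {"http://a.example/file": "http://b.example/file"}
print(resolve_redirects("http://a.example/file", hops))
```

Validating each hop, not just the first URL, matters because a redirect chain could otherwise bounce the fetcher to an unexpected scheme or host.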
Comment 5 Michael Dale 2009-07-25 18:29:53 UTC
Fixed. The cURL header handling was also added in r53620.
