Last modified: 2009-07-25 18:29:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T20201, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 18201 - Upload-by-URL should enforce $wgMaxUploadSize early when Content-Length header provided
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Uploading
Version: unspecified
Hardware: All
OS: All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Michael Dale
URL: http://test.wikipedia.org/wiki/Specia...
Depends on:
Blocks: 18563
Reported: 2009-03-27 17:54 UTC by Brion Vibber
Modified: 2009-07-25 18:29 UTC
CC: 1 user



Attachments

Description Brion Vibber 2009-03-27 17:54:19 UTC
Currently upload-by-URL enforces $wgMaxUploadSize by counting up bytes as it downloads, then aborting when it reaches the maximum.

This could potentially take a long time... cURL should be able to give us the HTTP header values, including any Content-Length header that may have been provided, long before we get to that point, in which case we could abort immediately.

Ideally this ability should be folded into the Http class.
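
For illustration, a minimal sketch of that early check using plain PHP cURL rather than the Http class (the function name wfCheckRemoteSize is invented for this example):

function wfCheckRemoteSize( $url, $maxSize ) {
	// HEAD-style request: fetch headers only, no body bytes are transferred
	$ch = curl_init( $url );
	curl_setopt( $ch, CURLOPT_NOBODY, true );
	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_exec( $ch );
	$len = curl_getinfo( $ch, CURLINFO_CONTENT_LENGTH_DOWNLOAD );
	curl_close( $ch );
	// -1 means the server sent no Content-Length; we cannot decide early
	// and must fall back to counting bytes during the download.
	if ( $len > 0 && $len > $maxSize ) {
		return false; // too large, abort before downloading anything
	}
	return true;
}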
Comment 1 Michael Dale 2009-04-14 18:03:51 UTC
Working on this in the new-upload branch... I will try to add a HEAD request using the Http class to do some early detection, but in cases where we don't have a Content-Length in the HTTP headers we will have to count bytes as it downloads.

Also, the architecture has to change a bit... we have to spin the action off into a separate command-line PHP process that monitors the cURL copy and updates memcached (or the database if memcached is not installed). The client then makes AJAX requests and gets updates on how far along the transfer is; the spun-off process eventually creates the resource page and informs the client that it is ready. I will keep things in sync by passing the session key to the process (unless that is a bad way, in which case what would be a good way?).
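
As a rough sketch of that progress-reporting idea, assuming the standard $wgMemc cache object (the key layout and helper names are made up for this example):

// Called periodically by the spun-off download process.
function wfUpdateUploadProgress( $sessionKey, $bytesLoaded, $bytesTotal ) {
	global $wgMemc; // memcached, or the DB-backed cache if memcached is not installed
	$wgMemc->set( 'upload-progress:' . $sessionKey,
		array( 'loaded' => $bytesLoaded, 'total' => $bytesTotal ), 3600 );
}

// Called by the AJAX handler that the client polls.
function wfGetUploadProgress( $sessionKey ) {
	global $wgMemc;
	return $wgMemc->get( 'upload-progress:' . $sessionKey );
}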

We have to spin it off into a separate process because our PHP execution times out at 30 seconds.

This also involves rewriting the Special:Upload page for HTTP requests and adding a small AJAX interface for progress (we can use the same AJAX progress indicator interface that we are using for Firefogg upload progress).

But that hits on the same theme of getting jQuery into core, which will speed up the interfaces for all these enhancements.
Comment 2 Michael Dale 2009-04-28 22:37:56 UTC
(Fixed.) We first do a HEAD request, and if the reported size is less than $wgMaxUploadSize we continue. We then use the cURL write-body callback to write to a file; if the file grows beyond $wgMaxUploadSize we break out as well.
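
Roughly, the write-callback guard could look like this (names and the temporary file handling are illustrative, not the exact branch code); returning fewer bytes than cURL handed us makes curl_exec() abort the transfer:

function wfUploadWriteCallback( $ch, $data ) {
	global $wgMaxUploadSize;
	static $fp = null, $written = 0;
	if ( $fp === null ) {
		$fp = fopen( wfTempDir() . '/upload-by-url.tmp', 'wb' );
	}
	$written += strlen( $data );
	if ( $written > $wgMaxUploadSize ) {
		return 0; // file grew past the limit: break out of the download
	}
	return fwrite( $fp, $data );
}

// Registered with: curl_setopt( $ch, CURLOPT_WRITEFUNCTION, 'wfUploadWriteCallback' );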
Comment 3 Brion Vibber 2009-05-08 19:17:32 UTC
That should work great for most static files. :D

I don't think Content-Length is a required header, though; if it's not present, the current code in the branch will spew a notice at "if($head['Content-Length'] > $wgMaxUploadSize){".
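
One obvious guard (a sketch, not the committed fix) is to compare only when the header is actually present and otherwise fall through to byte counting:

if ( isset( $head['Content-Length'] )
	&& $head['Content-Length'] > $wgMaxUploadSize
) {
	return false; // reject without downloading
}
// No Content-Length given: proceed and let the write callback count bytes.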
Comment 4 Michael Dale 2009-05-20 23:17:25 UTC
Fixed the Content-Length handling a while back (just catching up on the backlog).

The pre-download checks also look for redirects, make sure they point to a valid URL, and enforce the $wgMaxRedirects global variable.
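
A hedged sketch of those pre-download checks in plain PHP cURL (the helper name is invented; the committed code may differ):

function wfValidateUploadUrl( $url ) {
	global $wgMaxRedirects;
	if ( filter_var( $url, FILTER_VALIDATE_URL ) === false ) {
		return false; // malformed URL
	}
	$ch = curl_init( $url );
	curl_setopt( $ch, CURLOPT_NOBODY, true );               // headers only
	curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
	curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );       // follow redirects...
	curl_setopt( $ch, CURLOPT_MAXREDIRS, $wgMaxRedirects ); // ...but only this many
	$ok = curl_exec( $ch );
	$finalUrl = curl_getinfo( $ch, CURLINFO_EFFECTIVE_URL );
	curl_close( $ch );
	return $ok === false ? false : $finalUrl;
}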
Comment 5 Michael Dale 2009-07-25 18:29:53 UTC
Fixed. Also, the cURL header handling was added in r53620.
