Last modified: 2011-11-29 03:20:55 UTC

Bug 2923 - More optimal download of large files / zsync / rsync
Status: RESOLVED WONTFIX
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Lowest enhancement with 1 vote
Target Milestone: ---
Assigned To: Nobody
Depends on:
Blocks:
Reported: 2005-07-21 09:51 UTC by Peter Gervai (grin)
Modified: 2011-11-29 03:20 UTC

Description Peter Gervai (grin) 2005-07-21 09:51:35 UTC
Download files are large, we all know it. Downloading them repeatedly sucks
[bandwidth]. 

First I wanted to ask whether you would enable rsync. It is the
widespread way to save bits on the wire. It could help, especially if you
compress files with 'gzip --rsyncable' (seems to be supported by newer gzips, at
least on Debian). It saves a lot; the average saving is said to be at least 10%.

The problem is that rsync requires resources on the server side, mainly for
creating the delta data, and maybe download.wikimedia.org doesn't have resources
to spare. (The larger the file, the more resources it consumes, I believe.)

Then it struck me that I've seen something with the gains of rsync without its
problems: zsync. Google for zsync and feel lucky :). It is part of Debian, and
maybe other distros. 

Features:
* _no_ server or shell required
* uses statically generated delta data, which resides in a single, plain file
* works on any HTTP/1.1 webserver (uses partial transfers)
* sexy (i.e. saves plenty of bandwidth)
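
To illustrate how "statically generated delta data" plus HTTP partial transfers can work, here is a rough Python sketch. It is not zsync's actual algorithm or file format: the function names, the tiny block size and the use of MD5 are all assumptions made for the illustration. The server publishes one checksum per block of the new file; the client checks which of those blocks already exist anywhere in its old copy and works out the byte ranges it still has to request.

```python
import hashlib

BLOCK_SIZE = 4  # tiny, illustrative block size (an assumption, not zsync's default)


def make_control_data(new_data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Server side, done once: one checksum per fixed-size block of the new file."""
    return [
        hashlib.md5(new_data[i:i + block_size]).hexdigest()
        for i in range(0, len(new_data), block_size)
    ]


def plan_download(old_data: bytes, control: list[str], new_length: int,
                  block_size: int = BLOCK_SIZE) -> list[tuple[int, int]]:
    """Client side: return inclusive (start, end) byte ranges of the new file
    that are not available from the old local copy."""
    # Checksum every block-sized window of the old file, at every offset, so
    # that shifted data is still found. A real tool would use a cheap rolling
    # checksum here; hashing each window is slow but keeps the sketch short.
    have = {
        hashlib.md5(old_data[i:i + block_size]).hexdigest()
        for i in range(0, max(len(old_data) - block_size + 1, 0))
    }
    ranges: list[tuple[int, int]] = []
    for index, digest in enumerate(control):
        if digest in have:
            continue  # block can be copied from the old file, no download needed
        start = index * block_size
        end = min(start + block_size, new_length) - 1
        if ranges and ranges[-1][1] + 1 == start:
            ranges[-1] = (ranges[-1][0], end)  # merge adjacent missing blocks
        else:
            ranges.append((start, end))
    return ranges


if __name__ == "__main__":
    old = b"aaaabbbbccccdddd"
    new = b"aaaaXXXXccccdddd"
    control = make_control_data(new)
    for start, end in plan_download(old, control, len(new)):
        print(f"Range: bytes={start}-{end}")
```

Each returned pair maps directly onto an HTTP/1.1 `Range: bytes=start-end` request header, which is why no special server software is needed in this scheme.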

Basically, if you would be so mighty kind as to support zsync, the only thing you
would have to do is generate a .zsync file for each downloadable file, e.g.

  zsyncmake -u http://download.wikimedia.org/wikipedia/hu/cur_table.sql.gz cur_table.sql.gz

and put it next to the file. That's all; the rest is the clients' business.
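
For more than one file, the same command could be generated in a loop. The Python sketch below only builds the argument lists and does not run anything. The helper names are made up for this example, the base URL is taken from the command above, and the second file name in the usage part is hypothetical.

```python
import posixpath

# Base URL taken from the example command above; adjust per directory.
BASE_URL = "http://download.wikimedia.org/wikipedia/hu/"


def zsyncmake_command(filename: str, base_url: str = BASE_URL) -> list[str]:
    """Build the argv that generates a .zsync file for one download file.

    Only the -u option (the URL the file will be served from), as used in
    the example above, is passed.
    """
    url = base_url.rstrip("/") + "/" + posixpath.basename(filename)
    return ["zsyncmake", "-u", url, filename]


def commands_for(filenames: list[str], base_url: str = BASE_URL) -> list[list[str]]:
    """Build one zsyncmake command per file."""
    return [zsyncmake_command(name, base_url) for name in filenames]


if __name__ == "__main__":
    # "old_table.sql.gz" is a hypothetical second file name for illustration.
    for cmd in commands_for(["cur_table.sql.gz", "old_table.sql.gz"]):
        print(" ".join(cmd))
```

On a machine with zsync installed, each list could be handed to `subprocess.run(cmd, check=True)` in the directory that holds the files.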

zsync files seem to be about 1/300 of the size of the original, and generating
one took about 5 seconds for a 100 MB file on my average machine.

Pretty Please with Sugar on the Top, Cream and Cherries? (<-- quote from The
Monkey Island :))

Thanks.
[[:hu:user:grin]]
Comment 1 Brion Vibber 2009-03-30 21:36:44 UTC
Since the biggest of our dump files are compressed with things much more serious than gzip and aren't friendly to rsync/zsync, gonna WONTFIX this. :(
Comment 2 Mark A. Hershberger 2011-03-13 17:45:58 UTC
Changing all WONTFIX high priority bugs to lowest priority (no mail should be generated since I turned it off for this.)
