Last modified: 2011-11-29 03:20:55 UTC
The download files are large, we all know it, and downloading them repeatedly wastes bandwidth. First I wanted to ask whether you would enable rsync, the widespread way to save bits on the wire. It would help, especially if the files were compressed with 'gzip --rsyncable' (supported by newer gzips, at least on Debian); it saves a lot, reportedly at least 10% on average. The problem is that rsync requires resources on the server side, mainly for creating the delta data (the larger the file, the more it consumes, I believe), and maybe download.wikimedia.org doesn't have resources to spare.

Then it struck me that I've seen something with the gains of rsync but without its problems: zsync. Google for zsync and feel lucky :). It is part of Debian, and maybe other distros. Features:

* _no_ server process or shell access required
* uses statically generated delta data, which resides in a single plain file
* works with any HTTP/1.1 web server (uses partial transfers)
* sexy (eg. saves plenty of bandwidth)

Basically, if you would be so mighty kind as to support zsync, the only thing you would have to do is generate a .zsync file for each downloadable file, for example

    zsyncmake -u http://download.wikimedia.org/wikipedia/hu/cur_table.sql.gz cur_table.sql.gz

and put it next to the file. That's all; the rest is the clients' business. The .zsync files seem to be about 1/300 of the size of the original, and generating one took 5 seconds for a 100 MB file on my average machine. Pretty Please with Sugar on the Top, Cream and Cherries? (<-- quote from The Monkey Island :)) Thanks. [[:hu:user:grin]]
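For illustration, a minimal sketch of the whole workflow, assuming the zsync package (which ships zsyncmake and the zsync client) is installed; the directory path and file names below are only examples, not the real dump layout:

    # server side: publish a .zsync control file next to each dump file
    cd /var/www/download.wikimedia.org/wikipedia/hu        # illustrative path
    for f in *.sql.gz; do
        zsyncmake -u "http://download.wikimedia.org/wikipedia/hu/$f" "$f"
    done
    # each $f.zsync is roughly 1/300 of the size of $f

    # client side: reuse blocks from an older local copy and fetch only the
    # changed ranges via HTTP/1.1 Range requests
    zsync -i cur_table.sql.gz http://download.wikimedia.org/wikipedia/hu/cur_table.sql.gz.zsync

Nothing else changes on the server: the .zsync files are static, so the web server keeps serving plain files and all delta work happens on the client.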
Since the biggest of our dump files are compressed with things much more serious than gzip and aren't friendly to rsync/zsync, gonna WONTFIX this. :(
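A rough local experiment shows the reason (a sketch only; it assumes rsync and bzip2 are available plus a gzip with the --rsyncable patch, e.g. Debian's, and the file names are made up). rsync's --stats output reports how much "Matched data" it could reuse from the old copy: with gzip --rsyncable most of the file usually matches, while with bzip2 a small change near the start typically leaves the rest of the compressed stream byte-misaligned, so nearly everything is sent as literal data:

    seq 1 2000000 > old.txt
    sed '100s/.*/changed line/' old.txt > new.txt

    gzip --rsyncable -c old.txt > old.gz ; gzip --rsyncable -c new.txt > new.gz
    bzip2 -c old.txt > old.bz2           ; bzip2 -c new.txt > new.bz2

    # --no-whole-file forces the delta algorithm even for a local copy;
    # note that rsync updates the destination file in place.
    rsync --no-whole-file --stats new.gz  old.gz    # mostly "Matched data"
    rsync --no-whole-file --stats new.bz2 old.bz2   # mostly "Literal data"

The same problem applies even more strongly to the 7z dumps, which compress the whole archive as one solid stream.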
Changing all WONTFIX high priority bugs to lowest priority (no mail should be generated since I turned it off for this.)