Last modified: 2011-02-06 15:35:55 UTC

Bug 14631 - Include uncompressed sizes in dump file RSS info
Status: NEW
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Ariel T. Glenn
URL: http://download.wikipedia.org/enwikip...
Reported: 2008-06-24 17:15 UTC by Andrew Dunbar
Modified: 2011-02-06 15:35 UTC (History)

Description Andrew Dunbar 2008-06-24 17:15:50 UTC
It would be handy to have info in the RSS feeds on the full uncompressed size in bytes of each compressed file.

Bzip2 archives in particular provide no interface for revealing a file's uncompressed size.

I'm working on a Firefox extension that could use this information, which is unavailable elsewhere, to provide a progress bar when decompressing large files.

The current RSS feeds are very minimal, and there is plenty of space to include the extra information, which should be trivially available to the scripts that create the dump download areas.
Comment 1 Brion Vibber 2008-06-24 18:26:30 UTC
Note that the uncompressed size is not currently available. It should be possible to make a little wrapper tool to pipe the data through before compression, which would count the bytes and save the total; that total could then be pulled into the report & RSS outputs.
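
A minimal sketch of such a byte-counting pass-through, for illustration only. The function name `copy_and_count` and the chunk size are assumptions, not part of the existing dump scripts; in a real pipeline the source would be the dump producer's output and the destination the compressor's input.

```python
import io


def copy_and_count(src, dst, chunk_size=1 << 16):
    """Copy every byte from src to dst and return the number of bytes copied.

    src and dst are binary file-like objects (hypothetical stand-ins for the
    dump producer's stdout and the compressor's stdin).
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        dst.write(chunk)
        total += len(chunk)
    return total


# Small in-memory demonstration: 688 bytes go through unchanged.
source = io.BytesIO(b"x" * 688)
sink = io.BytesIO()
print(copy_and_count(source, sink))  # 688
```

The returned total is what would be saved alongside the compressed file so that the report and RSS generators can read it later.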
Comment 2 Melancholie 2008-06-25 03:59:28 UTC
Sorry for the silly question, but where can I find this RSS feed?
There is no feed <link>ed at download.wikimedia.org.
Comment 3 Andrew Dunbar 2008-06-25 17:22:46 UTC
There is one feed per file per project. Oddly, they are in a place which doesn't seem to be linked from anywhere. I had to ask people on the dev IRC channel to find out about it:

http://download.wikipedia.org/enwikipedia/latest/

Comment 4 Andrew Dunbar 2008-06-28 13:55:36 UTC
For bzip2 files at least, the uncompressed file size is available without the wrapper tool Brion suggests. Simply passing the -v switch prints the details to stderr. I don't yet grok the code in backup/worker.py, but it should be easy to parse the verbose output. An example follows:

  (stdin):  1.512:1,  5.291 bits/byte, 33.87% saved, 688 in, 455 out.
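
Parsing a line like the one above could look roughly like this. It is a sketch that assumes the verbose line always ends with "N in, M out." as in the example, and that when compressing, "in" is the uncompressed byte count; the function name `parse_bzip2_verbose` is made up for illustration and is not taken from worker.py.

```python
import re

# Matches the tail of a verbose report line, e.g. "... 688 in, 455 out."
_SIZES = re.compile(r"(\d+) in, (\d+) out\.")


def parse_bzip2_verbose(line):
    """Return (bytes_in, bytes_out) from a verbose line, or None if absent."""
    match = _SIZES.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "  (stdin):  1.512:1,  5.291 bits/byte, 33.87% saved, 688 in, 455 out."
print(parse_bzip2_verbose(sample))  # (688, 455)
```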
Comment 5 Andrew Dunbar 2009-05-16 11:22:31 UTC
See also bug 6064
