Last modified: 2013-06-18 16:20:38 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T3298, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 1298 - A "current" version of the upload.tar dir would be nice
Status: RESOLVED FIXED
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Ariel T. Glenn
URL: http://dumps.wikimedia.org/
Depends on:
Blocks: 27939
Reported: 2005-01-09 22:09 UTC by Hendrik Brummermann
Modified: 2013-06-18 16:20 UTC (History)
CC List: 5 users

Description Hendrik Brummermann 2005-01-09 22:09:28 UTC
A special version of upload.tar which only includes the most recent version of
files would be nice. Perhaps it could include only files that are linked from
the current version of the articles.
Comment 1 Ævar Arnfjörð Bjarmason 2005-06-27 09:19:51 UTC
Regular upload dumps are now provided, marking this as FIXED.
Comment 2 Hendrik Brummermann 2005-06-27 17:58:11 UTC
Reopening: This feature-request is not about a recent dump but about a dump
containing only the most recent versions of files.
Comment 3 Brion Vibber 2009-03-30 21:30:16 UTC
While actual .tar files are probably not feasible at our current level (~3TB for Commons files current versions only), getting some offsite image mirrors and redistribution is on the table. Tomasz, assigning this one to you since you'll be coordinating the data dump stuff.
Comment 4 Tomasz Finc 2011-08-03 19:41:59 UTC
Releasing this bug so that anyone who has time can take it on.
Comment 5 Ariel T. Glenn 2012-06-03 04:47:24 UTC
These are now semi-available (I'm running them on an ad hoc basis, they are generated on a mirror site rather than on one of our servers, we're still working out hardware issues with them, etc.). If you're willing to deal with directories moving around and possible inaccessibility, you can get these before the official announcement from http://ftpmirror.your.org/pub/wikimedia/imagedumps/ in the tarballs/full directory and the tarballs/incrs directory. These are indeed current version only, per project, *except* for commons.

If you want commons images, you should get them via rsync from rsync://ftpmirror.your.org/wikimedia-images/ and please see http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors for more information about what data is mirrored where.

Anyone on this bug that's not on the xmldatadumps-l list had better get on it, since that's generally where updates about this sort of thing will be sent.
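
The retrieval steps described in this comment could be sketched roughly as follows. This is an illustration only, not part of the original comment: the helper names are hypothetical, the tarball directories are assumed to sit directly under the imagedumps URL, the rsync flags are a common choice rather than anything the mirror prescribes, and the locations may have moved since (the comment itself warns about directories moving around). The script only builds the URLs and the rsync command; it does not contact the mirror.

```python
from urllib.parse import urljoin

# Locations quoted in the comment; they may have moved or gone offline since.
IMAGEDUMPS_BASE = "http://ftpmirror.your.org/pub/wikimedia/imagedumps/"
COMMONS_RSYNC = "rsync://ftpmirror.your.org/wikimedia-images/"


def tarball_dir_urls(base=IMAGEDUMPS_BASE):
    """Return the URLs of the full and incremental tarball directories.

    Assumes tarballs/full and tarballs/incrs sit directly under `base`.
    """
    return {
        "full": urljoin(base, "tarballs/full/"),
        "incrs": urljoin(base, "tarballs/incrs/"),
    }


def commons_rsync_command(dest, dry_run=True):
    """Build (but do not run) an rsync command for the Commons images."""
    cmd = ["rsync", "--archive", "--verbose"]
    if dry_run:
        # Only list what would be transferred; copy nothing.
        cmd.append("--dry-run")
    cmd += [COMMONS_RSYNC, dest]
    return cmd


if __name__ == "__main__":
    for name, url in tarball_dir_urls().items():
        print(name, url)
    print(" ".join(commons_rsync_command("./commons-images/")))
```

Defaulting to a dry run is a deliberate choice here: the Commons image set is several terabytes, so it seems safer to inspect the transfer list before removing `--dry-run`.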
Comment 6 Ariel T. Glenn 2012-06-18 12:57:12 UTC
Hmm I guess since the official announcement went out we can call this done, or close enough to done at any rate.  (Everyone on the xmldatadumps-l list yet??)
