Last modified: 2013-01-29 17:59:51 UTC
The ETA generated for page text dumps is based on the total number of revisions in the project, rather than on the number of revisions we are actually going to retrieve for the chunk. That per-chunk count is unknown: we don't know how many revisions fall within the range of 2 million pageIDs that may go into one chunk file, and a select count(*) over that range would take a *very* long time to complete. Fix this to give some reasonable estimate.
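One cheap way to get a "reasonable estimate" without the slow select count(*) would be to scale the project-wide revision count by the fraction of the pageID space the chunk covers, then extrapolate from the processing rate observed so far. A minimal sketch, assuming revisions are spread roughly evenly across pageIDs (all function and parameter names here are hypothetical, not the actual dump-script API):

```python
import time

def estimate_chunk_eta(total_revisions, total_pages,
                       chunk_start_id, chunk_end_id,
                       revisions_done, started_at, now=None):
    """Rough ETA (in seconds) for one chunk file.

    Scales the project-wide revision count by the share of the pageID
    space this chunk covers, avoiding a slow SELECT COUNT(*) over the
    chunk's pageID range, then extrapolates from the observed rate.
    """
    now = now if now is not None else time.time()
    elapsed = now - started_at
    if revisions_done == 0 or elapsed <= 0:
        return None  # no rate information yet
    # Assumption: revisions are roughly evenly distributed over pageIDs.
    chunk_fraction = (chunk_end_id - chunk_start_id + 1) / float(total_pages)
    est_chunk_revisions = total_revisions * chunk_fraction
    rate = revisions_done / elapsed          # revisions per second so far
    remaining = max(est_chunk_revisions - revisions_done, 0)
    return remaining / rate
```

The even-distribution assumption is wrong in detail (old pages tend to have far more revisions than new ones), but it at least bounds the estimate to the chunk instead of the whole project.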
(In reply to comment #0)
> The ETA generated for page text dumps is based on the total number of
> revisions in the project rather than on the number of revisions we are
> actually going

Looking at http://dumps.wikimedia.org/backup-index.html I see that some processes don't even provide an ETA at all:

2013-01-29 17:37:42 cswiki: Dump in progress
  2013-01-29 14:14:25 in-progress All pages with complete edit history (.7z)
    cswiki-20130129-pages-meta-history.xml.7z 1.0 GB (written)

Is that intended? And to whom is showing an ETA interesting?
The recompression steps, for example, don't show an ETA because there is no simple way to make an estimate. The only steps with an ETA are those that walk through the XML; based on page and revision info they generate an estimate. People watch these to find out when their favorite file is going to complete. I've also gotten bug reports about the ETA being much longer than it should be.
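The kind of estimate described above can be sketched as follows: progress is taken as the share of some expected revision total walked so far, and the remaining time is extrapolated from elapsed time. This also shows why the bug in comment #0 matters: if the expected total is the whole project's revision count but the step only processes one chunk, the ETA comes out far too long. Names here are illustrative, not the actual dump-script code:

```python
import time

def xml_walk_eta(revisions_seen, expected_revisions, started_at, now=None):
    """ETA (in seconds) for an XML-walking step.

    fraction_done = revisions walked / expected total; remaining time is
    extrapolated from elapsed time. If expected_revisions is the whole
    project's count while only one chunk is being processed, the
    fraction stays tiny and the ETA is wildly overestimated.
    """
    now = now if now is not None else time.time()
    if revisions_seen == 0:
        return None  # nothing walked yet, no basis for an estimate
    fraction_done = revisions_seen / float(expected_revisions)
    elapsed = now - started_at
    return elapsed * (1.0 - fraction_done) / fraction_done
```

With the correct per-chunk total the same formula is fine; the fix is entirely in what gets passed as `expected_revisions`.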