Last modified: 2013-01-29 17:59:51 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T29115, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 27115 - ETA for jobs that run in parallel is completely wrong
Status: NEW
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal minor
Target Milestone: ---
Assigned To: Ariel T. Glenn
Depends on:
Blocks: 27110
Reported: 2011-02-02 19:58 UTC by Ariel T. Glenn
Modified: 2013-01-29 17:59 UTC (History)
Description Ariel T. Glenn 2011-02-02 19:58:07 UTC
The ETA generated for page text dumps is based on the total number of revisions in the project rather than on the number of revisions we are actually going to retrieve for the chunk. That number is unknown: we don't know how many revisions are contained in the 2 million range of pageIDs we might be getting in one chunk file, and we would be waiting a *very* long time for a select count(*) to complete. Fix this to give some reasonable estimate.
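To make the mismatch concrete, here is a minimal sketch in Python. It is not the dump code itself: the function names, the sample numbers, and the page-range approach are all assumptions made for illustration. The first function measures remaining work against the project-wide revision count, as described above. The second is one possible alternative that measures progress by position within the chunk's pageID range, which is known in advance without any count(*) query.

```python
from datetime import datetime, timedelta


def eta_from_project_total(start, now, revs_done, project_total_revs):
    """Behaviour described in the report: remaining work is measured against
    the revision count of the whole project, not of this chunk."""
    elapsed = (now - start).total_seconds()
    if revs_done <= 0 or elapsed <= 0:
        return None  # no processing rate yet, so no estimate
    rate = revs_done / elapsed  # revisions per second so far
    remaining = max(project_total_revs - revs_done, 0)
    return now + timedelta(seconds=remaining / rate)


def eta_from_page_range(start, now, current_page_id, first_page_id, last_page_id):
    """Hypothetical alternative: progress is the position of the current
    pageID within the chunk's known pageID range."""
    elapsed = (now - start).total_seconds()
    span = last_page_id - first_page_id + 1
    done = current_page_id - first_page_id + 1
    if span <= 0 or done <= 0 or elapsed <= 0:
        return None
    fraction = min(done / span, 1.0)
    # Remaining time scales with the share of the range not yet walked.
    return now + timedelta(seconds=elapsed * (1.0 - fraction) / fraction)


start = datetime(2011, 2, 2, 12, 0, 0)
now = start + timedelta(hours=1)

# One chunk of 2,000,000 pageIDs, half of it walked after one hour.
print(eta_from_page_range(start, now, 1_000_000, 1, 2_000_000))

# The same job measured against a made-up project-wide total of
# 100,000,000 revisions, with 5,000,000 revisions written so far.
print(eta_from_project_total(start, now, 5_000_000, 100_000_000))
```

With these made-up numbers the page-range estimate is one more hour, while the project-total estimate is roughly nineteen more hours for the same chunk. The page-range version silently assumes revisions are spread evenly across pageIDs, which is unlikely to hold in practice, so it would only give a rough estimate rather than an accurate one.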
Comment 1 Andre Klapper 2013-01-29 17:39:20 UTC
(In reply to comment #0)
> The ETA generated for page text dumps is based on the total number of
> revisions in the project rather than on the number of revisions we are actually going

Looking at http://dumps.wikimedia.org/backup-index.html I see that some processes don't provide an ETA at all:

2013-01-29 17:37:42 cswiki: Dump in progress
    2013-01-29 14:14:25 in-progress All pages with complete edit history (.7z)
        cswiki-20130129-pages-meta-history.xml.7z 1.0 GB (written) 

Is that intended? And for whom is showing an ETA interesting?
Comment 2 Ariel T. Glenn 2013-01-29 17:59:51 UTC
The recompression steps, for example, don't show an ETA because there is no simple way to make an estimate. The only steps with an ETA are those that walk through XML; they generate an estimate based on page and revision info.

People watch these to find out when their favorite file is going to complete. I've also gotten bug reports about the ETA being much longer than it should be.
