Last modified: 2011-11-29 03:20:56 UTC
On several occasions we've had corrupt .xml.bz2 files come out of the data dump process. There are several possible causes:

* dbzip2 might be corrupting data
* NFS filesystem transfers might be corrupting data
* gremlins!

Eliminating dbzip2 as a precaution, to see if this improves matters, would be a good start.

Further checks for corrupt files would also be wise, however. Running a 'bzip2 -t' after generation (or even as a simultaneous side process?) may help to detect bad files and mark them appropriately. So far, manually re-running the dump produces a correct file; this could be automated if required.
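A rough sketch of the post-generation check and automatic re-run, for discussion only. It uses Python's standard bz2 module to decompress the whole file, which is a stand-in for shelling out to 'bzip2 -t' rather than the same tool. The names bz2_is_intact and generate_with_retry, the generate callback, and the max_attempts limit are all hypothetical and are not part of worker.py.

```python
import bz2


def bz2_is_intact(path, chunk_size=1 << 20):
    """Return True if the file at `path` decompresses cleanly end to end.

    Hypothetical helper: a stand-in for `bzip2 -t` using the stdlib.
    A missing or unreadable file also counts as not intact.
    """
    try:
        with bz2.open(path, "rb") as stream:
            # Read and discard the output; we only care whether
            # decompression reaches the end without an error.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError):
        # OSError: invalid or damaged data (or the file cannot be opened).
        # EOFError: the compressed stream was truncated.
        return False
    return True


def generate_with_retry(generate, path, max_attempts=3):
    """Call generate(path) until the output passes the integrity check.

    `generate` is an assumed callback that (re)writes the dump file.
    Returns the number of attempts used; raises if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        generate(path)
        if bz2_is_intact(path):
            return attempt
    raise RuntimeError(
        "%s still corrupt after %d attempts" % (path, max_attempts)
    )
```

Decompressing the full file costs about as much as reading the dump once, so for large dumps it may be preferable to run the check as a separate step after the file is closed rather than inline.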
In r33005 I adjusted worker.py to pass dbzip2 mode to dumpTextPass.php only if it is configured to use dbzip2. The next dumps should use regular bzip2 mode.
Just marking this fixed for now...