Last modified: 2011-11-29 03:20:57 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T20808, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 18808 - Page Logging portion of XML Dump is broken
Status: RESOLVED FIXED
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Tomasz Finc
URL: http://download.wikipedia.org/backup-...
 
Reported: 2009-05-15 18:37 UTC by Tomasz Finc
Modified: 2011-11-29 03:20 UTC (History)

Description Tomasz Finc 2009-05-15 18:37:33 UTC
The portion of the XML dump that exports log events for all pages has been broken for a while now. It's generating empty gzip files.
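
One way to spot the symptom described above is to check whether a dump file decompresses to zero bytes. The following is only a minimal sketch using Python's standard gzip module; the helper name `gzip_payload_is_empty` is hypothetical and is not part of the dump scripts.

```python
import gzip


def gzip_payload_is_empty(path):
    """Return True if the gzip file at `path` decompresses to zero bytes.

    Hypothetical helper for illustration; not taken from the dump code.
    """
    with gzip.open(path, "rb") as fh:
        # Reading a single byte is enough to tell empty from non-empty.
        return fh.read(1) == b""
```

Running it over each `*-pages-logging.xml.gz` file would flag the dumps affected by this bug; a file with any XML content at all returns False.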
Comment 1 Tomasz Finc 2009-05-15 19:04:01 UTC
This might be a superfluous step as logging.xml.gz seemingly has all the content of what this step is trying to provide. I'm following up with Aaron Schulz to confirm.
Comment 2 Aaron Schulz 2009-05-15 20:36:59 UTC
I mentioned this a while ago. It probably is due to some second pass. The dumping works fine in my regular test environment.
Comment 3 Tomasz Finc 2009-05-15 20:56:06 UTC
(In reply to comment #2)
> I mentioned this a while ago. It probably is due to some second pass. The
> dumping works fine in my regular test environment.
> 

Can you clarify the intended difference between what the first logging pass does and what the second does? Just trying to make sure I understand why there are two passes.
Comment 4 Aaron Schulz 2009-05-15 21:16:26 UTC
Really there should only be one pass; all the data is there already. Other passes would be the result of bundling the code with the text dumps, which actually do need two passes.
Comment 5 Aaron Schulz 2009-05-15 21:24:16 UTC
So the XmlDump("logging",...) call in worker.py can probably go, since the stub call already does it (though that makes the stub a bit misnamed).
Comment 6 Tomasz Finc 2009-05-23 01:36:36 UTC
I went ahead and just moved it to its own class to keep it clean and less confusing. Patch is ready and I'll check in the fix later tonight after running a couple more iterations of the dumps.
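
The idea of giving the logging dump its own class, separate from the two-pass text dumps, can be sketched roughly as follows. This is an illustration only, not the code from worker.py or from the patch: the class names `TwoPassTextDump` and `LoggingDump`, the `passes` method, and the `plan` helper are all assumptions made for the example.

```python
class TwoPassTextDump:
    """Hypothetical text dump step: a stub pass followed by a text pass."""

    def __init__(self, name):
        self.name = name

    def passes(self):
        return ["stub", "text"]


class LoggingDump:
    """Hypothetical logging step: one pass, since all the data is already there."""

    def __init__(self, name="pages-logging"):
        self.name = name

    def passes(self):
        return ["logging"]


def plan(steps):
    """Return (step name, pass name) pairs in the order they would run."""
    return [(step.name, p) for step in steps for p in step.passes()]
```

With this split, `plan([TwoPassTextDump("pages-articles"), LoggingDump()])` schedules exactly one pass for the logging output, so no second pass is left to overwrite it.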
Comment 7 Tomasz Finc 2009-05-26 03:59:21 UTC
Bugfix checked into r51000 and code is now live. First backup to run with the new code was

http://download.wikimedia.org/eswikibooks/20090526/

and eswikibooks-20090526-pages-logging.xml.gz is complete.

Resolving.

