Last modified: 2014-11-19 22:14:54 UTC
Generate data on the cluster, then either move it onto the labsdbs for wikimetrics or make it directly available to Vital Signs. There are 2 metrics: Monthly Pageviews and Daily Pageviews. They are broken down by project (wiki) and by target site (same definition of target site as in bug https://bugzilla.wikimedia.org/show_bug.cgi?id=72737).
Collaborative tasking on etherpad: http://etherpad.wikimedia.org/p/analytics-72740
Change 169974 had a related patch set uploaded by Milimetric: Transform projectcounts hourly files https://gerrit.wikimedia.org/r/169974
Change 172194 had a related patch set uploaded by QChris: Add basic python setup https://gerrit.wikimedia.org/r/172194
Change 172195 had a related patch set uploaded by QChris: Add basic implementation of projectcount aggregation https://gerrit.wikimedia.org/r/172195
Change 172196 had a related patch set uploaded by QChris: Add basic monitoring script for projectcount aggregates https://gerrit.wikimedia.org/r/172196
Change 172197 had a related patch set uploaded by QChris: Allow additional logging to disk for projectcounts aggregation https://gerrit.wikimedia.org/r/172197
Change 172198 had a related patch set uploaded by QChris: Add switch for automatic pushing of data repo for projectcounts aggregation https://gerrit.wikimedia.org/r/172198
Change 172181 had a related patch set uploaded by QChris: Create empty CSVs for relevant wikis https://gerrit.wikimedia.org/r/172181
Change 172182 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2008 https://gerrit.wikimedia.org/r/172182
Change 172183 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2009 https://gerrit.wikimedia.org/r/172183
Change 172185 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2010 https://gerrit.wikimedia.org/r/172185
Change 172186 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2011 https://gerrit.wikimedia.org/r/172186
Change 172187 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2012 https://gerrit.wikimedia.org/r/172187
Change 172188 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2013 https://gerrit.wikimedia.org/r/172188
Change 172189 had a related patch set uploaded by QChris: Backfill daily projectcounts for 2014 up to 2014-09-22 https://gerrit.wikimedia.org/r/172189
Change 172190 had a related patch set uploaded by QChris: Backfill daily projectcounts up to 2014-11-08 https://gerrit.wikimedia.org/r/172190
Change 172201 had a related patch set uploaded by QChris: Add jobs for aggregating hourly projectcount files to daily per wiki csvs https://gerrit.wikimedia.org/r/172201
Change 172285 had a related patch set uploaded by QChris: Link aggregator dataset into wikimetrics public webspace https://gerrit.wikimedia.org/r/172285
Per discussions last week, we changed the scope to daily page views only. Monthly page views is another story.
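For anyone following along, here is a rough sketch of what "daily page views per project" means in terms of the hourly files. It is not the code from the patches above. The line layout (`project - count bytes`), the function names `parse_hourly_line` and `aggregate_daily`, and the `(date, lines)` input shape are all assumptions made for illustration.

```python
from collections import defaultdict


def parse_hourly_line(line):
    """Parse one hourly line, ASSUMED to look like 'en - 1234 5678'.

    Returns (project, count), or None if the line does not match
    the assumed layout.
    """
    parts = line.split()
    if len(parts) != 4:
        return None
    project, _, count, _ = parts
    try:
        return project, int(count)
    except ValueError:
        return None


def aggregate_daily(hourly_files):
    """Sum hourly counts into daily totals per project.

    hourly_files: iterable of (date_string, lines) pairs, one pair per
    hourly file (hypothetical input shape; the real job reads files
    from disk).
    Returns a dict mapping (date_string, project) to the daily total.
    """
    totals = defaultdict(int)
    for date, lines in hourly_files:
        for line in lines:
            parsed = parse_hourly_line(line)
            if parsed is None:
                continue  # skip lines that do not fit the assumed layout
            project, count = parsed
            totals[(date, project)] += count
    return dict(totals)
```

Each (date, project) total would then become one row in that project's CSV. Splitting by target site would need an extra key derived from the project abbreviation, which is left out here.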
Change 172194 merged by Nuria: Add basic python setup https://gerrit.wikimedia.org/r/172194
Change 172195 merged by jenkins-bot: Add basic implementation of projectcount aggregation https://gerrit.wikimedia.org/r/172195
Change 172196 merged by jenkins-bot: Add basic monitoring script for projectcount aggregates https://gerrit.wikimedia.org/r/172196
Change 172197 merged by jenkins-bot: Allow additional logging to disk for projectcounts aggregation https://gerrit.wikimedia.org/r/172197
Change 172198 merged by jenkins-bot: Add switch for automatic pushing of data repo for projectcounts aggregation https://gerrit.wikimedia.org/r/172198
Change 172186 merged by Mforns: Backfill daily projectcounts for 2011 https://gerrit.wikimedia.org/r/172186
Change 172185 merged by Mforns: Backfill daily projectcounts for 2010 https://gerrit.wikimedia.org/r/172185
Change 172183 merged by Mforns: Backfill daily projectcounts for 2009 https://gerrit.wikimedia.org/r/172183
Change 172182 merged by Mforns: Backfill daily projectcounts for 2008 https://gerrit.wikimedia.org/r/172182
Change 172181 merged by Mforns: Create empty CSVs for relevant wikis https://gerrit.wikimedia.org/r/172181
Change 172190 merged by Mforns: Backfill daily projectcounts up to 2014-11-08 https://gerrit.wikimedia.org/r/172190
Change 172189 merged by Mforns: Backfill daily projectcounts for 2014 up to 2014-09-22 https://gerrit.wikimedia.org/r/172189
Change 172188 merged by Mforns: Backfill daily projectcounts for 2013 https://gerrit.wikimedia.org/r/172188
Change 172187 merged by Mforns: Backfill daily projectcounts for 2012 https://gerrit.wikimedia.org/r/172187
Interesting! Will the CSVs end up on http://datasets.wikimedia.org/aggregate-datasets/ or elsewhere? Wherever they end up, please add a link from the raw files (under http://dumps.wikimedia.org/other/ ).
The CSVs will be in a repo that we are thinking of checking out under the wikimetrics directory for the time being, but as a git clone so anyone can check it out anywhere. At this time we are still vetting the data, though.
Change 172285 merged by Ottomata: Link aggregator dataset into wikimetrics public webspace https://gerrit.wikimedia.org/r/172285