Last modified: 2014-10-31 12:53:16 UTC
The two bits partitions [1] for 2014-10-08T1[89]:xx:xx were not marked successful. What happened?

[1] _________________________________________________________________
qchris@stat1002 // jobs: 0 // time: 13:05:11 // exit code: 0
cwd: ~/cluster-scripts
./dump_webrequest_status.sh
  +---------------------+--------+--------+--------+--------+
  | Date                |  bits  |  text  | mobile | upload |
  +---------------------+--------+--------+--------+--------+
[...]
  | 2014-10-08T16:xx:xx |    .   |    .   |    .   |    .   |
  | 2014-10-08T17:xx:xx |    .   |    .   |    .   |    .   |
  | 2014-10-08T18:xx:xx |    X   |    .   |    .   |    .   |
  | 2014-10-08T19:xx:xx |    X   |    .   |    .   |    .   |
  | 2014-10-08T20:xx:xx |    .   |    .   |    .   |    .   |
  | 2014-10-08T21:xx:xx |    .   |    .   |    .   |    .   |
[...]
  +---------------------+--------+--------+--------+--------+

Statuses:

  . --> Partition is ok
  X --> Partition is not ok (duplicates, missing, or nulls)
Checking for missing sequence numbers across the hour boundary does not make the missing entries go away [1], so this does not seem to be the race condition described in bug 69615. Since it seems only esams bits are affected, it might be another instance of bug 71435. More investigation is needed.

[1] _________________________________________________________________
qchris@stat1002 // jobs: 0 // time: 12:55:38 // exit code: 0
cwd: ~/refinery
hive -f two_hour_stats.hql -d table=wmf_raw.webrequest -d webrequest_source=bits -d year=2014 -d month=10 -d day=8 -d hourA=18 -d hourB=19
[...]
Total MapReduce CPU Time Spent: 0 days 2 hours 44 minutes 50 seconds 570 msec
OK
hostname                    sequence_min  sequence_max  count_actual  count_expected  count_different  count_duplicate  count_null_sequence  percent_different
cp3020.esams.wikimedia.org  2135978184    2170880606    34085296      34902423        817127           0                0                    -2.3411755682406348
cp3022.esams.wikimedia.org  2389095903    2423994513    34670200      34898611        228411           0                0                    -0.6544988280479128
cp3019.esams.wikimedia.org  2137887090    2172774146    34532139      34887057        354918           0                0                    -1.0173343082507649
Time taken: 265.71 seconds, Fetched: 3 row(s)
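
For readers who want to sanity-check the numbers in the Hive output above, here is a minimal Python sketch that recomputes count_expected, count_different and percent_different from sequence_min, sequence_max and count_actual. The formulas are an assumption inferred from the reported values (they reproduce all three rows); they are not taken from two_hour_stats.hql itself, and the function name sequence_stats is hypothetical.

```python
# Rows copied from the Hive output: (hostname, sequence_min, sequence_max, count_actual)
rows = [
    ("cp3020.esams.wikimedia.org", 2135978184, 2170880606, 34085296),
    ("cp3022.esams.wikimedia.org", 2389095903, 2423994513, 34670200),
    ("cp3019.esams.wikimedia.org", 2137887090, 2172774146, 34532139),
]


def sequence_stats(sequence_min, sequence_max, count_actual):
    """Recompute the per-host statistics (assumed formulas, see note above)."""
    # With gap-free sequence numbers, a host should have logged one line
    # per sequence number in the inclusive range [sequence_min, sequence_max].
    count_expected = sequence_max - sequence_min + 1
    # Number of lines missing from the range.
    count_different = count_expected - count_actual
    # Negative values mean fewer lines arrived than expected.
    percent_different = (count_actual - count_expected) / count_expected * 100.0
    return count_expected, count_different, percent_different


for hostname, seq_min, seq_max, actual in rows:
    expected, different, percent = sequence_stats(seq_min, seq_max, actual)
    print(f"{hostname}: expected={expected} missing={different} ({percent:.4f}%)")
```

Under these assumed formulas, the three esams hosts are missing between roughly 0.65% and 2.34% of their lines over the combined two-hour window, with no duplicates and no null sequence numbers.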