Last modified: 2014-07-15 13:15:12 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T39463, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 37463 - Key performance indicator: Gerrit review queue


Summary:	Key performance indicator: Gerrit review queue

Status:	ASSIGNED

Product:	Analytics
Classification:	Unclassified
Component:	Tech community metrics (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	High enhancement
Target Milestone:	---
Assigned To:	Quim Gil

URL:
Whiteboard:
Keywords:

Depends on:	57038 58349 58424 58426 58428
Blocks:
	Show dependency tree / graph

Reported:	2012-06-11 14:19 UTC by Sam Reed (reedy)
Modified:	2014-07-15 13:15 UTC (History)
CC List:	15 users (show)

See Also:	53485
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Attachments
Add an attachment (proposed patch, testcase, etc.)

Description Sam Reed (reedy) 2012-06-11 14:19:02 UTC

We need some sort of statistics for the backlogs of review

CodeReview had some basic stats, that gave these sorts of numbers [1]

The simplest useful statistic would be number of unmerged (and not abandoned) revisions per git repository



[1] https://www.mediawiki.org/wiki/Special:Code/MediaWiki/stats

Comment 1 Alex Monk 2012-06-11 15:47:30 UTC

Number of open changes per project:

username = 'krenair'
import os, json, operator
lines = os.popen('ssh ' + username + '@gerrit.wikimedia.org -p 29418 gerrit query status:open --format json').read().splitlines()
projects = {}
for line in lines:
    data = json.loads(line)
    if 'project' not in data:
        continue
    elif data['project'] not in projects:
        projects[data['project']] = 1
    else:
        projects[data['project']] += 1

for project, unmergedcount in sorted(projects.iteritems(), key=operator.itemgetter(1), reverse=True):
    print project, unmergedcount

Comment 2 Sam Reed (reedy) 2012-06-11 18:00:51 UTC

I forgot we had people working on this (ie Diederik)

https://gerrit.wikimedia.org/r/#/c/9627/

Comment 3 Dereckson 2012-06-12 08:20:51 UTC

What's the Nagios URL?

Comment 4 Chad H. 2013-02-12 17:48:32 UTC

How about Gitblit?

For a repo:
https://gerrit.wikimedia.org/r/plugins/gitblit/metrics/?r=mediawiki/core.git

Overall:
https://gerrit.wikimedia.org/r/plugins/gitblit/activity/

Comment 5 Andre Klapper 2013-02-13 06:50:35 UTC

(In reply to comment #4)
> How about Gitblit?

For the records: depends on bug 38383.

Comment 6 Quim Gil 2013-06-28 21:20:32 UTC

We are planning to have Gerrit statistics soon-ish. Álvaro del Castillo (CCed) knows the details, and if you want you can assign this report to him. See

http://www.mediawiki.org/wiki/Community_metrics#Metrics_dashboard

http://blog.bitergia.com/2013/06/06/first-steps-mining-gerrit/

Comment 7 Quim Gil 2013-08-22 16:37:05 UTC

See the first iteration:

http://korma.wmflabs.org/browser/scr.html
http://korma.wmflabs.org/browser/scr-repos.html

https://www.mediawiki.org/wiki/Gerrit/Navigation#Reports ha also been around for some time. Hopefully Korma will match that data and some point in order to have everything in one place.

PS: could this report be assigned to acs@bitergia in order to reflect the current situation?

Comment 8 Quim Gil 2013-11-15 16:51:47 UTC

Time to work on Gerrit review queue metrics.

See https://www.mediawiki.org/wiki/Community_metrics#Gerrit_review_queue

These are the questions we want to answer in this iteration:

How long is the Gerrit review queue over time? How long does it take to review code contributions? Are we improving? Are Wikimedia's staff and non-staff contributions processed equally? Who is doing better and worse?

* Number of Wikimedia staff / non-Wikimedia-staff commits reviewed in <2 days, <1 week, <1 month, <3 months, >3 months or unreviewed (values may be fine tuned based on actual data).
* Same as above, per project. Ranked from slowest to fastest.

How to calculate the average time? A few old open changes might distort the picture. What if we compare the pure average of the 100% of commits with the average removing the 10% of slowest and fastest?

One chart with extra data in a table? What kind of chart?

Comment 9 Alvaro 2013-11-15 21:36:44 UTC

Great! Next week time to start working on it!

Comment 10 Alvaro 2013-11-28 06:44:25 UTC

Ok, we have some results no. The gerrit backend for Bicho has being improved in order to get the MERGED and ABANDONED real event dates.

We have added some new vizs to the gerrit panel in korma dashboard:

http://korma.wmflabs.org/browser/scr.html

The last one is the accumulated pending reviews in time.

We need to find why so many reviews were merged in 2013-04. It seems that the merged date was not correct before this date. So probably, for this kind of analysis, we have only good data after 2013-04.

In order to get the tables for this data using just SQL:

select new-closed as increment, new, closed, year_new, month_new FROM ( select count(issue_id) as new, YEAR(changed_on) as year_new, MONTH(changed_on) as month_new FROM changes WHERE (new_value='new') GROUP BY YEAR(changed_on),MONTH(changed_on)) t, (select count(issue_id) as closed, YEAR(changed_on) as year_closed, MONTH(changed_on) as month_closed FROM changes WHERE (new_value='merged' OR new_value='abandoned') GROUP BY YEAR(changed_on),MONTH(changed_on)) t1 WHERE year_new = year_closed and month_new = month_closed;
+-----------+------+--------+----------+-----------+
| increment | new  | closed | year_new | month_new |
+-----------+------+--------+----------+-----------+
|       184 |  206 |     22 |     2011 |         9 |
|       339 |  367 |     28 |     2011 |        10 |
|       596 |  619 |     23 |     2011 |        11 |
|       495 |  543 |     48 |     2011 |        12 |
|       364 |  384 |     20 |     2012 |         1 |
|       661 |  699 |     38 |     2012 |         2 |
|       940 | 1058 |    118 |     2012 |         3 |
|      1795 | 1954 |    159 |     2012 |         4 |
|      2681 | 2812 |    131 |     2012 |         5 |
|      3036 | 3192 |    156 |     2012 |         6 |
|      2459 | 2611 |    152 |     2012 |         7 |
|      3519 | 3694 |    175 |     2012 |         8 |
|      2658 | 2797 |    139 |     2012 |         9 |
|      3888 | 4098 |    210 |     2012 |        10 |
|      3891 | 4121 |    230 |     2012 |        11 |
|      3778 | 3961 |    183 |     2012 |        12 |
|      3887 | 4101 |    214 |     2013 |         1 |
|      3202 | 3393 |    191 |     2013 |         2 |
|    -37432 | 4118 |  41550 |     2013 |         3 |
|       -21 | 3922 |   3943 |     2013 |         4 |
|       150 | 3846 |   3696 |     2013 |         5 |
|        22 | 3919 |   3897 |     2013 |         6 |
|        31 | 4321 |   4290 |     2013 |         7 |
|       103 | 3785 |   3682 |     2013 |         8 |
|       -12 | 3375 |   3387 |     2013 |         9 |
|        75 | 4394 |   4319 |     2013 |        10 |
|        70 | 3502 |   3432 |     2013 |        11 |
+-----------+------+--------+----------+-----------+

Is this the kind of information you are searching for?

The other important analysis is the backlog: the distribution in time of the pending reviews (when they were opened). We have already done that but not yet included in korma dashboard for gerrit.

We can build this analysis per repository, company and people!

Next goal is to work in the review time analysis.

Comment 11 Alvaro 2013-12-05 07:51:37 UTC

Added info about time to review to the gerrit panel in korma:

http://korma.wmflabs.org/browser/scr.html

The average time for time to review is: 3.7333

And you can see also the evolution in time.

The data is from 2013-05 to now because before this date, there are problems with the change date in gerrit data.

Comment 12 Quim Gil 2013-12-13 00:39:30 UTC

Summary of a chat with Alvaro reviewing https://www.mediawiki.org/wiki/Community_metrics#Gerrit_review_queue :

* How long is the Gerrit review queue over time? 

Answered an the "Pending" graph at http://korma.wmflabs.org/browser/scr.html --the data is wrong as reported in Bug 58349

* How long does it take to review code contributions? Are we improving?

Answered at the "Average time to review" at http://korma.wmflabs.org/browser/scr.html . The data might be also wrong based on Bug 58349. Also, currently the pure average is computed, but we want to calculate the median instead. This will help avoiding the distortion caused by superfast intra-team reviews and reviews stalled during months.

* Are Wikimedia's staff and non-staff contributions processed equally? 

Pending. We need to cross the "Average time to review" data with the affiliation data from http://korma.wmflabs.org/browser/contributors.html -- bug 58424

* Who is doing better and worse?

Pending. We need to display the "Average time to review" by repository. -- bug 58426

We didn't have time to discuss how to visualize more details about the review queues of each repository beyond the median -- bug 58428

Comment 13 Quim Gil 2013-12-23 21:20:07 UTC

As per bug 53485 comment 35 and 39, let's not count self-merges (reviews merged by the author of the patch) in any "average time to review" metrics.

Comment 14 Nemo 2014-02-07 08:29:13 UTC

From what above and bug 57038, it's not clear to me if you're currently relying on the database or the API. Anyway.

(In reply to comment #13)
> As per bug 53485 comment 35 and 39, let's not count self-merges (reviews
> merged
> by the author of the patch) in any "average time to review" metrics.

PleaseStand noted that we now have better searches by label: https://gerrit.wikimedia.org/r/Documentation/user-search.html#labels (2.7 was https://gerrit-documentation.storage.googleapis.com/Documentation/2.7/user-search.html ).

For instance.
* Core changes by non-WMF people with +2 by WMF people: <https://gerrit.wikimedia.org/r/#/q/-ownerin:wmf+label:Code-Review%253D2%252Cwmf+project:mediawiki/core,n,z> (with some approximation because the LDAP group is not complete).
* Commits by a user minus self-merges: <https://gerrit.wikimedia.org/r/#/q/owner:mark%2540wikimedia.org+label:Code-Review%253D2+-label:Code-Review%253D2%252Cmark%2540wikimedia.org,n,z>.
* Merges by a user minus self-merges: <https://gerrit.wikimedia.org/r/#/q/project:mediawiki/core+-owner:%22Siebrand+%253Csiebrand%2540wikimedia.org%253E%22+label:Code-Review%253D2%252Csiebrand%2540wikimedia.org,n,z> (restricted to core, gives zero results if more general probably because Siebrand has too many).

It doesn't always work reliably if you don't specify a project. Should work in the same way for API, e.g. <https://gerrit.wikimedia.org/r/changes/?q=-ownerin:wmf+label:Code-Review=2,wmf+project:mediawiki/core&n=25> (<https://gerrit.wikimedia.org/r/Documentation/rest-api-changes.html#list-changes>).

Comment 15 Quim Gil 2014-02-21 23:13:18 UTC

Status update (Nemo, I will leave your comment to Alvaro since I don't know better)

We have some graphs here:

http://korma.wmflabs.org/browser/gerrit_review_queue.html

"How long is the Gerrit review queue over time?" and "Backlog of pending reviews (when reviews were submitted)" could be put together, both starting when Gerrit was set up (march 2012?).

Maybe they could even be combined in a single graph? (optional)

Little known bug: all graphs should have Y axis starting with 0.

"How long does it take to review code contributions?" and "Are Wikimedia's staff and non-staff contributions processed equally?" also belong to each other. At the end the second question equals "How long does it take to review code contributions... from WMF, independents, WMDE, Wikia...?"

We are still discussing what to count exactly here. We started the discussion in our weekly meeting and we will continue here. The last assumption (which is not reflected in the graphs now) is:

For each measurement timespot, median of the ages of all reviews still open at
that timespot, understanding as "age" the time period since the review
was open to the measurement timespot.

Note that a review can have multiple patches. I'm proposing to count the age of the last patch instead of the age of creation of the review. What we are most interested in seeing is lack of activity, languishing reviews. Counting the age since creation is also useful, as secondary value (a second graph?)

"Who is doing better and worse?" should be sorted by median age of open reviews, starting by those repos with oldest patches awaiting review.

------------------------

In our discussion there were more ideas about data we could count. I'm documenting them here, although this might be relegated to a second iteration (since we have other KPIs that we want to complete in the following weeks):

* median time between uploading the changeset and receiving a first review

* average of iterations vs time since creation (some changesets may take long to review but they were nailed in a couple of revisions as soon as someone answered, while others required many iterations)

* normalization of data for big projects, calculating number of revisions within a period compared with the previous period.

Comment 16 Nemo 2014-02-23 14:14:23 UTC

The graph would use a set of legends or reference manual. For instance, I have no idea at all what "Reviews submitted" and "Reviews new" could mean.

Comment 17 Quim Gil 2014-02-24 16:28:09 UTC

Yes, labels and descriptions will be reviewed before closing this task. The "?" icons next to the tables contain some information.

Improvements to the text of the page can be proposed directly at https://github.com/Bitergia/mediawiki-dashboard/blob/master/browser/gerrit_review_queue.html (Alvaro, where are the "?" legends located?)

I agree "Reviews submitted" and "Reviews new" are unclear labels.

Comment 18 Quim Gil 2014-02-25 16:24:33 UTC

We have a new graph at http://korma.wmflabs.org/browser/gerrit_review_queue.html labeled

"How long does it take to review code contributions?" "For pending reviews" (final labels pending)

It calculates the average and median age of the changes opened at a given time. Currently that age is based on the date the changesets where initially uploaded, but I requested to calculate the age based on the date of the last patchset uploaded, in order to reflect more accurately the age of changesets that are stuck.

The list of repos at the end of the page will be sorted by this median age in the past month, in order to get at the top those repos that currently accummulate the oldest patchsets stuck.

Comment 19 Alvaro 2014-02-27 13:33:52 UTC

Update:

The list of repos is now sorted by the median age of the changes opened at a given time (month) and the evol in time of this metric is shown.

Also, for affiliations this metric has being also added.

Next step is to try the change: to calculate the age based on the date of the last patchset uploaded.

Comment 20 Alvaro 2014-02-27 15:53:39 UTC

About "to calculate the age based on the date of the last patchset uploaded" analyzing bicho gerrit database, the upload event is not stored, but the actions against patchsets (Code-Review, Verified). 

So we can use now the last "Code-Review" event in the review. And in a next iteration, improve gerrit backend for bicho so all upload changes are recorded.

All patchsets receives some kind of review so if we are not reviewing the time to review (right now the goal is to detect patchsets not reviewed, not time to review) I think this is a reasonable approach.

What do you think?

Comment 21 Alvaro 2014-02-27 16:38:24 UTC

(In reply to Alvaro from comment #20)
> About "to calculate the age based on the date of the last patchset uploaded"
> analyzing bicho gerrit database, the upload event is not stored, but the
> actions against patchsets (Code-Review, Verified). 
> 
> So we can use now the last "Code-Review" event in the review. And in a next
> iteration, improve gerrit backend for bicho so all upload changes are
> recorded.
> 
> All patchsets receives some kind of review so if we are not reviewing the
> time to review (right now the goal is to detect patchsets not reviewed, not
> time to review) I think this is a reasonable approach.
> 
> What do you think?

In order to understand better the expected results I have added another line to the pending accumulated time to show the time from last update metric. The time from last upload will be between these two lines.

http://korma.wmflabs.org/browser/gerrit_review_queue.html

Comment 22 Quim Gil 2014-02-27 16:45:53 UTC

VERY COOL! These news graphs are very useful.

Calculating "last code review" is not bad, and in fact it a useful measure per se. Jenkins reviews every new patchset within seconds, so at least we know that no patchset will be missed because of lack of reviews. Let's go with is, without removing the current graphs based on age of the first upload because they are also useful. This is good enough for now, and maybe all we need.

Proposed steps:

* in "Backlog of pending reviews", add the total amount of changesets submitted every month (optional, we can leave this for later)

* the first graph at "Are Wikimedia's staff..." can be removed; it is a bit confusing and the second one explains better what we are looking for

* in "Are Wikimedia's staff...", add "last code review median age at a given time"

* in "Who is doing better and worse?", add a line for "last code review median age at a given time", without removing review_time_pending_days_acc_median, and sort repos by "last code review median age at a given time"

* remember the fix to make all Y axis start with zero

* also please have one graph per row (except for the list of repos, of course

* and a small detail, currently the page as some horizontal scroll, and some graphs have no margin to the left, which I guess could be solved wrapping everything in a <div> width:95% or something.

After this I believe your part will be ready, and then we can work on better descriptions.

Comment 23 Nemo 2014-02-27 18:10:04 UTC

I note that all graphs are still completely meaningless because they still don't exclude self merges.

Babel (<http://korma.wmflabs.org/browser/repository.html?repository=gerrit.wikimedia.org_mediawiki_extensions_Babel>), which for some reason is ranked second in <http://korma.wmflabs.org/browser/gerrit_review_queue.html> despite having only one open change (about 6 months old, but -2'd), is depicted as some sort of heaven where in every single month the same amount of commits are opened and merged. The reality is that it's a very quiet extension with only 9 changes in the last year, the rest being localisation updates. <https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Babel+-owner:l10n-bot,n,z>

At the very least, l10n-bot should have its own line in graphs, it doesn't make sense to mix it with "Independent" or "Unknown" whatever those lines in "Are we improving?" mean.

Comment 24 Quim Gil 2014-02-27 21:03:31 UTC

(In reply to Nemo from comment #23)
> I note that all graphs are still completely meaningless because they still
> don't exclude self merges.

Could you point to a Gerrit changeset with an example of a self merge, please? I'm still not sure I understand what you mean.

 
> Babel
> (<http://korma.wmflabs.org/browser/repository.html?repository=gerrit.
> wikimedia.org_mediawiki_extensions_Babel>), which for some reason is ranked
> second in <http://korma.wmflabs.org/browser/gerrit_review_queue.html>
> despite having only one open change (about 6 months old, but -2'd), is
> depicted as some sort of heaven where in every single month the same amount
> of commits are opened and merged. The reality is that it's a very quiet
> extension with only 9 changes in the last year, the rest being localisation
> updates.
> <https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Babel+-
> owner:l10n-bot,n,z>

That list aims to be a ranking of shame. The projects at the top are the ones with a oldest median of unresolved changesets (now counted by date of first upload, soon by date of last review). 

Looking at https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Babel+-owner:l10n-bot,n,z , the fact that Babel appears there shows that there might be a bug somewhere. That table shows a couple of changesets with "Label not applicable", but clicking themm one ca see that have been abandoned. Does Babel have any review open currently? Looks like not.

Alvaro, can you explain the results of Babel based on the data you have gathered from Gerrit?

 
> At the very least, l10n-bot should have its own line in graphs, it doesn't
> make sense to mix it with "Independent" or "Unknown" whatever those lines in
> "Are we improving?" mean.

We removed l10n-bot from http://korma.wmflabs.org/browser/who_contributes_code.html but I'm not sure whether it is being counted here.

Comment 25 Jesus M. Gonzalez-Barahona 2014-02-27 22:49:32 UTC

(In reply to Quim Gil from comment #24)
> > At the very least, l10n-bot should have its own line in graphs, it doesn't
> > make sense to mix it with "Independent" or "Unknown" whatever those lines in
> > "Are we improving?" mean.
> 
> We removed l10n-bot from
> http://korma.wmflabs.org/browser/who_contributes_code.html but I'm not sure
> whether it is being counted here.

It would be nice having a list of bots to be excluded from the calculus. From what I read in this ticket, it is clear that l10n-bot should be one of them. There are some more?

In adition, I understand that we should remove them from *all* the charts related to performance or time-to-x, not only from lists of persons. Am I right? I guess right now they are now only excluded from lists, but not from this kind of charts. Alvaro can for sure clarify this.

Comment 26 Jesus M. Gonzalez-Barahona 2014-02-27 22:51:14 UTC

(In reply to Quim Gil from comment #24)
> (In reply to Nemo from comment #23)
> > I note that all graphs are still completely meaningless because they still
> > don't exclude self merges.
> 
> Could you point to a Gerrit changeset with an example of a self merge,
> please? I'm still not sure I understand what you mean.

Nemo, are you referring to code review processes where the uploader of the changeset and the person accepting the merge is same person?

Comment 27 Nemo 2014-02-28 07:05:10 UTC

(In reply to Jesus M. Gonzalez-Barahona from comment #26)
> Nemo, are you referring to code review processes where the uploader of the
> changeset and the person accepting the merge is same person?

Yes. I provided queries to exclude them in comment 14.

Comment 28 Quim Gil 2014-02-28 07:21:29 UTC

Ok, I see:

https://gerrit.wikimedia.org/r/#/c/114180/

Owner	        Mark Bergsma
Uploaded	Feb 19, 2014 10:48 AM
Mark Bergsma	Patch Set 1: Code-Review+2	Feb 19 10:50 AM

Self-merge in 2 minutes...

Sorry Nemo if I sounded stupid or deaf. I *really* didn't know that merging your own patches was a) a practice in our project and even b) something that Gerrit would allow someone to do someone unless you buy "him" a server or something.  :)

So yes, if uploader and +2 is the same user, then that changeset should be ignored when calculating times to review. Thank you Nemo.

Comment 29 Jesus M. Gonzalez-Barahona 2014-02-28 07:34:41 UTC

Thanks to both of you. Definitely, we will look into that to fix our query.

Comment 30 Alvaro 2014-02-28 07:57:58 UTC

Hi guys, about the automerge and "instant reviews" we are already dealing with this in the last metrics:

https://github.com/VizGrimoire/VizGrimoireR/blob/mediawiki/vizgrimoire/SCR.py#L728

This is the basic query for getting the time to review and we are filtering:

 filters += " AND i.submitted_by<>changes.changed_by "

and also

 min_days_for_review = 0.042 # one hour

so in this metrics, no auto reviews and no <1h reviews are analyzed.

If you feel comfortable with that we can apply the same filtering to all data when getting closed submissions.

We will also filter out "i18n" submissions.

Opinions?

Comment 31 Nemo 2014-02-28 08:12:07 UTC

(In reply to Alvaro from comment #30)
> This is the basic query for getting the time to review and we are filtering:
> 
>  filters += " AND i.submitted_by<>changes.changed_by "

Can't review the query syntax but the principle is ok.

> 
> and also
> 
>  min_days_for_review = 0.042 # one hour

This I don't like. Speedy reviews are still reviews: sure, they may be mostly typo fixes or changes coming from pair programming or whatever, but it's still valid. Don't draw an arbitrary line, instead use appropriate percentiles. If the median is still too skewed, use the 75th percentile as the WMF is doing... I don't remember where.

> We will also filter out "i18n" submissions.

This is ok. Ideally you'd leave them in *one* graph i.e. the page on l10n-bot account itself, so that we have one place where to monitor its activity. But it's not necessary, I can also get such data from ohloh once I convince Nikerabbit to reply to its emails. :P

Comment 32 Quim Gil 2014-02-28 15:11:54 UTC

(In reply to Nemo from comment #31)
> (In reply to Alvaro from comment #30)
> >  min_days_for_review = 0.042 # one hour
> 
> This I don't like. Speedy reviews are still reviews

Agreed, speedy reviews happen when there are at least two developers working together fast. It is a factor to be considered in a project. It is also a factor that might increase the differences between WMF results and the rest, since it is easier to get two WMF employees in a same team working in such conditions that, say, and independent developer waiting for the review of someone with +2.

> If the median is still too skewed, use the 75th percentile as
> the WMF is doing... I don't remember where.

I think the pure median is good until someone comes with a better argument. The graphs should be good enough to show trends over time, and differences between affiliation and repositories.

 
> > We will also filter out "i18n" submissions.

One question, then. Do we need to filter explicitly changesets uploaded by bots, or will they be automatically filtered if/when they are merged by themselves? Or are the localization patches still merged manually?

But yes, in case of doubt let's remove bots from the equation. The purpose of these community metrics is to analyze the performance of humans.

Comment 33 Alvaro 2014-03-04 17:07:14 UTC

Data updated removing l10n and adding Speedy reviews:

http://korma.wmflabs.org/browser/gerrit_review_queue.html

Also the pending metric has being improved.

Comment 34 Quim Gil 2014-03-04 18:07:49 UTC

Thank you! Please try to address Comment #22 and "the paradox of Babel" :) explained in Comment #24 before our meeting tomorrow. If we want to have all KPIs solved this month, we need to close this one asap. Thank you!

Comment 35 Alvaro 2014-03-05 03:50:12 UTC

(In reply to Quim Gil from comment #34)
> Thank you! Please try to address Comment #22 and "the paradox of Babel" :)
> explained in Comment #24 before our meeting tomorrow. If we want to have all
> KPIs solved this month, we need to close this one asap. Thank you!

Ok, let's go with that. Filtering now l10bot the activity in babel has decreased a lot:

http://korma.wmflabs.org/browser/repository.html?repository=gerrit.wikimedia.org_mediawiki_extensions_Babel

26 total submitted, 23 merged, 2 abandoned and 1 pending.

In gerrit data there are 26 submissions:

https://gerrit.wikimedia.org/r/#/q/project:mediawiki/extensions/Babel+-owner:l10n-bot,n,z

So the data is correct.

About the high time to review, we are using the median, so if there are only one open review:

https://gerrit.wikimedia.org/r/#/c/62601/

from "May 7, 2013 3:52 PM", the median review time is this review time: 301 days. This is why it is the second worst project using this metric.

Cheers

Comment 36 Quim Gil 2014-03-05 04:56:33 UTC

Ah, ok, I guess I had confused that "-2" with rejected/abandoned. So these -2 are still open. Understood, thank you.

What I don't understand is the flat line that repositories like Babel have. If there is only one changeset and the median age at the end of February was 301 days, it cannot be 301 days a month earlier as well, right?

That changeset was uploaded on May 7, 2013, and if this was the only patch around I would expect a line starting with 24 days at the end of May, 54 at the end of June... until the current 301. Or am I missing something?

Comment 37 Alvaro 2014-03-05 05:48:28 UTC

(In reply to Quim Gil from comment #36)
> Ah, ok, I guess I had confused that "-2" with rejected/abandoned. So these
> -2 are still open. Understood, thank you.

Ok!

> 
> What I don't understand is the flat line that repositories like Babel have.
> If there is only one changeset and the median age at the end of February was
> 301 days, it cannot be 301 days a month earlier as well, right?

Let me see ... the flat line for Babel starts at May 2013 according to

http://korma.wmflabs.org/browser/gerrit_review_queue.html

And this is ok with the review opened at 7 May and already pending.

No more open reviews are pending, so each month the median review time for pending is the same, the review time for this pending review.

Remember that the review time is computed for all submissions using the NOW date. 

> 
> That changeset was uploaded on May 7, 2013, and if this was the only patch
> around I would expect a line starting with 24 days at the end of May, 54 at
> the end of June... until the current 301. Or am I missing something?

The way we are computing this metric is a bit different. 

What do you think? Hmmm, it make sense what your are expecting so maybe we need to rework this metric. The computing of the metric is more difficult using your approach.

Comment 38 Quim Gil 2014-03-05 06:52:10 UTC

(In reply to Alvaro from comment #37)
> Remember that the review time is computed for all submissions using the NOW
> date. 
> 
> > 
> > That changeset was uploaded on May 7, 2013, and if this was the only patch
> > around I would expect a line starting with 24 days at the end of May, 54 at
> > the end of June... until the current 301. Or am I missing something?
> 
> The way we are computing this metric is a bit different. 
> 
> What do you think? Hmmm, it make sense what your are expecting so maybe we
> need to rework this metric. The computing of the metric is more difficult
> using your approach.

Since comment #15 I'm assuming that the graphs showing age are being counted based on this rule:

"For each measurement timespot, median of the ages of all reviews still open at
that timespot, understanding as "age" the time period since the review
was open to the measurement timespot."

Which is exactly the way I'm counting the example of Babel above. 

Isn't this the rule used already to draw "Review time for open submissions: Upload and Updated times (median)"?

Comment 39 Alvaro 2014-03-05 07:04:16 UTC

Quim, metric logic fixed following the above definition.

You have the new data and viz in korma:

http://korma.wmflabs.org/browser/gerrit_review_queue.html

Ups, I am sorry about the first version of the metric. It was showing incorrect data. Now the "Review time for pending in days" in much more real ;)

Comment 40 Alvaro 2014-03-05 07:06:23 UTC

(In reply to Alvaro from comment #39)
> Quim, metric logic fixed following the above definition.
> 
> You have the new data and viz in korma:
> 
> http://korma.wmflabs.org/browser/gerrit_review_queue.html
> 
> Ups, I am sorry about the first version of the metric. It was showing
> incorrect data. Now the "Review time for pending in days" is much more real
> ;)

still open at that timespot ... the timespot is the last day of each month.

https://github.com/VizGrimoire/VizGrimoireR/commit/a400e000663a65aa001f828375c7f0ce84074074#diff-830de9be97e5b0e8825c98e7be544f04R854

Comment 41 Quim Gil 2014-03-05 23:33:12 UTC

What remains to close this report:

(In reply to Quim Gil from comment #22)
> * the first graph at "Are Wikimedia's staff..." can be removed; it is a bit
> confusing and the second one explains better what we are looking for
 
> * in "Are Wikimedia's staff...", add "last code review median age at a given
> time"
 
> * in "Who is doing better and worse?", add a line for "last code review
> median age at a given time", without removing
> review_time_pending_days_acc_median, and sort repos by "last code review
> median age at a given time"
 
> * remember the fix to make all Y axis start with zero

"Number of pending submissions" still needs to be fixed.
 
> * also please have one graph per row (except for the list of repos, of course

> * and a small detail, currently the page as some horizontal scroll, and some
> graphs have no margin to the left, which I guess could be solved wrapping
> everything in a <div> width:95% or something.

Still happening, just shrink the window a bit to see what I mean.


We agreed to refresh this page at the beginning of every month. This will also solve the problem of the graphs showing confusing results in the current month.

We left this task as optional:

> * in "Backlog of pending reviews", add the total amount of changesets
> submitted every month (optional, we can leave this for later)

Comment 42 Quim Gil 2014-03-26 17:38:32 UTC

The graphs are now considered ready:

http://korma.wmflabs.org/browser/gerrit_review_queue.html

We are still missing better strings, but this is a task for me. I'm taking the bug. Help / patches welcome.

https://www.mediawiki.org/wiki/Community_metrics#Global_Pending_List still mentions some nice-to-have improvements:

* Add median age of pending reviews by submission date in the list of repos (now it lists median age of last patches)

* Add columns of total number of submissions in Pending backlog

* Add Organizations graph with  median age of pending reviews by submission date

If these items are still open when we close this bug, we will open new bug reports for them.

Comment 43 Quim Gil 2014-07-15 13:15:12 UTC

Finally, I have started providing better strings in the form of pull requests:

https://github.com/Bitergia/mediawiki-dashboard/pulls

In the meantime http://korma.wmflabs.org/browser/gerrit_review_queue.html has become a very useful page, providing that you understand the graphs.

Wikimedia Bugzilla is closed!

Search

Personal tools

Navigation

Links