Last modified: 2014-04-09 15:55:03 UTC
One data point used by the Enginnering Community team to measure the involvement of non-WMF code contributors is to look at how many users with +2 permissions we have in Gerrit, and which is their affiliation. Is there a way to retrieve this data automatically on a quarterly basis? Or even a way to check manually who has +2 right now?
I do not think there is a way to easily find out who has +2 on projects. --- totally unfeasible approach below --- If you absolutely have to find out, you can do by hand by going through the projects settings (and parent projects recursively), resolve the ACLs by hand, and resolve the groups with +2 permission to contributors. But that's a huge amount of work. And note that while resolving the groups, gerrit is hiding groups that you have no permission to see. So you might miss groups, if you haven't got sufficient permissions. And note that the ls-members (even with --recursive) does not expand ldap groups.
Isn't https://gerrit.wikimedia.org/r/#/admin/groups/11,members the (glibal) list of folks with +2 in Gerrit, or am I wrong?
Maybe a first approach could be to list the people that have used the +2 rights. With the actual data gathered from gerrit I think it should be too difficult. Let me try: mysql -u root acs_gerrit_mediawiki_2428 mysql> select count(distinct(changed_by)) from changes where new_value='2'; +-----------------------------+ | count(distinct(changed_by)) | +-----------------------------+ | 149 | +-----------------------------+ 149 people have used the +2 rights. We have join with upeople_companies table to search for its affiliations. mysql> select count(distinct(changes.id)) as total, changed_by, people.name, company_id, companies.name from changes, people_upeople, people, acs_cvsanaly_mediawiki_2428.upeople_companies, acs_cvsanaly_mediawiki_2428.companies where people.id = people_upeople.people_id and changes.changed_by=people_upeople.people_id and upeople_companies.upeople_id = people_upeople.upeople_id and upeople_companies.company_id = companies.id and new_value='2' group by changed_by order by total desc limit 20; +-------+------------+---------------------+------------+-----------------------+ | total | changed_by | name | company_id | name | +-------+------------+---------------------+------------+-----------------------+ | 38543 | 2 | jenkins-bot | 172 | Unknown | | 28142 | 29 | L10n-bot | 3051 | translatewiki.net | | 4765 | 15 | Reedy | 3053 | Wikimedia Foundation | | 2862 | 69 | Siebrand | 3053 | Wikimedia Foundation | | 2530 | 101 | Mark Bergsma | 3053 | Wikimedia Foundation | | 2439 | 27 | Hashar | 3053 | Wikimedia Foundation | | 2159 | 25 | Catrope | 3053 | Wikimedia Foundation | | 1869 | 61 | Ryan Lane | 3053 | Wikimedia Foundation | | 1868 | 35 | Aaron Schulz | 3053 | Wikimedia Foundation | | 1862 | 120 | Jeroen De Dauw | 3052 | Wikimedia Deutschland | | 1567 | 70 | GWicke | 3053 | Wikimedia Foundation | | 1469 | 10 | Ori.livneh | 3053 | Wikimedia Foundation | | 1448 | 156 | Tobias Gritschacher | 3052 | Wikimedia Deutschland | | 1360 | 20 | MaxSem | 3053 | Wikimedia Foundation | | 1356 | 299 | Ottomata | 3053 | Wikimedia Foundation | | 1239 | 353 | Dzahn | 3053 | Wikimedia Foundation | | 1235 | 54 | Tim Starling | 3053 | Wikimedia Foundation | | 1201 | 9 | Nikerabbit | 3053 | Wikimedia Foundation | | 1049 | 12 | Krinkle | 3053 | Wikimedia Foundation | | 1048 | 167 | awjrichards | 3053 | Wikimedia Foundation | +-------+------------+---------------------+------------+-----------------------+ With this query you have the top20 +2 reviewers with the org. If we want to exclude Wikimedia: mysql> select count(distinct(changes.id)) as total, changed_by, people.name, company_id, companies.name from changes, people_upeople, people, acs_cvsanaly_mediawiki_2428.upeople_companies, acs_cvsanaly_mediawiki_2428.companies where people.id = people_upeople.people_id and changes.changed_by=people_upeople.people_id and upeople_companies.upeople_id = people_upeople.upeople_id and upeople_companies.company_id = companies.id and new_value='2' and companies.name not like '%Wikimedia%' group by changed_by order by total desc limit 20; +-------+------------+------------------+------------+-------------------+ | total | changed_by | name | company_id | name | +-------+------------+------------------+------------+-------------------+ | 38543 | 2 | jenkins-bot | 172 | Unknown | | 28142 | 29 | L10n-bot | 3051 | translatewiki.net | | 515 | 36 | IAlex | 172 | Unknown | | 468 | 170 | Mwjames | 3054 | Independent | | 466 | 94 | Santhosh | 172 | Unknown | | 293 | 31 | Hoo man | 3054 | Independent | | 264 | 405 | Xqt | 3054 | Independent | | 233 | 7 | Matmarex | 3054 | Independent | | 199 | 83 | Raimond Spekking | 3054 | Independent | | 196 | 66 | Addshore | 3054 | Independent | | 191 | 28 | Alex Monk | 3054 | Independent | | 183 | 242 | Yaron Koren | 3055 | WikiWorks | | 178 | 171 | Pgehres | 3054 | Independent | | 148 | 26 | Parent5446 | 3054 | Independent | | 146 | 50 | Legoktm | 3054 | Independent | | 117 | 132 | Tpt | 3054 | Independent | | 109 | 293 | Merlijn van Deen | 3054 | Independent | | 108 | 91 | Petr Onderka | 172 | Unknown | | 106 | 22 | TheDJ | 3054 | Independent | | 93 | 109 | Ladsgroup | 172 | Unknown | +-------+------------+------------------+------------+-------------------+ 20 rows in set (0.41 sec) We can create querterly reports also from this core SQL query. Something like that Quim?
(In reply to comment #2) > Isn't https://gerrit.wikimedia.org/r/#/admin/groups/11,members the (glibal) > list of folks with +2 in Gerrit [...] For projects underneath mediawiki/* group #11 (mediawiki) should have +2. But they are not the only one, and they do not have +2 on all branches (see refs/heads/wmf/* section of [1]). There mediawiki only has +1 and for example wmf-deployment and ldap/ops have +2. And looking also outside of mediawiki/* ... All-Projects (the root of the hierarchy [2]) comes with +2 for “Project Owner”. “Project Owner” is (typically) different for each repository. And some projects again override this. That's only the tip of the iceberg :-/ [1] https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki,access [2] https://gerrit.wikimedia.org/r/#/admin/projects/All-Projects,access
Alvaro, I think your approach is very good. *Exercising* +2 rights is what matters about *having* +2 rights. Useful data that we will use *tomorrow* at the Enginnering Community team quarterly review, if available: (The scope is July - September 2013. All the better if we can have results of the previous three quarters in order to have a one-year view) * Total amount of users that exercised +2 rights. * Full list of users with affiliation. This will be also useful to identify more "unknown". For instance: jenkins-bot: shouldn't be counted IAlex: independent Santosh: WMF Petr Onderka: independent (GSoC student, yay!) Ladsgroup: independent We might still fall in situation where two email addresses belong to a same person, or Contributor X was independent a year ago and now works at the WMF, but exposing this data will help us fine tuning user data. Getting active +2 contributors right is a good goal per se. Setting priority accordingly. It is ok to let other tasks aside in order to get this data asap. Thank you very much for the flexibility!
Sorry for the bugmailspam.
Quim, i need to review the companies mapping because there are some probs in the data. Once I debug it, I will provide the data you need. I am sorry not to have the data available for today!
Ok Quim, finally I have found time for gathering the results. From Jul to Sep 2013: * 34 people with +2 actions not from WMF * The complete list below mysql> select count(distinct(changes.id)) as total, changed_by, people_upeople.upeople_id, people.name, company_id, companies.name from changes, people_upeople, people, acs_cvsanaly_mediawiki_2428.upeople_companies, acs_cvsanaly_mediawiki_2428.companies where people.id = people_upeople.people_id and changes.changed_by=people_upeople.people_id and upeople_companies.upeople_id = people_upeople.upeople_id and upeople_companies.company_id = companies.id and new_value='2' and changed_on > '2013-06' and changed_on < '2013-10' and companies.name <>'Wikimedia Deutschland' and companies.name<>'Wikimedia Foundation'group by changed_by order by total desc; +-------+------------+------------+------------------+------------+-------------------+ | total | changed_by | upeople_id | name | company_id | name | +-------+------------+------------+------------------+------------+-------------------+ | 16951 | 2 | 164 | jenkins-bot | 172 | Unknown | | 7972 | 29 | 171 | L10n-bot | 3051 | translatewiki.net | | 161 | 170 | 293 | Mwjames | 3054 | Independent | | 137 | 7 | 256 | Matmarex | 3054 | Independent | | 125 | 66 | 791 | Addshore | 3054 | Independent | | 124 | 405 | 151 | Xqt | 3054 | Independent | | 119 | 50 | 161 | Legoktm | 3054 | Independent | | 108 | 91 | 719 | Petr Onderka | 172 | Unknown | | 93 | 31 | 288 | Hoo man | 577 | cox | | 80 | 26 | 389 | Parent5446 | 3054 | Independent | | 78 | 109 | 169 | Ladsgroup | 172 | Unknown | | 78 | 293 | 128 | Merlijn van Deen | 3054 | Independent | | 59 | 94 | 529 | Santhosh | 172 | Unknown | | 39 | 389 | 97 | DrTrigon | 172 | Unknown | | 39 | 22 | 58 | TheDJ | 3054 | Independent | | 35 | 132 | 61 | Tpt | 3054 | Independent | | 32 | 122 | 783 | Physikerwelt | 172 | Unknown | | 31 | 83 | 341 | Raimond Spekking | 3054 | Independent | | 29 | 21 | 417 | Brian Wolff | 3054 | Independent | | 26 | 242 | 554 | Yaron Koren | 3055 | WikiWorks | | 20 | 159 | 537 | KartikMistry | 172 | Unknown | | 19 | 28 | 261 | Alex Monk | 3054 | Independent | | 14 | 232 | 747 | Foxtrott | 172 | Unknown | | 14 | 36 | 363 | IAlex | 172 | Unknown | | 10 | 365 | 196 | Stefan.petrea | 3054 | Independent | | 6 | 49 | 770 | Wikinaut | 172 | Unknown | | 3 | 330 | 866 | Christian | 1303 | wikia | | 3 | 81 | 551 | Nischayn22 | 3054 | Independent | | 2 | 424 | 1200 | Kelson | 2169 | btinternet | | 2 | 140 | 785 | Inez | 1303 | wikia | | 2 | 408 | 115 | Russell Blau | 98 | googlemail | | 1 | 52 | 351 | Liangent | 3054 | Independent | | 1 | 199 | 345 | SPQRobin | 3054 | Independent | | 1 | 235 | 98 | QChris | 3054 | Independent | +-------+------------+------------+------------------+------------+-------------------+ 34 rows in set, 2 warnings (0.63 sec) I hope it is useful for you!
(In reply to comment #8) > | 59 | 94 | 529 | Santhosh | 172 | Unknown might work for WMF if it's https://wikimediafoundation.org/wiki/User:Sthottingal > | 20 | 159 | 537 | KartikMistry | 172 | Unknown works for WMF: https://wikimediafoundation.org/wiki/User:KMistry_%28WMF%29 > | 1 | 235 | 98 | QChris | 3054 | Independent works for WMF as contractor so depends on how you interpret it; http://lists.wikimedia.org/pipermail/wikitech-l/2012-February/058294.html > | 29 | 21 | 417 | Brian Wolff | 3054 | > | 3 | 81 | 551 | Nischayn22 | 3054 | FYI, have worked for WMF in the past
Ok, thank you very much! With this I have enough for today. Enjoy your weekend! This task is also useful to resolve the Gerrit review queue key performance indicator: https://www.mediawiki.org/wiki/Community_metrics#Gerrit_review_queue I think the best approach for the unclear cases is to ask them to fill The Google Form where thy can provide details such as the affiliation. The "cox", "btinternet", and "googlemail" entries can be reassigned as "Independent". Also, Wikimedia Germany seems to be missing here, since they have +2 developers but they are not WMF.
With this SQL query you can get the information for all quarters: SELECT COUNT(DISTINCT(changes.id)) AS total, QUARTER(changed_on) as quarter, YEAR(changed_on) year, changed_by, people_upeople.upeople_id, people.name, company_id, companies.name FROM changes, people_upeople, people, acs_cvsanaly_mediawiki_2428.upeople_companies, acs_cvsanaly_mediawiki_2428.companies WHERE people.id = people_upeople.people_id AND changes.changed_by=people_upeople.people_id AND upeople_companies.upeople_id = people_upeople.upeople_id AND upeople_companies.company_id = companies.id AND new_value='2' AND companies.name <>'Wikimedia Deutschland' and companies.name<>'Wikimedia Foundation' GROUP BY QUARTER(changed_on), YEAR(changed_on), changed_by ORDER BY year, quarter, total DESC; +-------+---------+------+------------+------------+---------------------+------------+-------------------+ | total | quarter | year | changed_by | upeople_id | name | company_id | name | +-------+---------+------+------------+------------+---------------------+------------+-------------------+ | 13838 | 1 | 2013 | 29 | 171 | L10n-bot | 3051 | translatewiki.net | | 6346 | 1 | 2013 | 2 | 164 | jenkins-bot | 172 | Unknown | | 886 | 1 | 2013 | 385 | 343 | Pyoungmeister | 172 | Unknown | | 801 | 1 | 2013 | 54 | 236 | Tim Starling | 172 | Unknown | | 759 | 1 | 2013 | 156 | 505 | Tobias Gritschacher | 172 | Unknown | | 667 | 1 | 2013 | 6 | 340 | Kaldari | 172 | Unknown | .... | 1 | 3 | 2013 | 330 | 866 | Christian | 172 | Unknown | | 9011 | 4 | 2013 | 2 | 164 | jenkins-bot | 172 | Unknown | | 3226 | 4 | 2013 | 29 | 171 | L10n-bot | 3051 | translatewiki.net | | 239 | 4 | 2013 | 299 | 187 | Ottomata | 172 | Unknown | | 176 | 4 | 2013 | 405 | 151 | Xqt | 3054 | Independent | | 135 | 4 | 2013 | 20 | 92 | MaxSem | 172 | Unknown | | 123 | 4 | 2013 | 241 | 216 | Faidon Liambotis | 172 | Unknown | | 115 | 4 | 2013 | 65 | 10 | Mwalker | 172 | Unknown | | 108 | 4 | 2013 | 7 | 256 | Matmarex | 3054 | Independent | | 95 | 4 | 2013 | 3 | 89 | Jdlrobson | 172 | Unknown | ... 185 rows in set (0.48 sec) As you said it is a pretty good query to debug companies mapping!
(In reply to comment #8) > Ok Quim, finally I have found time for gathering the results. > > From Jul to Sep 2013: > > * 34 people with +2 actions not from WMF > * The complete list below > > mysql> select count(distinct(changes.id)) as total, changed_by, > people_upeople.upeople_id, people.name, company_id, companies.name from > changes, people_upeople, people, > acs_cvsanaly_mediawiki_2428.upeople_companies, > acs_cvsanaly_mediawiki_2428.companies where people.id = > people_upeople.people_id and changes.changed_by=people_upeople.people_id and > upeople_companies.upeople_id = people_upeople.upeople_id and > upeople_companies.company_id = companies.id and new_value='2' and changed_on > > > '2013-06' and changed_on < '2013-10' and companies.name <>'Wikimedia > Deutschland' and companies.name<>'Wikimedia Foundation'group by changed_by > order by total desc; Guys, this query was not correct because "changed_on > '2013-06' and changed_on < '2013-10'". This should be "changed_on > '2013-06-31' and changed_on < '2013-10-01'". It is better for quarter analysis to use: "QUARTER(changed_on) = 3 and YEAR(changed_on) = '2013'"
(In reply to comment #12) > (In reply to comment #8) > > Ok Quim, finally I have found time for gathering the results. > > > > From Jul to Sep 2013: > > > > * 34 people with +2 actions not from WMF With the correct query this number is: * 49 people with +2 actions not from WMF in 2013 Q3. I am sorry about the noise! > > * The complete list below > > > > mysql> select count(distinct(changes.id)) as total, changed_by, > > people_upeople.upeople_id, people.name, company_id, companies.name from > > changes, people_upeople, people, > > acs_cvsanaly_mediawiki_2428.upeople_companies, > > acs_cvsanaly_mediawiki_2428.companies where people.id = > > people_upeople.people_id and changes.changed_by=people_upeople.people_id and > > upeople_companies.upeople_id = people_upeople.upeople_id and > > upeople_companies.company_id = companies.id and new_value='2' and changed_on > > > > > '2013-06' and changed_on < '2013-10' and companies.name <>'Wikimedia > > Deutschland' and companies.name<>'Wikimedia Foundation'group by changed_by > > order by total desc; > > Guys, this query was not correct because "changed_on > '2013-06' and > changed_on > < '2013-10'". This should be "changed_on > '2013-06-31' and changed_on < > '2013-10-01'". > > It is better for quarter analysis to use: > > "QUARTER(changed_on) = 3 and YEAR(changed_on) = '2013'"
Note that self-merges must not be counted as +2 actions, see bug 53485 comment 35 and 39 for references.
Ok, we need to find a way to get this data in http://korma.wmflabs.org/browser/gerrit_review_queue.html What about a table like "Reviews merged"at http://korma.wmflabs.org/browser/who_contributes_code.html but counting the amount of people exercising +2 every month, by organization?
Let's change this to low until we really know what we need. It is unclear now. Ideas welcome. Maybe a way to approach this task is to think what do we want to report in the ECT quarterly reviews.