Last modified: 2013-11-12 15:17:50 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57948, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55948 - git replication to antimony/gallium/github broken
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Git/Gerrit
Version: wmf-deployment
Hardware: All All
Importance: Immediate blocker
Target Milestone: ---
Assigned To: Chad H.
Reported: 2013-10-20 21:14 UTC by Kunal Mehta (Legoktm)
Modified: 2013-11-12 15:17 UTC
CC List: 8 users

Description Kunal Mehta (Legoktm) 2013-10-20 21:14:51 UTC
https://gerrit.wikimedia.org/r/#/c/90739/ was merged, but isn't showing up in https://git.wikimedia.org/summary/mediawiki%2Fextensions%2FMassMessage.git nor https://github.com/wikimedia/mediawiki-extensions-MassMessage/commits/master

Also, https://git.wikimedia.org/ says "there has been no activity today" (false), and the active repositories sidebar is empty
Comment 1 Chad H. 2013-10-21 00:53:53 UTC
Hmm, no problems replicating to lanthanum, just everything else :\

We're getting rejected host key errors in the logs. All the broken sites have valid fingerprints in known_hosts, and ssh'ing manually to the boxes works fine.
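As an illustration only (not part of the original comment), a check of this kind can be sketched in Python. The sketch assumes plain-text `known_hosts` entries; hashed entries are skipped, so a host reported as missing may still be present in hashed form. The function names and the path and host names in the usage note are hypothetical.

```python
from pathlib import Path


def hosts_in_known_hosts(text):
    """Collect plain-text host names from known_hosts content.

    Hashed entries (starting with "|") and marker lines (starting with "@")
    are skipped, so this is only a rough check.
    """
    hosts = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "@", "|")):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not a "hosts keytype key" entry
        # The first field is a comma-separated list of names and addresses.
        hosts.update(fields[0].split(","))
    return hosts


def missing_hosts(known_hosts_path, wanted):
    """Return the hosts from `wanted` that have no plain-text entry in the file."""
    path = Path(known_hosts_path)
    text = path.read_text() if path.is_file() else ""
    present = hosts_in_known_hosts(text)
    return [host for host in wanted if host not in present]
```

Calling `missing_hosts("/var/lib/gerrit2/.ssh/known_hosts", ["gallium.wikimedia.org", "antimony.wikimedia.org"])` would list the targets without a plain-text entry. Note that it only inspects the file it is pointed at, which may not be the file the running service actually reads.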
Comment 2 Antoine "hashar" Musso (WMF) 2013-10-21 11:19:30 UTC
A side effect in Jenkins is that we use the local replication for extension jobs.  It is used to install mediawiki/core@master as well as potential extension dependencies.  This can lead to some crazy build failures.
Comment 3 Antoine "hashar" Musso (WMF) 2013-10-21 11:38:16 UTC
In the auth log on gallium, I see rejected connections from ytterbium.wikimedia.org [208.80.154.80] since Oct 19 20:55 UTC.

The last one working:


Oct 19 20:46:27 
Set /proc/self/oom_score_adj to 0
Connection from 208.80.154.80 port 44711
Found matching RSA key: ///
Postponed publickey for gerritslave from 208.80.154.80 port 44711 ssh2 [preauth]
Found matching RSA key: ///
Accepted publickey for gerritslave from 208.80.154.80 port 44711 ssh2
pam_unix(sshd:session): session opened for user gerritslave by (uid=0)
User child is on pid 30532
pam_unix(sshd:session): session closed for user gerritslave

The first one failing:

Oct 19 20:56:30 
Connection from 208.80.154.80 port 45384
Received disconnect from 208.80.154.80: 3: com.jcraft.jsch.JSchException: reject HostKey: gallium.wikimedia.org [preauth]

Rest of the auth log is filled with such errors.
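As an illustration only (not part of the original comment), the failing entries can be pulled out of an auth log with a short Python sketch. It assumes the log lines look like the `Received disconnect` excerpt above; the function name and the regular expression are hypothetical and would need adjusting for other sshd log formats.

```python
import re

# Matches sshd lines where the client (here JSch) disconnected after
# rejecting the server's host key.
REJECT = re.compile(
    r"Received disconnect from (?P<ip>\S+): \d+: .*reject HostKey: (?P<host>\S+)"
)


def host_key_rejections(lines):
    """Return (client_ip, rejected_host) pairs for host key rejections."""
    hits = []
    for line in lines:
        match = REJECT.search(line)
        if match:
            hits.append((match.group("ip"), match.group("host")))
    return hits
```

Feeding it the lines of the auth log (for example via `open(path)`, with `path` being wherever the log lives on the host) gives one pair per rejected connection, which makes it easy to see which clients and target host names are affected.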
Comment 4 Antoine "hashar" Musso (WMF) 2013-10-21 11:41:54 UTC
October 19th:

20:54 ^d: gerrit: installed 2.7-rc2-507-g1e7090b, service back up

Seems the upgrade did not go well and broke something.  Maybe replication is run by a different username that does not have gallium.wikimedia.org added to known_hosts.
Comment 5 Antoine "hashar" Musso (WMF) 2013-10-21 12:11:39 UTC
The same issue appears on lanthanum.eqiad.wmnet and might be happening on antimony.wikimedia.org as well.
Comment 6 Chad H. 2013-10-21 14:24:02 UTC
(In reply to comment #4)
> October 19th:
> 
> 20:54 ^d: gerrit: installed 2.7-rc2-507-g1e7090b, service back up
> 
> Seems the upgrade did not go well and broke something.  Maybe replication is
> run by a different username that does not have gallium.wikimedia.org added to
> known_hosts.

The upgrade didn't touch replication; it only added a minor change to the output format of `gerrit query`.

gerrit has always read /var/lib/gerrit2/.ssh/known_hosts, which hasn't changed since the move to ytterbium.

(In reply to comment #5)
> The same issue appears on lanthanum.eqiad.wmnet and might be happening on
> antimony.wikimedia.org as well.

lanthanum is replicating fine; it's antimony/gallium/github that are funky, like I mentioned above.
Comment 7 Alexandros Kosiaris 2013-10-21 14:29:18 UTC
This turned out to be an installation issue. For some reason the gerrit user's homedir was at /home/gerrit2 instead of /var/lib/gerrit2. For now I just copied the files and restarted gerrit2, but I will fix it cleanly by moving the homedir to /var/lib/gerrit2 and deleting /home/gerrit2.
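As an illustration only (not part of the original comment), a home directory mismatch like this can be detected with a short Python sketch. It assumes a Unix host and that the service looks for `.ssh/known_hosts` under the account's registered home directory. The function names and the account name `gerrit2` in the usage note are assumptions, not confirmed by this report.

```python
import pwd
from pathlib import Path


def known_hosts_path(home):
    """Path of the known_hosts file under a given home directory."""
    return Path(home) / ".ssh" / "known_hosts"


def check_home(user, expected_home, lookup=pwd.getpwnam):
    """Compare the account's registered home directory with the expected one.

    `lookup` defaults to the system passwd database; it is a parameter only
    so the check can be exercised without a real account.
    """
    actual_home = lookup(user).pw_dir
    return {
        "actual_home": actual_home,
        "matches": Path(actual_home) == Path(expected_home),
        "known_hosts_exists": known_hosts_path(actual_home).is_file(),
    }
```

For example, `check_home("gerrit2", "/var/lib/gerrit2")` would report the registered home directory, whether it matches, and whether a `known_hosts` file exists there. A `KeyError` is raised if the account does not exist.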
Comment 8 Chad H. 2013-10-21 14:41:44 UTC
Bah, this is my fault. I'll clean it up.
