Last modified: 2012-06-11 20:31:08 UTC
Seen on job-runner03.

Main loop:
  /bin/bash /usr/local/apache/common/php/extensions/WikimediaMaintenance/jobs-loop.sh
A child:
  \_ php MWScript.php runJobs.php --wiki=The MediaWiki script file "./php-trunk/maintenance/nextJobDB.php" does not exist. --
$ ll /usr/local/apache/
ls: /usr/local/apache/: Input/output error

Sounds bad ;-D
Looking at the process information for jobs-loop.sh, I found out that the `cwd` pointed to a deleted path:

$ ls -l /proc/1234/cwd
lrwxrwxrwx 1 apache apache 0 2012-05-25 08:16 cwd -> /usr/local/apache/common-local/multiversion (deleted)

Although the directory is actually there :-(
Restarting the loop (/etc/init.d/mw-job-runner) seems to fix the link:

# ls -l /proc/6973/cwd
lrwxrwxrwx 1 apache apache 0 2012-05-25 08:24 /proc/6973/cwd -> /usr/local/apache/common-local/multiversion/
#

/usr/local/apache is an NFS mount:

deployment-nfs-memc:/mnt/export/apache on /usr/local/apache type nfs (rw,bg,soft,tcp,timeo=14,intr,nfsvers=3,addr=10.4.0.58)

I have no idea what could have made it unlinked. Maybe the NFS server moved the directory somehow, or whenever NFS has a connection issue the job runner servers consider the file permanently inaccessible. I am marking 36646 - "get rid of NFS" as a dependency.
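In case it helps to check whether other processes on the box are in the same state, here is a minimal sketch, assuming the Linux /proc layout where a removed working directory shows up with a " (deleted)" suffix on the cwd link. The function name `find_deleted_cwd` and its optional `proc_root` argument are made up for illustration; they are not part of jobs-loop.sh or any existing tooling.

```shell
#!/bin/bash
# Print "PID<TAB>target" for every process whose cwd link is reported as deleted.
# proc_root defaults to /proc; it is only a parameter so the function can be
# tried against a fake directory tree (an assumption of this sketch).
find_deleted_cwd() {
    local proc_root="${1:-/proc}"
    local link target pid
    for link in "$proc_root"/[0-9]*/cwd; do
        # readlink fails for processes we are not allowed to inspect
        # (or when the glob matched nothing); skip those entries.
        target=$(readlink "$link" 2>/dev/null) || continue
        case "$target" in
            *" (deleted)")
                pid=${link%/cwd}   # strip the trailing /cwd
                pid=${pid##*/}     # keep only the PID component
                printf '%s\t%s\n' "$pid" "$target"
                ;;
        esac
    done
}
```

Calling `find_deleted_cwd` as root should list the affected PIDs, if any. Note that a directory whose real name ends in " (deleted)" would be reported as well, so treat the output as a hint rather than proof.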
Are you sure you haven't deleted and recreated the directory since the process was started? If yes & it happens again, don't restart the process and notify me, I'd like to have a look.
Lowering priority; I have not seen it occur again. Most probably someone renamed or altered the path. I guess we can close the bug if it does not occur again over the next week or so.
This was some transient issue that I have not seen reproduced so far. So I am closing this bug and will reopen it if it occurs again.