Last modified: 2013-04-21 21:07:42 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T49479, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 47479 - GlusterFS deployment-prep-project has too many errors -> beta cluster is down
Status: RESOLVED FIXED
Product: Wikimedia Labs
Classification: Unclassified
Component: Infrastructure
Version: unspecified
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Ryan Lane
Depends on:
Blocks:
Reported: 2013-04-21 19:39 UTC by Chris McMahon
Modified: 2013-04-21 21:07 UTC (History)
CC List: 7 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Some errors logs from /var/log/glusterfs/data-project.log (5.31 KB, text/plain)
2013-04-21 20:44 UTC, Antoine "hashar" Musso (WMF)
error accessing one file (1.12 KB, text/plain)
2013-04-21 21:01 UTC, Antoine "hashar" Musso (WMF)

Description Chris McMahon 2013-04-21 19:39:39 UTC
ERROR

The requested URL could not be retrieved

While trying to retrieve the URL: http://en.wikipedia.beta.wmflabs.org/

The following error was encountered:

Unable to forward this request at this time.
This request could not be forwarded to the origin server or to any parent caches. The most likely cause for this error is that:

The cache administrator does not allow this cache to make direct connections to origin servers, and
All configured parent caches are currently unreachable.
Your cache administrator is benapetr@gmail.com. 
Generated Sun, 21 Apr 2013 19:35:03 GMT by squid001.beta.wmflabs.org (squid/2.7.STABLE9)
Comment 1 Antoine "hashar" Musso (WMF) 2013-04-21 20:33:37 UTC
Can't SSH to either the apache32 or apache33 instance, although both servers respond to ping. Port 80 does not answer.
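
The symptom above (host answers ping, but SSH and HTTP do not) can be checked with a plain TCP connect. The following is a minimal sketch, not the procedure actually used here; the function name `port_answers` is hypothetical, and the host names in the usage comment are assumed to resolve from wherever the check runs.

```python
import socket


def port_answers(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    A refused connection, a timeout, or a name resolution failure all
    count as "does not answer" (they are all OSError subclasses).
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage; the short instance names are an assumption:
# for host in ("deployment-apache32", "deployment-apache33"):
#     for port in (22, 80):
#         print(host, port, port_answers(host, port))
```

A host that pings but fails this check on both ports 22 and 80 is up at the network level while its services are hung or not running, which matches what was seen on the Apache instances.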
Comment 2 Antoine "hashar" Musso (WMF) 2013-04-21 20:42:54 UTC
I have rebooted apache32. Restarting apache I got:

# /etc/init.d/apache2 start
/etc/init.d/apache2: 55: [: nice: unexpected operator
 * Starting web server apache2
Warning: DocumentRoot [/usr/local/apache/common/docroot/wikispecies.org] does not exist
Warning: DocumentRoot [/usr/local/apache/common/docroot/config] does not exist
Warning: DocumentRoot [/usr/local/apache/common/docroot/ee-prototype] does not exist
(5)Input/output error: apache2: could not open error log file /home/wikipedia/logs/apache-error.log.
Unable to open logs
Action 'start' failed.

So this is caused by the Gluster volume having an issue of some sort.


GlusterFS deployment-prep-project has too many errors.
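
The "(5)Input/output error" in the Apache output is errno 5 (EIO on Linux), returned when opening the log file on the Gluster mount. A minimal sketch for probing a single path in the same way is below; the helper name `probe_append` is hypothetical, and unlike Apache it does not create the file if it is missing, to avoid side effects.

```python
import errno
import os


def probe_append(path: str) -> str:
    """Try to open an existing file for appending and report the result.

    Returns "ok" on success, otherwise a string such as
    "EIO (5): Input/output error" built from the OSError raised.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND)
    except OSError as exc:
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        return f"{name} ({exc.errno}): {exc.strerror}"
    os.close(fd)
    return "ok"


# Hypothetical usage against the path from the Apache error message:
# print(probe_append("/home/wikipedia/logs/apache-error.log"))
```

If the probe reports EIO for files under the logs directory but "ok" elsewhere on the same mount, that points at the individual files on the volume rather than at the mount as a whole.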
Comment 3 Antoine "hashar" Musso (WMF) 2013-04-21 20:44:34 UTC
Created attachment 12155
Some errors logs from /var/log/glusterfs/data-project.log

Some Gluster errors on deployment-apache32 instance showing that the deployment-prep-project volume has some issues.  The files under /logs/ have conflicting entries.
Comment 4 Antoine "hashar" Musso (WMF) 2013-04-21 21:00:39 UTC
Once the /logs files are fixed, the apache2 service will have to be restarted on deployment-apache32 and deployment-apache33.  Puppet might take care of it though :)
Comment 5 Antoine "hashar" Musso (WMF) 2013-04-21 21:01:13 UTC
Created attachment 12156
error accessing one file
Comment 6 Andrew Bogott 2013-04-21 21:05:40 UTC
Suffering a lack of curiosity, I just deleted the affected logfiles on labstore1 and labstore2.
Comment 7 Antoine "hashar" Musso (WMF) 2013-04-21 21:07:42 UTC
I have restarted apache2 on both instances. Solved :-]   Will have to reboot the other instances tomorrow.  At least beta is up again, thank you Andrew!!!
