Last modified: 2013-07-26 05:17:51 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T52347, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 50347 - mw-update-l10n failure does not abort scap
mw-update-l10n failure does not abort scap
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
wmf-deployment
All All
: High major (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-06-28 07:11 UTC by Ori Livneh
Modified: 2013-07-26 05:17 UTC (History)
8 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Ori Livneh 2013-06-28 07:11:26 UTC
mw-update-l10n wraps the l10n cache update in parentheses, causing it to run in a sub-shell.

The first problem is that 'set -e' doesn't work predictably in sub-shells. Its precise behavior is documented here: <http://fvue.nl/wiki/Bash:_Error_handling#Caveat_1:_.60Exit_on_error.27_ignoring_subshell_exit_status>.

The second problem is that the exit status of the sub-shell is ||'d with a call to 'echo' to emit an error message. Because echo typically returns success, the exit status of mw-update-l10n is always success, and so l10n failures do not reach scap.

A good solution is for the parent shell to explicitly check the exit status of any important command it invokes.

This pattern works well:

some_command || {
  st="$?"
  echo "some_command failed"
  exit $st
}

It can be neatly encapsulated in a bash function, as shown in <http://mywiki.wooledge.org/BashFAQ/101>.
Comment 1 Ori Livneh 2013-06-28 07:40:49 UTC
A further consequence of the misplaced confidence in 'set -e' is that scap will sync an empty l10n file if mergeMessageFileList.php fails for whatever reason.

The problem is in lines 42-46 of mw-update-l10n:

  mwTempDest=$(sudo -u apache mktemp)
  sudo -u apache $BINDIR/mwscript mergeMessageFileList.php --wiki="$mwDbName" \
    --list-file=$MW_COMMON_SOURCE/wmf-config/extension-list $QUIET --output="$mwTempDest"
  sudo -u apache chmod 664 "$mwTempDest"
  cp "$mwTempDest" $MW_COMMON_SOURCE/wmf-config/ExtensionMessages-"$mwVerNum".php

An empty file is created by 'mktemp'. Because we're executing in a subshell, if mergeMessageFileList.php fails, the script continues to execute, and control eventually flows to 'cp' on line 46, which copies the temp file to ExtensionMessages-X.XXwmfX.php. The empty message file is then synced.
Comment 2 Aude 2013-06-28 11:57:35 UTC
in case of WikibaseDataModel, we can try explicitly including it:

https://gerrit.wikimedia.org/r/#/c/71004/

I don't know if this really helps but seems to work better when i try mergeMessageFileList.php

if there is anything else we should do differently, please let us know!
Comment 3 Gerrit Notification Bot 2013-06-28 14:08:47 UTC
Change 71056 had a related patch set uploaded by Aude:
(bug 50347) have mergeMessageFileList check if extension files in file-list are available

https://gerrit.wikimedia.org/r/71056
Comment 4 Aude 2013-06-28 14:12:14 UTC
among other issues, I think we should not rely on include_once and have the mergeMessageFileList check for and omit any missing extension files.

see https://gerrit.wikimedia.org/r/71056
Comment 5 Tim Starling 2013-07-01 01:24:04 UTC
(In reply to comment #4)
> among other issues, I think we should not rely on include_once and have the
> mergeMessageFileList check for and omit any missing extension files.
> 
> see https://gerrit.wikimedia.org/r/71056

That is fine, but note that it wouldn't have avoided downtime in the recent case. We have to abort when there is a PHP fatal error, which causes include_once() to return false, so we have to check the return value from include_once. But WikibaseDataModel does an explicit "return" from the file scope, which causes include_once() to return null. That's why mergeMessageFileList.php continued to fail even after WikibaseDataModel was checked out.
Comment 6 Tim Starling 2013-07-01 04:02:36 UTC
The bug is fixed as described, I confirmed it on the live site. The problem with the WikibaseDataModel extension registration can be split out to another report if necessary.
Comment 7 Tim Starling 2013-07-01 04:03:10 UTC
It was fixed in https://gerrit.wikimedia.org/r/#/c/71313/
Comment 8 Gerrit Notification Bot 2013-07-26 05:17:51 UTC
Change 71056 abandoned by Tim Starling:
(bug 50347) have mergeMessageFileList check if extension files in file-list are available

Reason:
Per author.

https://gerrit.wikimedia.org/r/71056

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links