Last modified: 2013-09-27 09:37:19 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T47007, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 45007 - Update special pages more frequently to account for bad runs


Summary:	Update special pages more frequently to account for bad runs

Status:	RESOLVED DUPLICATE of bug 53227

Product:	Wikimedia
Classification:	Unclassified
Component:	General/Unknown
Version:	wmf-deployment
Hardware:	All All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2013-02-14 16:28 UTC by Malafaya
Modified:	2013-09-27 09:37 UTC
CC List:	4 users

See Also:	53227
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Malafaya 2013-02-14 16:28:49 UTC

Recently, the special pages update jobs have been having some trouble in actually finishing their work.
About 50% of the time, the jobs are terminated by some fatal error (with no pattern, judging from the reasons I've been told), either because there's a stubborn wiki whose database tables grew too big or because a bad update was put live while the jobs were still running.
To account for these bad runs, I would like to suggest running the jobs every 1, 1.5 or 2 days, instead of the current 3. If a specific wiki is needed, my case concerns pt.wiktionary.
Thank you.

Comment 1 Malafaya 2013-02-14 17:10:36 UTC

MaxSem said the job ran yesterday (13th), whereas according to the 3-day schedule it should only have run today (14th; the last run was on the 11th at 00:00 UTC).
It finished with an error:

/home/wikipedia/logs/norotate/updateSpecialPages.log:

Fatal error: Call to a member function getText() on a non-object in /usr/local/apache/common-local/php-1.21wmf9/extensions/MobileFrontend/includes/MobileContext.php on line 273

Log file modified: 2013-02-13 05:17:24.374378000 +0000

Comment 2 db [inactive,noenotif] 2013-02-14 20:27:17 UTC

The job should automatically report this in the server admin log, so that people can see the errors and maybe fix them. LocalisationUpdate reports success; maybe this job can do that as well.

Comment 3 Malafaya 2013-02-15 18:19:52 UTC

Fixing the problem usually happens shortly after the error is thrown. But that won't fix the special pages update, which will have to wait at least until the next run (3 more days, if the next run happens to be successful).

Comment 4 Daniel Zahn 2013-08-20 00:32:12 UTC

< Danny_B> update of special pages is off now?
< Danny_B> or the periods have been prolonged?
..
< mutante>  monthday => "*/3"
< mutante> hour => 5
..
< mutante> ./manifests/misc/maintenance.pp
< mutante> class misc::maintenance::update_special_pages

< Danny_B> so it doesn't run obviously
< Danny_B> last update: 13. 8. 2013, 14:15

< mutante> command => "flock -n /var/lock/update-special-pages /usr/local/bin/update-special-pages > /home/wikipedia/logs/norotate/updateSpecialPages.log 2>&1",
< mutante> uhm, yeah, i don't know about the commandline

< Reedy> Never happy
< Danny_B> anyway, in case it would be helpful to track down the issue - cs wikis lack the update

site.pp

 < mutante> 1178     # Wrong log file location
 < mutante> 1179     class { misc::maintenance::update_special_pages: enabled => true }
 < mutante> 2762     # Broken cron jobs moved back to hume:
< mutante> 2765     class { misc::maintenance::update_special_pages: enabled => false }



< mutante> so, the enabled one is on hume in site.pp
< mutante> not on the new host terbium

< mutante> !createbug

< mutante> cat: /home/wikipedia/logs/norotate/updateSpecialPages.log: No such file or directory
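
For reference, the `monthday => "*/3"` setting quoted above is a cron step value. In standard cron it selects every third value of the day-of-month range 1-31, starting from 1, and the count restarts each month. A minimal Python sketch of that expansion follows; the helper name `step_days` is made up for illustration and this is not how cron itself is implemented:

```python
import calendar


def step_days(year: int, month: int, step: int = 3) -> list[int]:
    """Days of the given month matched by a cron day-of-month field '*/step'."""
    days_in_month = calendar.monthrange(year, month)[1]
    # '*/step' expands over the full range 1-31; days that do not exist
    # in a shorter month simply never fire.
    return [day for day in range(1, 32, step) if day <= days_in_month]


if __name__ == "__main__":
    # February 2013, the month discussed in Comment 1.
    print(step_days(2013, 2))
```

Under this reading the 13th is a scheduled day and the 11th and 14th are not, which would be consistent with the log timestamp in Comment 1. It also means the spacing is not a strict three days across month boundaries (for example, a run on the 31st is followed by one on the 1st).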

Comment 5 Sam Reed (reedy) 2013-08-23 14:10:12 UTC

reedy@hume:/home/wikipedia/log/norotate$ flock -n /var/lock/update-special-pages /usr/local/bin/update-special-pages > /home/wikipedia/logs/norotate/updateSpecialPages.log 2>&1
reedy@hume:/home/wikipedia/log/norotate$

Comment 6 Gerrit Notification Bot 2013-08-23 14:12:31 UTC

Change 80560 had a related patch set uploaded by Reedy:
Maintenance scripts should be run as Apache

https://gerrit.wikimedia.org/r/80560

Comment 7 Gerrit Notification Bot 2013-08-23 14:13:43 UTC

Change 80560 abandoned by Reedy:
Maintenance scripts should be run as Apache

Reason:
user => "apache",

https://gerrit.wikimedia.org/r/80560

Comment 8 Sam Reed (reedy) 2013-08-23 14:22:02 UTC

I'm not sure how just running it more frequently would make it any more likely to complete successfully. You're just going to make more fails more frequently.

Ideally, if it dies doing one wiki, this shouldn't stop execution on every other subsequent wiki (which has been an issue in the past)
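
The isolation idea in the previous paragraph could be sketched roughly as follows. This is only an illustration under assumptions: the real `update-special-pages` wrapper is not shown in this report, so the per-wiki command is passed in as a function, and `run_for_each_wiki`, `demo_command` and the wiki names are hypothetical.

```python
import subprocess
import sys
from typing import Callable, Dict, Iterable, List, Optional


def run_for_each_wiki(
    wikis: Iterable[str],
    build_command: Callable[[str], List[str]],
) -> Dict[str, Optional[int]]:
    """Run one command per wiki; a failure is recorded, not propagated."""
    results: Dict[str, Optional[int]] = {}
    for wiki in wikis:
        try:
            # check=False: a non-zero exit status must not abort the loop.
            completed = subprocess.run(build_command(wiki), check=False)
            results[wiki] = completed.returncode
        except OSError as exc:
            # The command could not even be started (e.g. missing binary).
            print(f"{wiki}: could not start: {exc}", file=sys.stderr)
            results[wiki] = None
    return results


def demo_command(wiki: str) -> List[str]:
    # Stand-in for the real per-wiki update command (an assumption):
    # exits non-zero for one made-up wiki to simulate a fatal error.
    code = 1 if wiki == "brokenwiki" else 0
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


if __name__ == "__main__":
    outcome = run_for_each_wiki(["aawiki", "brokenwiki", "ptwiktionary"], demo_command)
    failed = [wiki for wiki, rc in outcome.items() if rc != 0]
    print("failed:", failed)
```

Because each wiki runs in its own child process, a fatal error in one of them only affects that wiki's entry in the result, and the remaining wikis are still processed.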

Comment 9 Sam Reed (reedy) 2013-08-23 14:27:56 UTC

Monitor the current manual run via http://noc.wikimedia.org/~reedy/updateSpecialPages.log

Comment 10 Malafaya 2013-08-30 10:35:52 UTC

(In reply to comment #8)
> I'm not sure how just running it more frequently would make it any more likely
> to complete successfully. You're just going to make more fails more frequently.

More fails are likely. But with a run only every 3 days, it is quite common to go 6 or 9 days without a special page update. Right now it's been 6 days and the Wanted Categories page hasn't been updated at pt.wiktionary (I think the last update was your manual run). And I'm betting it won't be today either. So, another 3 days will have to pass before another go.
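
A rough way to put numbers on this trade-off, assuming each scheduled run fails independently with the "about 50%" rate from the description (an assumption; Comment 11 suggests failures may be consistent rather than occasional, in which case this model does not apply). The function names are made up for illustration:

```python
def expected_gap_days(interval_days: float, failure_rate: float) -> float:
    """Mean time between successful runs if failures are independent."""
    # The number of runs until the next success is geometric with
    # success probability (1 - failure_rate), so its mean is 1 / (1 - failure_rate).
    return interval_days / (1.0 - failure_rate)


def prob_gap_at_least(runs: int, failure_rate: float) -> float:
    """Probability that the next success is at least `runs` scheduled runs away."""
    # This requires the next (runs - 1) scheduled runs to fail.
    return failure_rate ** (runs - 1)


if __name__ == "__main__":
    for interval in (1, 1.5, 2, 3):
        print(interval, expected_gap_days(interval, 0.5))
```

With a 3-day interval and a 50% failure rate this gives an average of 6 days between successful updates, and a gap of at least 9 days (three runs) in a quarter of cases; shortening the interval shrinks the gaps proportionally without changing the failure rate itself.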

Comment 11 Nemo 2013-09-27 07:36:37 UTC

(In reply to comment #8)
> I'm not sure how just running it more frequently would make it any more likely
> to complete successfully. You're just going to make more fails more frequently.

So, closing this as a duplicate of bug 53227: let's keep one bug per issue, not one per proposed way to address it. Bug 53227 also shows that the diagnosis behind this proposal is probably wrong, because failures seem consistent rather than occasional, when there are failures.

> Ideally, if it dies doing one wiki, this shouldn't stop execution on every
> other subsequent wiki (which has been an issue in the past)

This is maybe worth a separate bug? If the scripts can't be improved easily, it should be rather easy to make the cronjobs more atomic.

*** This bug has been marked as a duplicate of bug 53227 ***

Comment 12 Malafaya 2013-09-27 09:33:54 UTC

If the runs become more reliable than in the past then surely this doesn't make much sense anymore. Let's go with bug 53227 for now.

Comment 13 Malafaya 2013-09-27 09:37:19 UTC

P.S.:

> Bug 53227 also shows that the diagnosis
> behind this proposal is probably wrong...

When I submitted this bug in February, bug 53227 was not yet an issue. The constant (bad) live updates were the problem then (see Comment #1).
