Last modified: 2014-06-21 19:57:55 UTC
See http://lists.wikimedia.org/pipermail/mediawiki-l/2013-April/040970.html I remember that WMF had a similar problem, is there a solution apart from dropping the old jobs?
*** Bug 46971 has been marked as a duplicate of this bug. ***
Quoting from the duped bug:
(Quoting bug 46971 comment #0)
> After I upgraded from 1.20.3, in my database (MySQL) some
> pre-upgrade jobs have job_random set to 0 and do not seem to be
> picked up - not even when I try to run them by providing their type
> as an option: php runJobs.php --type=replaceText.
Do we seriously have no way to fix this? Should we just tell people not to upgrade if they have something in the job queue?
Aaron: Could you comment on this, please?
Aaron: Could you comment on this please, as the 1.21 tarball release is imminent?
(In reply to comment #3)
> Do we seriously have no way to fix this? Should we just tell people not to
> upgrade if they have something in the job queue?
I'd like to include something in the installation information telling users to clear their job queue before upgrading, but I don't know a lot about this. What would be appropriate? I would like to get this fixed ASAP for a point release.
(In reply to comment #6)
> (In reply to comment #3)
> > Do we seriously have no way to fix this? Should we just tell people not to
> > upgrade if they have something in the job queue?
>
> I'd like to include something in the installation information telling users to
> clear their job queue before upgrading, but I don't know a lot about this.
> What would be appropriate?
>
> I would like to get this fixed ASAP for a point release.
Running maintenance/runJobs.php should clear the job queue. But depending on how long upgrading takes, one or two jobs might still be lost. Maybe put the wiki into read-only mode once the job queue is cleared?
Could maintenance/runJobs.php be run with the wiki in read-only mode?
(In reply to comment #8)
> Could maintenance/runJobs.php be run with the wiki in read-only mode?
Nope, I meant making it read-only after clearing the job queue. Still not a complete solution, but I can't think of anything else.
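The ordering suggested in the comments above (clear the queue first, then make the wiki read-only for the upgrade) might look like the following LocalSettings.php fragment. This is only a sketch: $wgReadOnly is a standard MediaWiki setting, but the message text is an invented example, and as noted above this does not fully rule out a job being queued in between. Whether update.php itself runs while $wgReadOnly is set is not covered in this thread, so the line may need to be removed before running it.

```php
// LocalSettings.php fragment (sketch, not a tested upgrade procedure).
// Step 1, before adding the line below: run
//   php maintenance/runJobs.php
// until the job queue is empty.
// Step 2: block further edits so no new jobs are queued while the
// wiki is being upgraded. The message text is only an example.
$wgReadOnly = 'Upgrade in progress; editing is temporarily disabled.';
// Step 3: remove the line above once the upgrade has finished.
```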
(In reply to comment #0)
> See http://lists.wikimedia.org/pipermail/mediawiki-l/2013-April/040970.html
>
> I remember that WMF had a similar problem, is there a solution apart from
> dropping the old jobs?
Did we? The only problem is that they have a harder time getting picked if there are always a bunch of other new jobs. In master and REL1_21 I've tried adding a bunch of jobs and setting the token to 0 for all of them, and runJobs.php works just fine. They weren't lost for me.
(In reply to comment #10)
> In master and REL1_21 I've tried adding
> a bunch of jobs and setting the token to 0 for all of them, and runJobs.php
> works just fine. They weren't lost for me.
I mean job_random of course, not the token.
(In reply to comment #11)
> I mean job_random of course, not the token.
Ah. :) Well, isn't setting job_random exactly what the user in comment 0 didn't do and should have done?
(In reply to comment #12)
> (In reply to comment #11)
> > I mean job_random of course, not the token.
>
> Ah. :) Well, isn't setting job_random exactly what the user in comment 0
> didn't do and should have done?
The complaint was that the 0-valued ones didn't work, so I set all mine to that value and they still worked.
hexmode said this wasn't deemed worth fixing for 1.21 release. I don't know the reasons, but better a partial update than nothing.
(In reply to comment #14)
> hexmode said this wasn't deemed worth fixing for 1.21 release. I don't know
> the reasons, but better a partial update than nothing.
What I meant is that this isn't going to stop 1.21.0 from being released. It is still a valid bug that should be fixed at some point. If we can get it fixed in a 1.21 point release that would be great. I don't know enough about the problem to fix it, though.
WikiApiary still had hundreds of lingering jobs since October, till they dropped them from the DB yesterday because it was impossible to run them. We're still waiting for a general solution to the migration problems. http://lists.thingelstad.com/pipermail/wikiapiary-l/2014-February/000104.html http://lists.thingelstad.com/pipermail/wikiapiary-l/attachments/20140202/fd838fb7/attachment-0002.png http://lists.thingelstad.com/pipermail/wikiapiary-l/attachments/20140202/fd838fb7/attachment-0003.png
I had a similar thing happen. We were running 1.16.3 (yeah!) and upgraded to 1.23. Made a copy of the 1.16.3 database, pointed new 1.23wmf11 installation to new database. Ran update.php, everything is hunky-dory. Notice the next day that job queue is backed up. A few old jobs existed from the day of the upgrade (in 1.16.3). Tried clearing job_token and a few would run. Tried clearing out old pre 1.23 jobs, still not running. showJobs.php spits out 0, while api query and db shows jobs in the queue.
(In reply to comment #17)
> I had a similar thing happen. We were running 1.16.3 (yeah!) and upgraded to
> 1.23. Made a copy of the 1.16.3 database, pointed new 1.23wmf11 installation to
> new database. Ran update.php, everything is hunky-dory. Notice the next day
> that job queue is backed up. A few old jobs existed from the day of the upgrade
> (in 1.16.3). Tried clearing job_token and a few would run. Tried clearing out
> old pre 1.23 jobs, still not running. showJobs.php spits out 0, while api query
> and db shows jobs in the queue.
What types of jobs? What do some of the rows look like?
Here's the job table from one of our wikis (total separate installations, but configured identically) http://pastebin.com/i4Qatgpa
Note, this is after I removed some of the older jobs in an attempt to 'kick start' the queue.
What does <<php showJobs.php --group>> show? What about <<php showJobs.php --list>>? Does <<php runJobs.php --type refreshLinks>> actually run anything?
Nothing appears to run. http://i.imgur.com/eypighU.jpg
I should note that certain actions on the site, such as modifying a template or running refreshLinks.php, appear not only to add new jobs to the queue (as they should); afterwards I can also run "runJobs.php" and a number of jobs will process. I don't see a pattern or commonality in what is run, however.
Another note (and someone tell me to shut up if this isn't proper etiquette) it appears that running php refreshLinks.php will queue up a number of jobs. Running runJobs.php afterward will kick off many jobs (in a queue of 2000, 1500 or so) but does not complete all jobs as a result of refreshLinks.php or any of the other jobs queued.
What is $wgJobTypeConf set to? All the jobs you posted had at least 1 attempt. They won't run again until the claim TTL is reached. I don't know what you set that to. By default, jobs that fail are never retried and get deleted after a week. You can try using: $wgJobTypeConf['default']['claimTTL'] = 3600; // 1 hour ...this will let the jobs be retried (after 1 hour of failure). You can also set: $wgDebugLogGroups['runJobs'] = "<some path>" ...this will log all jobs run, and may show some failures (fatal errors will not show here though).
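Put together, the two settings suggested in the comment above could go in LocalSettings.php roughly as follows. This is a sketch only: the values are the ones given in the comment, and the log path is left as the same placeholder rather than a real location.

```php
// LocalSettings.php fragment (sketch based on the comment above).
// Let claimed-but-failed jobs be retried once the claim is an hour old.
$wgJobTypeConf['default']['claimTTL'] = 3600; // seconds (1 hour)
// Log every job that runJobs.php runs. Replace the placeholder with a
// writable file path. Fatal errors will not show up in this log.
$wgDebugLogGroups['runJobs'] = "<some path>";
```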
Aaron, can you explain a little more about the claim TTL? I don't recall setting that anywhere. I'm trying to pinpoint the cause as much as I can without touching my prod environment. We didn't have this issue in test (berating me for not having more sophisticated QA is justified!) and in order to make changes to prod I have many hoops I must jump through now. One interesting note is that trying to specify --memory-limit when running runJobs.php throws an error. I'm beginning to think this might be related to available RAM. (Server has 2 GB, php.ini has 128 MB.)
php runJobs.php --memory-limit 1024
PHP Fatal error: Allowed memory size of 262144 bytes exhausted (tried to allocate 122880 bytes) in /var/www/html/w/includes/AutoLoader.php on line 290
If the only jobs that linger have job_attempts set to something other than zero, then this is just a problem of failed jobs. You'd want to set wgDebugLogGroups as above to possibly get more insights on why jobs are failing sometimes. Jobs can fail for any number of reasons, mostly specific to the code the job classes run. That in itself doesn't indicate any problem with the job queue itself.
After changing $wgPhpCli to be blank my job queue now runs and clears out jobs without issue. $wgPhpCli = ""; As discovered in this thread: https://www.mediawiki.org/w/index.php?title=Project:Support_desk&offset=20140227154556&lqt_mustshow=40130#Mediawiki_1.22.2B_Causes_two_of_my_servers_to_hang_indefinitely._37734
Removing target milestone that was in the past. If you want this in a specific release, have a good reason AND you are willing to find resources to fix this bug, feel free to change it to something appropriate.