Last modified: 2013-01-11 05:53:15 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T45341, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 43341 - Re-enable Squid updates in HTMLCacheUpdate
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Version: wmf-deployment
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Aaron Schulz
Depends on:
Blocks:
Reported: 2012-12-22 04:15 UTC by MZMcBride
Modified: 2013-01-11 05:53 UTC

Description MZMcBride 2012-12-22 04:15:38 UTC
In r96834, Tim disabled Squid updates temporarily. According to <https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=includes/job/jobs/HTMLCacheUpdateJob.php;h=55c9f991d9e3b731d463d057b841f30f5e9ab8c0;hb=refs/heads/wmf/1.21wmf6>, Squid updates are still disabled. I believe this should be addressed.
Comment 1 Tim Starling 2012-12-30 22:38:40 UTC
Your wording suggests that you don't know what that code does. The change is for a specific kind of Squid update.

My change disabled Squid updates for pages that use a given template or image when there are more than 500 pages to purge. Re-enabling it would have caused the problem to recur: specifically, Apache overload due to the Squid cache of a substantial fraction (say 10%) of the pages on the English Wikipedia being purged simultaneously.

Aaron's recent changes in this area caused updates of small sets of pages to also be disabled. They also made updates of templates with millions of invocations more robust, making it even more likely to cause disaster if the Squid update were simply re-enabled.
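
The threshold behaviour described in this comment can be sketched roughly as follows. This is a minimal Python illustration, not the actual MediaWiki (PHP) code: the function name `should_purge_squid` and the constant `MAX_PURGE_PAGES` are hypothetical, and only the 500-page figure comes from the comment itself.

```python
# Hypothetical sketch of the threshold described above; not MediaWiki code.
MAX_PURGE_PAGES = 500  # figure taken from the comment; the name is an assumption


def should_purge_squid(pages_to_purge: int, limit: int = MAX_PURGE_PAGES) -> bool:
    """Return True if Squid purges would still be sent for this many pages.

    Purging is skipped only when there are more than `limit` pages to purge.
    """
    return pages_to_purge <= limit
```

Under this reading, a template used on 500 pages would still trigger purges, while one used on 501 pages would not.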
Comment 2 MZMcBride 2012-12-30 22:54:15 UTC
(In reply to comment #1)
> Your wording suggests that you don't know what that code does. The change is
> for a specific kind of Squid update.

Quite right. The symptom I'm experiencing is that I'll occasionally reach pages such as <http://en.wikipedia.org/wiki/Wikipedia:SCOTUSWORK> that are outdated, while the target of the redirect (in this case, <http://en.wikipedia.org/wiki/Wikipedia:WikiProject_U.S._Supreme_Court_cases/Reports>) is up-to-date. This only happens when I'm logged out and appears to only happen with redirects. When I'm logged in, the page content served (via a redirect or via the target of a redirect) is always up-to-date.

To me, this suggests that Squid cache is not updating properly. The bugs listed as "see also"s to this bug (bug 29552 and bug 38879) seem to suggest Squid cache may be to blame as well. It was a comment at bug 38879 (specifically bug 38879 comment 11) that pointed to this live hack as a possible culprit, so I filed a separate bug for further investigation. If you believe this bug is simply a duplicate of bug 29552 or bug 38879 or some other bug, feel free to mark it as such.

Sorry my initial bug report wasn't clearer. I was taking a shot in the dark in an attempt to get the problem I'm experiencing resolved.
Comment 3 MZMcBride 2012-12-30 23:08:38 UTC
Just putting this here so I don't lose it forever:

me> TimStarling: I'm able to reproduce that bug fairly easily with a noticeboard, BTW. http://en.wikipedia.org/wiki/Wikipedia:BN reads "This page was last modified on 29 December 2012 at 23:40." while http://en.wikipedia.org/wiki/Wikipedia:Bureaucrats%27_noticeboard reads "This page was last modified on 30 December 2012 at 07:32."
me> TimStarling: Both requests are logged out.

Tim> yes, updates to redirects also go via that class
Tim> obviously there should be a limit to the number of pages that are purged simultaneously, instead of just disabling everything
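
One way to read the suggestion of limiting how many pages are purged at once is to split the list of URLs into bounded batches. The sketch below is only an illustration under that assumption: `purge_batches` and `batch_size` are hypothetical names, how the batches would be spaced out (delays, a job queue) is left open, and it is not necessarily the approach that was later merged.

```python
from typing import Iterator, List, Sequence


def purge_batches(urls: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `batch_size` URLs to purge."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        # Each slice holds at most batch_size URLs; the last one may be shorter.
        yield list(urls[start:start + batch_size])
```

A caller would then send one batch at a time instead of purging every affected page simultaneously.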
Comment 4 MZMcBride 2012-12-31 00:39:50 UTC
(In reply to comment #2)
> It was a comment at bug 38879 (specifically bug 38879 comment 11) that
> pointed to this live hack as a possible culprit, so I filed a separate
> bug for further investigation.

Also bug 29552 comment 15 (though it's the same author).

I wrote a quick script that prints the "This page was last modified on " and "Served by " text for specified input pages. I tested with the following pairs:

----
base_url = 'http://en.wikipedia.org/wiki/'

pairs = [['Wikipedia:AN/I', 'Wikipedia:Administrators%27_noticeboard/Incidents'],
         ['Wikipedia:ANI', 'Wikipedia:Administrators%27_noticeboard/Incidents'],
         ['Wikipedia:BN', 'Wikipedia:Bureaucrats%27_noticeboard'],
         ['Wikipedia:BNB', 'Wikipedia:Bureaucrats%27_noticeboard']]
----

The results were:

----
http://en.wikipedia.org/wiki/Wikipedia:AN/I
18 December 2012 at 22:20.<br 
mw22 in 0.196 secs. --
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents
31 December 2012 at 00:32.<br 
srv210 in 0.177 secs. 
http://en.wikipedia.org/wiki/Wikipedia:ANI
30 December 2012 at 19:32.<br 
srv196 in 0.175 secs. 
http://en.wikipedia.org/wiki/Wikipedia:Administrators%27_noticeboard/Incidents
31 December 2012 at 00:32.<br 
srv210 in 0.177 secs. 
http://en.wikipedia.org/wiki/Wikipedia:BN
29 December 2012 at 06:37.<br 
mw52 in 1.415 secs. --
http://en.wikipedia.org/wiki/Wikipedia:Bureaucrats%27_noticeboard
31 December 2012 at 00:32.<br 
mw37 in 0.202 secs. --
http://en.wikipedia.org/wiki/Wikipedia:BNB
31 December 2012 at 00:32.<br 
srv229 in 0.153 secs. 
http://en.wikipedia.org/wiki/Wikipedia:Bureaucrats%27_noticeboard
31 December 2012 at 00:32.<br 
mw37 in 0.202 secs. --
----

Full script and output here: <http://p.defau.lt/?BW9Xkb3KfYjiUhFQ88Zegw>.
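
The full script is only available at the link above, so the following is a hedged reconstruction of the idea rather than the original code. It assumes the page HTML has already been fetched while logged out, and that it contains the "This page was last modified on" footer text and a "Served by ... in ... secs" marker as shown in the output; the names `extract_markers` and `looks_stale` are hypothetical.

```python
import re
from typing import Optional, Tuple

# Patterns assume the footer and "Served by" wording quoted in this comment.
LAST_MODIFIED_RE = re.compile(
    r"This page was last modified on (.+? at \d{1,2}:\d{2})"
)
SERVED_BY_RE = re.compile(r"Served by (\S+) in [\d.]+ secs")


def extract_markers(html: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (last-modified text, serving host); None for a missing marker."""
    modified = LAST_MODIFIED_RE.search(html)
    served = SERVED_BY_RE.search(html)
    return (
        modified.group(1) if modified else None,
        served.group(1) if served else None,
    )


def looks_stale(redirect_html: str, target_html: str) -> bool:
    """True if both pages report a timestamp and the two timestamps differ."""
    redirect_time, _ = extract_markers(redirect_html)
    target_time, _ = extract_markers(target_html)
    return None not in (redirect_time, target_time) and redirect_time != target_time
```

Comparing the redirect's timestamp with the target's is what exposes the mismatch reported for pairs such as Wikipedia:BN and Wikipedia:Bureaucrats%27_noticeboard.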
Comment 5 Aaron Schulz 2013-01-03 00:57:27 UTC
The wmf patch that is applied to master to make wmf branches causes this breakage. It's removed in https://gerrit.wikimedia.org/r/#/c/42055/1.

A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.
Comment 6 Bawolff (Brian Wolff) 2013-01-03 11:24:00 UTC
(In reply to comment #5)
> The wmf patch that is applied to master to make wmf branches causes this
> breakage. It's removed in https://gerrit.wikimedia.org/r/#/c/42055/1.
> 
> A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.

Yay! Out of curiosity, any idea roughly how large $wgMaxBacklinksInvalidate is going to be set to on Wikimedia wikis?
Comment 7 MZMcBride 2013-01-03 23:53:40 UTC
(In reply to comment #5)
> A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.

I don't understand this patch. From what I can tell, it seems to completely skip HTML cache invalidation if there are a lot of backlinks, but that wouldn't make any sense.
Comment 8 Aaron Schulz 2013-01-11 05:53:15 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > The wmf patch that is applied to master to make wmf branches causes this
> > breakage. It's removed in https://gerrit.wikimedia.org/r/#/c/42055/1.
> > 
> > A proper core patch is in https://gerrit.wikimedia.org/r/#/c/42061/.
> 
> Yay! Out of curiosity, any idea roughly how large $wgMaxBacklinksInvalidate
> is going to be set to on Wikimedia wikis?

200000.

This has now been merged and deployed (with the old patch removed).
