Last modified: 2014-05-27 21:57:03 UTC
We want to cache gzipped responses in Varnish. For this to work all clients need to send an Accept-Encoding header that includes 'gzip'.
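For illustration, a minimal client-side sketch of what "sending an Accept-Encoding header that includes gzip" involves, together with the matching decode step. It is written in Python rather than MediaWiki's PHP, and the helper names (`build_request`, `decode_body`, `fetch`) are hypothetical; nothing here is taken from the actual VisualEditor or core code.

```python
import gzip
import urllib.request


def build_request(url):
    """Return a GET request that advertises gzip support to the server."""
    return urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})


def decode_body(body, content_encoding):
    """Decompress the body only if the response is labelled as gzipped."""
    if content_encoding and "gzip" in content_encoding.lower():
        return gzip.decompress(body)
    return body


def fetch(url):
    """Fetch a URL, asking for gzip and decoding the reply if needed."""
    with urllib.request.urlopen(build_request(url)) as resp:
        return decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```

A client that sets the header but skips the decode step would hand compressed bytes to its caller, so the two halves need to go together.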
This is important for performance, so setting priority to reflect that. Currently the missing header causes Varnish to store uncompressed HTML, which reduces the hit rate significantly.
Change 101454 had a related patch set uploaded by Jforrester: Accept-Encoding: gzip for read requests from Parsoid https://gerrit.wikimedia.org/r/101454
Varnish actually transparently decompresses stored compressed content and strips the supplied Accept-Encoding header in its backend request. Since mid-November we always return compressed content, and Varnish stores it in compressed form for a much better hit rate. So the impact is not as high as I thought:
1) it causes some extra CPU load in Varnish for decompression
2) it results in a larger network transfer on the internal network
Both have a very small performance impact. The Parsoid Varnish is basically idle CPU-wise and the internal network is fast. Lowering the priority to reflect this.
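The two costs described in this comment can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: `ToyCache` and its methods are hypothetical names, it is not Varnish code or VCL, and the header check ignores details such as `q=0` weights.

```python
import gzip


class ToyCache:
    """Toy model of a cache that keeps every object in gzipped form."""

    def __init__(self):
        self.store = {}
        self.decompressions = 0  # rough stand-in for extra CPU work

    def put(self, url, gzipped_body):
        """Store the compressed body returned by the backend."""
        self.store[url] = gzipped_body

    def get(self, url, accept_encoding=""):
        """Return (body, content_encoding) for a client request."""
        stored = self.store[url]
        # Simplified check: a real parser would also honour q-values.
        if "gzip" in accept_encoding.lower():
            return stored, "gzip"
        # Client did not ask for gzip: decompress on the fly.
        self.decompressions += 1
        return gzip.decompress(stored), None


if __name__ == "__main__":
    page = b"<p>some repetitive HTML</p>" * 1000
    cache = ToyCache()
    cache.put("/page", gzip.compress(page))
    small, _ = cache.get("/page", "gzip")
    large, _ = cache.get("/page")
    print(len(small), len(large), cache.decompressions)
```

In this model a request without the header still hits the cache, but it costs one decompression and a larger transfer, which matches points 1) and 2).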
As far as I can tell from reading the code, this should currently only be necessary if the PHP cURL extension is not installed. (And it would also be necessary to decode the response you get back.) And really it should never be necessary; see bug 61507.
I should also mention that on my local setup, with no Varnish in the middle and cURL installed, MW requests a gzipped copy from Parsoid, receives it, and decodes it properly. If I force it to use the PHP engine instead of cURL, it requests a non-gzipped version and receives it fine. This should be fixed in core rather than VE having to change anything.
Appears intractable. :-(
Change 101454 abandoned by Jforrester: [WIP] Accept-Encoding: gzip for read requests from Parsoid Reason: Abandoning for now. https://gerrit.wikimedia.org/r/101454