Last modified: 2013-10-21 15:05:47 UTC
We unconditionally respect the X-Forwarded-For header that gets fed into kraken's machinery, regardless of whether the client IP is a trusted one. This distorts our reports/graphs. Instead, we should only respect the X-Forwarded-For header for the client IPs listed in $wgSquidServersNoPurge in wmf-config/squid.php of operations/mediawiki-config.
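A minimal sketch of the proposed behaviour, in Python for illustration only (it is not kraken's actual code). The names TRUSTED_PROXIES, is_trusted and effective_client_ip are hypothetical, and the networks are documentation-range placeholders standing in for the contents of $wgSquidServersNoPurge. Walking the header from right to left through further trusted hops is an assumption on top of what this bug asks for; the core point is only that XFF is ignored when the connecting IP is not trusted.

```python
import ipaddress

# Hypothetical stand-in for the proxy list in $wgSquidServersNoPurge.
# These are documentation-range networks, not real Wikimedia proxies.
TRUSTED_PROXIES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_trusted(ip):
    """Return True if ip parses as an address inside a trusted network."""
    try:
        addr = ipaddress.ip_address(ip.strip())
    except ValueError:
        return False
    return any(addr in net for net in TRUSTED_PROXIES)


def effective_client_ip(remote_addr, xff_header):
    """Pick the IP to attribute a request to.

    The X-Forwarded-For header is only consulted when the connecting
    address (remote_addr) is a trusted proxy; otherwise it is ignored.
    """
    if not xff_header or not is_trusted(remote_addr):
        return remote_addr

    hops = [h.strip() for h in xff_header.split(",") if h.strip()]
    client = remote_addr
    # Assumption: walk right to left, stepping through trusted proxies
    # and stopping at the first untrusted (or unparsable) entry.
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: keep the last address we accepted
        client = hop
        if not is_trusted(hop):
            break
    return client
```

With this sketch, a request arriving from 198.51.100.5 (not in the list) is attributed to 198.51.100.5 whatever its XFF header claims, while a request arriving from 192.0.2.1 with "X-Forwarded-For: 203.0.113.7" is attributed to 203.0.113.7.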
Prioritization and scheduling of this bug is tracked on Mingle card https://mingle.corp.wikimedia.org/projects/analytics/cards/1191
Although Ops only seem to trust $wgSquidServersNoPurge in wmf-config/squid.php of operations/mediawiki-config (private email with Faidon, and afterwards analytics-internal), the Wikipedia Zero team seems to also trust the X-Forwarded-For header from Opera proxies (private emails that led up to: https://raw.github.com/wikimedia/metrics/5fb67552555c32e4cd4b08b6c4d4ec264b07351f/pageviews/zero/pageview_zero.png ).
We currently trust XFF from all SSH and OPERA IPs listed at http://meta.wikimedia.org/wiki/Zero:-OPERA
SSH is not yet handled by Zero, since all partners use DPI for whitelisting, effectively ignoring HTTPS traffic.
(In reply to comment #3)
> We currently trust XFF from all SSH [...]

I do not know SSH in this context. What does it stand for?
(I don't understand what SSH means here either.) squid.php is what MediaWiki considers trusted as an XFF source (e.g. what would appear on IP edits). Ops doesn't have such whitelists -- apart from the very unusual & special Zero detection, we don't care about the values of XFF, so far. I think Analytics should move in a direction that makes sense for you from an analytics perspective (e.g. you might even be able to tell us what other large proxies exist out there, purely by analyzing the request stream :) ), and we might find a way to converge such info in the future.
Note that MediaWiki also trusts some other proxies, see https://www.mediawiki.org/wiki/Extension:TrustedXFF