Last modified: 2011-03-09 21:14:51 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T4856, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 2856 - Squid HTTP_X_FORWARDED_FOR header not respected
Squid HTTP_X_FORWARDED_FOR header not respected
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
User login and signup (Other open bugs)
1.5.x
All All
: Low major (vote)
: ---
Assigned To: Nobody - You can work on this!
: patch, patch-need-review
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-07-14 13:34 UTC by David A. Desrosiers
Modified: 2011-03-09 21:14 UTC (History)
0 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Squid HTTP_X_FORWARDED_FOR header patch (576 bytes, patch)
2005-07-14 13:38 UTC, David A. Desrosiers
Details

Description David A. Desrosiers 2005-07-14 13:34:40 UTC
I run Apache behind Squid accellerators which sets a header of
HTTP_X_FORWARDED_FOR to represent the remote client IP address (instead of
Apache's REMOTE_ADDR header). MediaWiki's test for this is incorrect (fix
below). I tried using $wgUseSquid = true; in LocalSettings.php which did not
solve the problem for me. 

In the file includes/ProxyTools.php there is a test for the header REMOTE_ADDR,
which will always be true, even when running behind Squid. In my case, the IP
reported for all clients is 127.0.0.1, which isn't useful when trying to track
down abuse of the wiki from outsiders. 

I updated includes/ProxyTools.php to test for HTTP_X_FORWARDED_FOR first, then
REMOTE_ADDR. If HTTP_X_FORWARDED_FOR is found, it is used. If it isn't found,
REMOTE_ADDR is tested and used instead. The current test never reaches the
second clause, and this fixes that condition. 

Below is my fixed stanza, in full. I've tested this and it works perfectly as
expected: 

        /* collect the originating ips */
        # Client connecting to this webserver
        if ( isset( $_SERVER['HTTP_X_FORWARDED_FOR'] ) ) {
                $ipchain = array( $_SERVER['HTTP_X_FORWARDED_FOR'] );

        } else if ( isset( $_SERVER['REMOTE_ADDR'] ) ) {
                $ipchain = array( $_SERVER['REMOTE_ADDR'] );

        } else {
                # Running on CLI?
                $ipchain = array( '127.0.0.1' );
        }
        $ip = $ipchain[0];
Comment 1 David A. Desrosiers 2005-07-14 13:38:27 UTC
Created attachment 712 [details]
Squid HTTP_X_FORWARDED_FOR header patch
Comment 2 Brion Vibber 2005-07-14 20:17:32 UTC
Can you confirm that you've set up the proxy settings for MediaWiki correctly?

We run the entire Wikimedia site behind squid proxies, and have no problems with this.
Comment 3 David A. Desrosiers 2005-07-15 13:00:58 UTC
LocalSettings.php has one line that seems to be related to a proxy, and that is
$wgProxyKey, which doesn't have any comments associated with it. A quick grep of
the sources doesn't provide much info on its purpose either. 

What proxy settings would I need to check? The whole proxy configuration is 100%
transparent on Apache and Squid, except for the addition of this header. Squid
runs on port 80, Apache runs on port 80. There are no ProxyPass directives in
Apache to point to Squid and vice versa. 

Also, if this required specific Proxy handling, this would be the only
application out of over 200 we run here that needs to know its running behind a
proxy. Every application we run doesn't seem to care, except in the case where
they need to know the REMOTE_ADDR of the incoming user. In this case, they need
to look for that information in HTTP_X_FORWARDED_FOR, not REMOTE_ADDR. 

The reason for filing this bug is obvious, if you look at the logic flow. You're
always checking REMOTE_ADDR first, which will ALWAYS exist. Its just that in the
case of running behind Squid, that header's value will contain 127.0.0.1 (or
whatever the non-public, internal IP of your squid proxy runs on). The proper
checking order should look for HTTP_X_FORWARDED_FOR first then REMOTE_ADDR. If
you're not running behind Squid or another proxy, HTTP_X_FORWARDED_FOR will not
exist, and REMOTE_ADDR can be trusted. If HTTP_X_FORWARDED_FOR does exist, then
its value should be used. Using REMOTE_ADDR in every case is flat-out wrong. 

Another possible solution is mod_rpaf -> http://stderr.net/apache/rpaf/

But I don't want to run more modules than I have to, and certainly not for 1
single application that does not behave like the other 200 or so applications. 
Comment 4 Brion Vibber 2005-07-15 18:33:27 UTC
See DefaultSettings.php for all configurable settings.

LocalSettings.php contains only whatever overrides you've put into it.
Comment 5 alex 2005-08-30 12:30:36 UTC
This bug also affects Cisco caches in use by at least one large ISP in
Australia. I found myself unable to edit for 24 hours due to a block being
placed on the transparent proxy server the isp uses. Here is their explanation:
--------------------------------------------------------------------------
IP-Based Authentication

Customers will often request that we bypass a particular website based on
the fact that it uses IP-based authentication for hosts. If the website
code has not been built upon web standards, they may only be checking for
the address of the remote requestor, rather than that supplied by the
X-Forwarded-For address.

Details on how to retrieve the X-Forwarded-For address are included here:

PHP: http://www.php.net/getenv

ASP: Request.ServerVariables("HTTP_X-Forwarded-For")

Example: http://www.lagado.com/proxy-test

The Cisco cache engines used by Internode will pass an X-Forwarded-For 
variable to the server containing the original requestor IP address.

When the cache engines intercept a request, they will replace the 
Client-IP field with their own IP address and store and supply the 
original requestor's IP address in the X-Forwarded-For field.

X-Forwarded-For is an unofficial standard implemented originally by
Squid (http://www.squid-cache.org/) and adopted by many other 
caching technologies since that time, including Cisco, who are the
vendor of the Cache engines we use on the Internode network.

You should also know that because you are using IP-Based authentication
for your website, you will potentially encounter issues outside of our
network, where customer requests coming from other ISPs also using 
cache engines may also be supplying the cache engine's address in
in the Client-IP field.

We prefer not to implement bypasses where a problem can be worked 
around by other means, as the more we add, the higher the long-term
load on the cache engines becomes.  It is also notable that any cache
exemption we implement is not considered permanent, and may be removed
without notice at any time.

If at all possible, it would be better in the long-term to look at 
implementing code to detect the IP address from either the 
X-Forwarded-For address in addition to the Client-IP address field.
This should resolve any future issues you encounter with transparently
cached requests from most other networks.

Examples of using the X-Forwarded-For header can be shown below:

PHP:  http://www.php.net/getenv
ASP:  Using "Request.ServerVariables("HTTP_X-Forwarded-For")"

A website giving a fair bit of information about which variables
are supplied on request is at http://www.lagado.com/proxy-test.
As you can see, this website is retrieving all of the appropriate
fields, showing that the code of any website should be able to
detect the IP address of the original requestor, whether or not it is
transparently cached.

At this stage, we have not implemented the bypass to give you a
chance to review this information.  Please get back to us if
there is a particular reason in which this option can not be
implemented, and we will then review the request.

I hope this helps and shows that the problem does lie at Wikipedias end. 
----------------------------------------------------------------------
Comment 6 Brion Vibber 2005-08-30 22:13:38 UTC
We do respect X-Forwarded-For on known trusted servers (in particular
our own). Note that as client-provided data it CANNOT BE TRUSTED IN
GENERAL. An attacker can trivially create false X-Forwarded-For headers
with any arbitrary contents, which if blindly accepted would obscure
their real address (making them impossible to block) or falsely blame
some other address, perhaps with the intention of getting administrators
to block some other user.
Comment 7 alex 2005-08-31 02:31:38 UTC
Whilst this works for some, I don't believe this bug should be closed as I believe 
there are large numbers of users who sit behind transparent proxies at large ISPs. 
The only solution for them if they share the proxy with a WikiVandal is to change 
ISPs which is unsatisfactory. 

I agree with the point the the X-Forwarded-For header cannot be trusted and is 
easily faked. A more sophisticated solution would be to record the X-Forwarded-For 
header AS WELL AS the REMOTE_ADDR from the server. Thus the unsophisticated 
vandals can be dealt with as well as allowing normal users to continue. This will 
also make it easier to track down vandals with the co-operation of ISPs as you can 
identify that it came from that ISP (via REMOTE_ADDR) and you have the vandals 
session IP (or that of an open proxy they might be using) via HTTP_X-Forwarded-For.
This would require changes to the code for blocking IPs to allow for
1) Block a REMOTE_ADDR (currently implemented)
2) Block a HTTP_X-Forwarded-For address via a REMOTE_ADDR (the patch implements 
this, not currently implemented, suits unsophisticated and probably most common 
vandals). Will not affect normal users behind the same proxy
3) Don't trust HTTP_X-Forwarded-For via a particular REMOTE_ADDR (not currently 
implemented, requires an additional table of blocked proxies, rampant vandal, but 
you would record and have the HTTP_X-Forwarded-For header which will be useful for 
the ISP in identifying the vandal and terminate them via their TOS, until which no 
users at that ISP can edit).

I realise that this is now not a simple patch as it impacts other areas of code 
and the operation of blocking functions if implemented this way. It would require 
the creation of an additional table of blocked proxies and possibly trusted 
proxies as well to cater for the case mentioned by Brion above. I do think though 
that there are large numbers of editors who would benefit from a change such as 
this.

At a minimum we really need some data as to the extent of use of transparent 
proxies by editors, readers (potential editors) and vandals by collecting on one 
or two servers the use of the HTTP_X-Forwarded-For header. There are also variants 
on HTTP_X-Forwarded-For that would need to be catered for as well.

Can we collect some data and then make a decision?
Comment 8 Brion Vibber 2005-09-02 05:12:37 UTC
Closing this bug as WORKSFORME since the case of the original reporter is what we do, 
constantly, all day every day and it works fine.

If you wish to request some different treatment of external proxies please open a new 
enhancement bug.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links