Last modified: 2009-05-26 06:03:26 UTC
It is possible for api.php to return the "wikimedia foundation error" HTML page. This appears to be a transient error. The API should return an XML error message (or better, follow the format= parameter) instead of the HTML error page. Having to handle an additional, undocumented output format makes it much more difficult to write reliable code for the API.
What request were you making? Also, please attach the HTML of that page. The API should already handle all internal exceptions, so this will help track it down.
I was fetching user contributions. The URL was of the form en.wikipedia.org/w/api.php?action=query&list=usercontribs&uclimit=1000&ucstart=...&ucuser=... Of course I had to be logged in to do this. I didn't keep a copy of the HTML; if I see the error again, I'll attach it.
This sounds like an error coming from Squid, which would be above the control of MediaWiki. There *ought* to be appropriate response codes being set, however.
Finally I have some details on this. The HTTP error code returned is 502, as Rob Church said. The error page includes the following:

Request: POST http://en.wikipedia.org/w/api.php, from 66.230.200.132 via sq22.wikimedia.org (squid/2.6.STABLE12) to 10.0.5.3 (10.0.5.3)
Error: ERR_ZERO_SIZE_OBJECT, errno [No Error] at Tue, 21 Aug 2007 01:58:06 GMT

To get this, I put a lot of load on the usercontribs query. I can get the same error (HTTP 502)

Request: POST http://en.wikipedia.org/w/api.php, from 66.230.200.137 via sq22.wikimedia.org (squid/2.6.STABLE12) to 10.0.5.3 (10.0.5.3)
Error: ERR_ZERO_SIZE_OBJECT, errno [No Error] at Tue, 21 Aug 2007 02:06:11 GMT

by using the action=allusers query to list sysops on enwiki.

The ideal behavior (for me) would be an XML error message with HTTP code 404 instead of HTTP code 502. This would agree with the API docs "For now we decided to include error information inside the same structured output as normal result (option #2)." and "All output will be available in a structured tree format such as XML, JSON, YAML, WDDX, or PHP serialized."

I appreciate the 502 status return is likely the intended current behavior of the squids, but perhaps they could return a generic XML error page if they notice the query was for the API.
This is very weird, why is Squid even attempting to cache api.php results? Shouldn't the API implementation tell Squid its output is non-cacheable?

(In reply to comment #4)
> The ideal behavior (for me) would be an XML error message with HTTP code 404
> instead of HTTP code 502.
> This would agree with the API docs "For now we decided to include error
> information inside the same structured output as normal result (option #2)."
> and "All output will be available in a structured tree format such as XML,
> JSON, YAML, WDDX, or PHP serialized."
>
> I appreciate the 502 status return is likely the intended current behavior of
> the squids, but perhaps they could return a generic XML error page if they
> notice the query was for the API.
>

The XML error page thing is expected behavior for all API modules, and works for most of them. I personally disagree that a 404 error code should be specified, however: that would mean api.php doesn't exist. In my opinion, we shouldn't mess with HTTP response codes and return any errors in the XML output.
@Comment #4: As Rob said, the error is not coming from api.php, or from MediaWiki at all. Your request is going through a Squid proxy, which is trying to forward it. If the forward fails, you get an error page from the Squid, with an appropriate error code.

For errors on the HTTP level, you cannot expect XML error messages. In fact, you can't expect anything beyond the correct HTTP error code. So, always check the error code - and ideally also always check the Content-Type reported by the server. For some error codes, the response *might* be generated by api.php, and *might* be in XML format. But you cannot rely on that: HTTP errors can happen before any PHP code is ever executed. This is true for all HTTP-based APIs, including all REST and AJAX stuff, SOAP, etc.

@Comment #5: Squid is not trying to cache the response from api.php, but to forward the request to api.php. All HTTP requests to Wikimedia sites go through the Squids. After all, they don't only do caching, but also load balancing. And your browser couldn't know in advance which pages to request from Squid and which to request from somewhere else anyway: it simply gets the Squid's IP address when resolving the host name, and consequently sends all requests to the Squid. Which is the point of Squids, really.

Closing as invalid, since this behavior is imposed by using HTTP as the transport, and is compliant with the HTTP spec.
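
For client authors hitting this, the advice above (check the status code first, then the Content-Type, and only then parse the body) can be sketched roughly as follows. This is a minimal illustration, not part of MediaWiki: the names ApiResult and interpret_response are hypothetical, it assumes the client requested format=json, and it assumes API-level errors arrive under a top-level "error" key with "code" and "info" fields. Treating 502/503/504 as retryable is likewise a client-side policy assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiResult:
    """Hypothetical container for the outcome of one api.php request."""
    ok: bool
    data: Optional[dict] = None
    reason: str = ""
    retryable: bool = False


def interpret_response(status: int, content_type: str, body: str) -> ApiResult:
    """Classify an HTTP response before trusting its body as API output."""
    # Gateway-level failures (such as the 502 from Squid) may never have
    # reached api.php, so the body is likely an HTML error page.
    if status in (502, 503, 504):
        return ApiResult(False, reason=f"gateway error {status}", retryable=True)
    if status != 200:
        return ApiResult(False, reason=f"unexpected HTTP status {status}")

    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type != "application/json":
        return ApiResult(False, reason=f"unexpected Content-Type {media_type!r}")

    try:
        data = json.loads(body)
    except ValueError as exc:
        return ApiResult(False, reason=f"malformed JSON: {exc}")
    if not isinstance(data, dict):
        return ApiResult(False, reason="JSON body is not an object")

    # Assumed shape of an API-level error: {"error": {"code": ..., "info": ...}}
    if "error" in data:
        err = data["error"]
        code = err.get("code", "unknown") if isinstance(err, dict) else "unknown"
        return ApiResult(False, data=data, reason=f"API error: {code}")

    return ApiResult(True, data=data)
```

A caller would pass in the status, Content-Type header, and body from whatever HTTP library it uses, retry with a delay when retryable is set, and only read data when ok is true.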
*** Bug 13762 has been marked as a duplicate of this bug. ***
*** Bug 18920 has been marked as a duplicate of this bug. ***