Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. Logging in is not possible, and apart from displaying bug reports and their history, links may be broken. See T22814, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 20814 - Enable $wgCrossSiteAJAXdomains for Wikimedia sites
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All
OS: All
Importance: Low enhancement with 5 votes
Assigned To: Nobody - You can work on this!
Duplicates: 30802
Depends on: 30881
Blocks: 20298
Reported: 2009-09-25 20:24 UTC by Bawolff (Brian Wolff)
Modified: 2012-10-02 01:20 UTC
CC: 19 users

Description Bawolff (Brian Wolff) 2009-09-25 20:24:05 UTC
Enable $wgCrossSiteAJAXdomains for Wikimedia sites. It would be useful to be able to access the API across Wikimedia domains through JS gadgets.

Setting it to something like

$wgCrossSiteAJAXdomains = array(
	'/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
	'/http:\/\/[a-z\-]{2,}\.wikinews\.org/',
	'/http:\/\/[a-z\-]{2,}\.wiktionary\.org/',
	'/http:\/\/[a-z\-]{2,}\.wikibooks\.org/',
	'/http:\/\/[a-z\-]{2,}\.wikiversity\.org/',
	'/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
	'/http:\/\/[a-z\-]{2,}\.wikisource\.org/',
	'/http:\/\/[a-z\-]{2,}\.wikiquote\.org/',
	'/http:\/\/(?!upload)[a-z\-]{2,}\.wikimedia\.org/'
);

Note: you might want to check the last one. I assume allowing cross-site access to upload.wikimedia.org would be a Bad Thing. I don't know if allowing access to everything on *.wikimedia.org except upload.wikimedia.org is OK.

Cheers,
Bawolff
Comment 1 Roan Kattouw 2009-09-25 20:28:54 UTC
We want to be more restrictive for *.wikimedia.org, because there are a bunch of untrusted subdomains in there. We should explicitly list the ones we own.
Comment 2 Bawolff (Brian Wolff) 2009-09-26 05:46:03 UTC
Note: I found another way to do what I wanted without this enabled ( http://en.wiktionary.org/w/api.php?action=parse&prop=text&page=Wikimedia&format=xml&xslt=MediaWiki:extractFirst.xsl ), so I don't really need it. But it would probably still be useful to have it enabled.
Comment 5 Krinkle 2010-07-01 02:12:16 UTC
Also don't forget the secure subdomain. The better scripts don't link to the domain directly but use wgServer/wgScript.

Such as https://secure.wikimedia.org/wikipedia/commons/wiki/Main_Page
Comment 6 Tim Starling 2011-05-10 00:33:14 UTC
This would break Squid caching. I don't see a "Vary: Origin" header, so whichever subdomain requests a given cacheable object first will have an Access-Control-Allow-Origin header sent back with the origin subdomain in it. The header will be cached, so subsequent requests from different domains will be denied by the client.

Vary: Origin would be a disaster for caching anyway, since there are hundreds of internal domains, and external domains could potentially send this header as well.

As for the code in api.php: the Origin header is a whitespace-separated list of origins. Running an unanchored case-sensitive regex against the whole string is not appropriate. Section 5.1 of the July 2010 CORS spec gives the correct algorithm:

http://www.w3.org/TR/2010/WD-cors-20100727/#resource-requests

One possible way to support CORS would be to require that the origin be specified in a URL parameter. If the URL parameter matches the Origin header, then the access control header can be sent with Vary: Origin. If it doesn't match, a 403 can be sent with CC: no-cache. If the URL parameter is missing, no Vary header or access control header is sent. This means that caching will only be broken to the extent necessary to support the feature.
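
A sketch of that decision logic (illustrative JavaScript, not the actual MediaWiki implementation; the whitelist argument is a placeholder):

function corsHeaders( originParam, originHeader, whitelist ) {
	if ( originParam === undefined ) {
		// No origin= parameter: normal cacheable response, no CORS headers.
		return { status: 200, headers: {} };
	}
	var allowed = whitelist.some( function ( re ) {
		return re.test( originParam );
	} );
	if ( originParam !== originHeader || !allowed ) {
		// Mismatch or untrusted origin: serve a 403 that is never cached.
		return { status: 403, headers: { 'Cache-Control': 'no-cache' } };
	}
	// Match: the origin is part of the URL (hence of the cache key), so
	// Vary: Origin only fragments the cache for requests that use CORS.
	return { status: 200, headers: {
		'Vary': 'Origin',
		'Access-Control-Allow-Origin': originParam
	} };
}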

Another way to do it would be to implement the whole feature in Squid. A custom response header from MediaWiki, similar to X-Vary-Options, would specify the complete list of allowable domains. Then Squid would handle setting the correct access control headers in a post-cache step.
Comment 7 Krinkle 2011-06-21 18:11:37 UTC
(In reply to comment #0)
> Setting it to something like
> $wgCrossSiteAJAXdomains = array( '/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
> '/http:\/\/[a-z\-]{2,}\.wikinews\.org/',
> '/http:\/\/[a-z\-]{2,}\.wiktionary\.org/',
> '/http:\/\/[a-z\-]{2,}\.wikibooks\.org/',
> '/http:\/\/[a-z\-]{2,}\.wikiversity\.org/',
> '/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
> [..]

Just for the record, not all subdomains are 2 characters. There are longer ones as well (nds, be-x-old, etc.). Although *.wikimedia.org is a problem, I think * is fine for the sister projects, right? At least longer than {2}.
Comment 8 Bawolff (Brian Wolff) 2011-06-21 21:33:05 UTC
(In reply to comment #7)
> (In reply to comment #0)
> > Setting it to something like
> > $wgCrossSiteAJAXdomains = array( '/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
> > '/http:\/\/[a-z\-]{2,}\.wikinews\.org/',
> > '/http:\/\/[a-z\-]{2,}\.wiktionary\.org/',
> > '/http:\/\/[a-z\-]{2,}\.wikibooks\.org/',
> > '/http:\/\/[a-z\-]{2,}\.wikiversity\.org/',
> > '/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
> > [..]
> 
> Just for the record, not all subdomains are 2 characters. There are longer
> ones as well (nds, be-x-old, etc.). Although *.wikimedia.org is a problem, I
> think * is fine for the sister projects, right? At least longer than {2}.

{2,} means 2 or more characters, so be-x-old would be fine.
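
For example (JavaScript regex shown for illustration; the config uses PCRE, but the quantifier behaves the same):

var re = /http:\/\/[a-z\-]{2,}\.wikipedia\.org/;
re.test( 'http://be-x-old.wikipedia.org' ); // true: {2,} is two or more
re.test( 'http://en.wikipedia.org' );       // true
re.test( 'http://m.wikipedia.org' );        // false: 'm' is a single character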
Comment 9 Krinkle 2011-06-21 23:17:50 UTC
(In reply to comment #8)
> (In reply to comment #7)
> > (In reply to comment #0)
> > > Setting it to something like
> > > $wgCrossSiteAJAXdomains = array( '/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
> > > '/http:\/\/[a-z\-]{2,}\.wikinews\.org/',
> > > '/http:\/\/[a-z\-]{2,}\.wiktionary\.org/',
> > > '/http:\/\/[a-z\-]{2,}\.wikibooks\.org/',
> > > '/http:\/\/[a-z\-]{2,}\.wikiversity\.org/',
> > > '/http:\/\/[a-z\-]{2,}\.wikipedia\.org/',
> > > [..]
> > 
> > Just for the record, not all subdomains are 2 characters. There are longer
> > ones as well (nds, be-x-old, etc.). Although *.wikimedia.org is a problem, I
> > think * is fine for the sister projects, right? At least longer than {2}.
> 
> {2,} means 2 or more characters, so be-x-old would be fine.

Sorry, my bad. Why this restriction, though? What about AJAX niftiness in a future version of m.wikipedia.org? I'm just a little unsure why there's a limit/minimum put in there.
Comment 10 Bawolff (Brian Wolff) 2011-06-22 00:41:14 UTC
Honestly, it was such a long time ago that I posted comment 0, I can't remember if there was any reason for that, or if it was just an automatic "language codes are at least two letters" kind of thing.
Comment 11 JeLuF 2011-06-23 09:12:29 UTC
According to Tim's comment, this is not just a configuration request but requires coding first => removed "shell" keyword
Comment 12 Roan Kattouw 2011-09-07 16:40:46 UTC
*** Bug 30802 has been marked as a duplicate of this bug. ***
Comment 13 Roan Kattouw 2012-06-01 12:56:05 UTC
(In reply to comment #6)
> One possible way to support CORS would be to require that the origin be
> specified in a URL parameter. If the URL parameter matches the Origin header,
> then the access control header can be sent with Vary: Origin. If it doesn't
> match, a 403 can be sent with CC: no-cache. If the URL parameter is missing, no
> Vary header or access control header is sent. This means that caching will only
> be broken to the extent necessary to support the feature.
> 
That's what I ended up doing, and I also fixed the Origin-header-can-contain-spaces issue.

The bulk of the changes are in https://gerrit.wikimedia.org/r/9624. There are three smaller changes leading up to it as well; you can view them all at https://gerrit.wikimedia.org/r/#/q/project:mediawiki/core+branch:master+topic:apicors,n,z

If this passes muster, we can enable CORS on the live site once these changes are deployed.
Comment 14 Roan Kattouw 2012-08-29 18:39:44 UTC
(In reply to comment #13)
> If this passes muster, we can enable CORS on the live site once these changes
> are deployed.
It seems these changes have now been deployed, so next Tuesday I'll take a stab at enabling CORS for Wikimedia domains.
Comment 15 Roan Kattouw 2012-09-05 18:01:13 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > If this passes muster, we can enable CORS on the live site once these changes
> > are deployed.
> It seems these changes have now been deployed, so next Tuesday I'll take a stab
> at enabling CORS for Wikimedia domains.
It slipped to Wednesday instead of Tuesday, but this is now done! CORS is now working for me; tested by pasting the following code snippet into the JS console on English Wikipedia:

$.ajax( {
	'url': 'https://www.mediawiki.org/w/api.php',
	'data': {
		'action': 'query',
		'meta': 'userinfo',
		'format': 'json',
		// Must match the requesting page's scheme and host exactly.
		'origin': 'https://en.wikipedia.org'
	},
	'xhrFields': {
		// Send cookies so the foreign wiki sees your logged-in session.
		'withCredentials': true
	},
	'success': function( data ) {
		alert( 'Foreign user ' + data.query.userinfo.name +
			' (ID ' + data.query.userinfo.id + ')' );
	},
	'dataType': 'json'
} );
Comment 16 Helder 2012-09-05 18:27:27 UTC
Should this code be working also on pt.wikipedia? (it isn't)
Comment 17 Helder 2012-09-05 18:29:31 UTC
(In reply to comment #16)
> Should this code be working also on pt.wikipedia? (it isn't)
Specifically:
----
XMLHttpRequest cannot load https://www.mediawiki.org/w/api.php?action=query&meta=userinfo&format=json&origin=https%3A%2F%2Fen.wikipedia.org. Origin https://pt.wikipedia.org is not allowed by Access-Control-Allow-Origin.
----
Comment 18 Roan Kattouw 2012-09-05 18:32:18 UTC
(In reply to comment #17)
> (In reply to comment #16)
> > Should this code be working also on pt.wikipedia? (it isn't)
> Specifically:
> ----
> XMLHttpRequest cannot load
> https://www.mediawiki.org/w/api.php?action=query&meta=userinfo&format=json&origin=https%3A%2F%2Fen.wikipedia.org.
> Origin https://pt.wikipedia.org is not allowed by Access-Control-Allow-Origin.
> ----

You have to set the origin= query parameter correctly. Your URL contained &origin=https%3A%2F%2Fen.wikipedia.org; it needs to be &origin=https%3A%2F%2Fpt.wikipedia.org instead (this corresponds to the 'origin': 'https://en.wikipedia.org' line in my snippet).
Comment 19 Helder 2012-09-05 18:38:14 UTC
(In reply to comment #15)
> (In reply to comment #14)
> It slipped to Wednesday instead of Tuesday, but this is now done!

For the record: it was done on gerrit change Id715c280.

(In reply to comment #18)
> You have to set the origin= query parameter correctly. Your URL contained
> &origin=https%3A%2F%2Fen.wikipedia.org , that needs to be
> &origin=https%3A%2F%2Fpt.wikipedia.org instead (this corresponds to the
> 'origin': 'https://en.wikipedia.org' line in my snippet).

Got it! Sorry for the mistake.

BTW: I tried to use
        'origin': mw.config.get( 'wgServer' )
which corresponds to
        'origin': "//pt.wikipedia.org"
and it didn't work.
Comment 20 Roan Kattouw 2012-09-05 18:43:11 UTC
(In reply to comment #19)
> (In reply to comment #15)
> > (In reply to comment #14)
> > It slipped to Wednesday instead of Tuesday, but this is now done!
> 
> For the record: it was done on gerrit change Id715c280.
> 
Yes, I forgot to mention that. Thanks!

> Got it! Sorry for the mistake.
> 
> BTW: I tried to use
>         'origin': mw.config.get( 'wgServer' )
> which corresponds to
>         'origin': "//pt.wikipedia.org"
> and it didn't work.
Yeah, unfortunately the origin parameter requires that the protocol be specified correctly. It seems like something like 'origin': document.location.protocol + '//' + document.location.hostname should work.
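
For example, adapting the snippet from comment 15 (an untested sketch):

$.ajax( {
	'url': 'https://www.mediawiki.org/w/api.php',
	'data': {
		'action': 'query',
		'meta': 'userinfo',
		'format': 'json',
		// Derive the origin from the current page rather than hard-coding
		// it; the protocol has to match how the page was loaded.
		'origin': document.location.protocol + '//' + document.location.hostname
	},
	'xhrFields': { 'withCredentials': true },
	'dataType': 'json'
} );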
Comment 21 Brett Zamir 2012-09-06 03:16:55 UTC
This is great news! But when I try the exact code as in comment 15, I get an empty 403 Forbidden showing in Firebug. Any idea what could be happening?
Comment 22 Sergey Vladimirov 2012-09-06 04:06:15 UTC
This is great news!

Just added "wikificator" gadget (search article in wikipedia and create internal links) to ru-wikisource, and it works!

Sergey
Comment 23 Helder 2012-09-06 12:40:07 UTC
(In reply to comment #22)
> Just added "wikificator" gadget (search article in wikipedia and create
> internal links) to ru-wikisource, and it works!

For those interested, it is available here:
https://ru.wikisource.org/wiki/Special:PrefixIndex/MediaWiki:Gadget-wikilinker
Comment 24 Roan Kattouw 2012-09-06 16:34:27 UTC
(In reply to comment #21)
> This is great news!  But when I try the exact code as in comment 15, I get an
> empty 403 Forbidden showing in Firebug. Any idea what could be happening?
You have to adapt the 'origin' parameter to whatever the origin domain is. I was testing on English Wikipedia using HTTPS, so my example has 'origin': 'https://en.wikipedia.org'; you'll need to change that as appropriate.
Comment 25 Brett Zamir 2012-09-07 01:02:02 UTC
Sorry, but I'm not getting it working with the origin parameter changed: http://brett-zamir.me/testCORS.html. I am in China, so I don't know if network issues here could be different, but the page I just listed is returning an error alert for me (I only changed the original code for the origin and to add an errback).
Comment 26 Derk-Jan Hartman 2012-09-07 06:10:56 UTC
@Brett, that's because that server is not listed in $wgCrossSiteAJAXdomains. If it were, it would be a security risk. You can only do this between sites that you are logged into, in this case Wikimedia sites.
Comment 27 Brett Zamir 2012-09-07 13:47:29 UTC
Sorry to be so clueless here and not noticing the original comment about this--but what is the harm in providing some read-only access to other domains? JSONP is already exposed, so why is this not being exposed openly?
Comment 28 Roan Kattouw 2012-09-07 16:18:56 UTC
(In reply to comment #27)
> Sorry to be so clueless here and not noticing the original comment about
> this--but what is the harm in providing some read-only access to other domains?
> JSONP is already exposed, so why is this not being exposed openly?
JSONP is exposed, but locked down, and uses the browser's same-origin policy as part of the protection against CSRF. It would probably be possible to implement read-only CORS from non-Wikimedia domains, but that would be scary, easy to get wrong, and would remove a layer of protection that we currently have.

For the list of whitelisted origin domains (i.e. the list of domains from which you can make cross-domain AJAX requests to a WMF wiki), see https://gerrit.wikimedia.org/r/gitweb?p=operations/mediawiki-config.git;a=blob;f=wmf-config/CommonSettings.php;h=8a8952eeeb75a6a4b7133abc8a3c536d8ba24141;hb=HEAD#l764 . All wikis accept these cross-domain requests, except private wikis (i.e. wikis where people without accounts cannot read pages).
Comment 29 Tisza Gergő 2012-09-07 22:33:17 UTC
(In reply to comment #24)
> You have to adapt the 'origin' parameter to whatever the origin domain is. I
> was testing on English Wikipedia using HTTPS, so my example has 'origin':
> 'https://en.wikipedia.org', you'll need to change that as appropriate.

Why is it necessary to specify the origin in the URL? Couldn't you just use the Origin: header?
Comment 30 Roan Kattouw 2012-09-07 23:32:39 UTC
(In reply to comment #29)
> (In reply to comment #24)
> > You have to adapt the 'origin' parameter to whatever the origin domain is. I
> > was testing on English Wikipedia using HTTPS, so my example has 'origin':
> > 'https://en.wikipedia.org', you'll need to change that as appropriate.
> 
> Why is it necessary to specify the origin in the URL? Couldn't you just use the
> Origin: header?
It's necessary to make Squid caching continue to work. Not including the origin in the URL causes cache pollution. The origin parameter is actually validated against the Origin header too, and if they don't match, a 403 is served (with no-cache headers, of course).
Comment 31 Krinkle 2012-09-16 16:29:36 UTC
(In reply to comment #27)
> Sorry to be so clueless here and not noticing the original comment about
> this--but what is the harm in providing some read-only access to other domains?
> JSONP is already exposed, so why is this not being exposed openly?

For read-only access, use JSONP. JSONP works across any domain and is not affected by the same-origin policy because it doesn't use XHR requests, but regular script requests (through a callback parameter). The API automatically puts itself in read-only anonymous user mode when accessing it through JSONP.
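
For example, the userinfo query from comment 15 works from any domain when sent as JSONP (a sketch; jQuery turns dataType 'jsonp' into a script request with a callback parameter):

$.ajax( {
	'url': 'https://www.mediawiki.org/w/api.php',
	'data': {
		'action': 'query',
		'meta': 'userinfo',
		'format': 'json'
	},
	// Loaded as a script, so it works cross-origin from anywhere, but the
	// API treats the request as an anonymous (read-only) user.
	'dataType': 'jsonp',
	'success': function ( data ) {
		alert( 'User: ' + data.query.userinfo.name );
	}
} );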

For plain JSON, write access is allowed, so the origin has to be trusted.
Comment 32 Brett Zamir 2012-09-18 06:33:30 UTC
@Krinkle: Thanks, but it would really be nice to have the error checking of CORS. I presume Roan knows what he is talking about, but if it is true what you say that the "API automatically puts itself in read-only anonymous user mode when accessing it through JSONP", then wouldn't this mode just need to be switched on in the case of cross-domain CORS?

BTW, should this discussion be tracked in the likes of bug 30802, since it's getting off topic here?
Comment 33 Krinkle 2012-09-18 13:38:29 UTC
(In reply to comment #32)
> @Krinkle: Thanks, but it would really be nice to have the error checking of
> CORS. I presume Roan knows what he is talking about, but if it is true what you
> say that the "API automatically puts itself in read-only anonymous user mode
> when accessing it through JSONP", then wouldn't this mode just need to be
> switched on in the case of cross-domain CORS?
> 

No, not at all. That would make cross-domain CORS pretty much useless.

The API allows trusted interaction through all modes except JSONP. So when one server communicates with another server from PHP, it will be possible to authenticate and do things.

And if two web sites communicate within the browser, that is also allowed, but only when both ends trust each other. Otherwise there would be a major security leak. Just imagine what would happen if someone embedded some JavaScript on a site somewhere that makes an AJAX request to the API to get a token and then edit a page. If you were to visit that site (perhaps from a link in a chat application, Twitter, or e-mail, possibly masked by a genuine-looking redirect), then the second you visited it you would suddenly, without knowing it, be making an edit on Wikipedia. Why? Because that AJAX request was made in your browser, and you're still logged in, of course.

That's why
* JSON cross-origin requests are only allowed if both ends trust each other.
* JSONP requests are always allowed because they are unauthenticated.

You may wonder why it's not possible to cheat. The reason is that the JSON response (unlike JSONP) can only be read if the XHR allows one to read the response. And one can't make an edit without a token, which can only be sent if it was received first. So just making the request is not enough; the response needs to be read and the token sent back. That is basically the security model.
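
A sketch of that round-trip (using the query/intoken token API of that era; illustrative only):

$.ajax( {
	'url': 'https://en.wikipedia.org/w/api.php',
	'data': {
		'action': 'query',
		'prop': 'info',
		'intoken': 'edit',
		'titles': 'Sandbox',
		'format': 'json',
		'origin': document.location.protocol + '//' + document.location.hostname
	},
	'xhrFields': { 'withCredentials': true },
	'dataType': 'json',
	'success': function ( data ) {
		// This only runs with readable data if the server allowed our
		// origin; an untrusted page never gets the token out of the XHR.
		var pages = data.query.pages, id, token;
		for ( id in pages ) {
			token = pages[ id ].edittoken;
		}
		// The edit POST would then send this token back.
	}
} );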

JSONP, on the other hand, works with a callback, which means it is unrestricted: any function from anywhere can be named and is then invoked.
Comment 34 Brett Zamir 2012-10-02 01:20:03 UTC
@Krinkle: Thanks, but I'm well familiar with JSONP itself, though not with MediaWiki's implementation. I was simply suggesting that MediaWiki apply the same level of access to untrusted CORS as to JSONP. The error detection and security-risk avoidance of CORS relative to JSONP (particularly useful for non-Wikimedia sites) would make it a better choice, if not also for its slightly more streamlined API.
