Last modified: 2013-04-16 06:09:10 UTC
Reported by Daniel Franke. Verified this on my dev system (recent git pull) with the poc svg. >>>> Hello Wikimedia security folks: I've just discovered a vulnerability in Mediawiki's SVG metadata extraction feature which enables an attacker to achieve remote file inclusion and, in some cases, remote code execution. I've developed a proof-of-concept exploit against mediawiki-1.19.4 as distributed with Debian Wheezy, but suspect that the same exploit will work against all recent versions. The nature of the vulnerability is that the XMLReader instance used in SVGMetadataExtractor.php performs expansion of XML external entities (XXEs). As a result, if an attacker uploads an SVG file such as the following: <!DOCTYPE svg [ <!ENTITY foo SYSTEM "file:///etc/passwd"> ]> <svg xmlns="http://www.w3.org/2000/svg" version="1.1"> <desc>&foo;</desc> <rect width="300" height="100" style="fill:rgb(0,0,255);stroke-width:1;stroke:rgb(0,0,0)" /> </svg> then Mediawiki will read from /etc/passwd and expose its contents in the 'Metadata' section of the image page created as a result of the upload. If PHP's 'expect' extension is enabled, the same technique can be used to achieve remote code execution by giving an expect:// URL as the system identifier for the external entity. I've attached a screenshot demonstrating remote code execution, having uploaded an SVG file like the one above, but with "expect://id" replacing "file:///etc/passwd". There may possibly be other exploit vectors with weaker preconditions, but the following conditions are necessary in order for this particular exploit to succeed: * File upload must be enabled. * $wgFileExtensions[] must include 'svg'. * $wgSVGConverter must be set to something other than 'false'. * To directly achieve remote code execution, PHP's 'expect' extension (http://pecl.php.net/package/expect) must be installed and enabled. You should be able to close this vulnerability by calling 'libxml_disable_entity_loader()' prior to doing any parsing. Loading external entities is almost never desirable, so I suggest doing this globally, not just from SVGMetadataExtractor.php. For more information on XXE vulnerabilities in general, see http://www.securityfocus.com/archive/1/297714.
Created attachment 12034 [details] Screenshot showing RCE
Hooray for insecure defaults in low-level libraries. Sigh.
Created attachment 12035 [details] Wrap SVGReader::read() in libxml_disable_entity_loader() Proposed fix. Calling libxml_disable_entity_loader() breaks the environment pretty badly: XMLReader::open() can't even read its input file once it is done, since this "entity loader" is actually a hook into libxml's generic file open function. So it is best to re-enable it after we are done. Note that it is the use of SUBST_ENTITIES which breaks PHP's "secure by default" policy. We call libxml_disable_entity_loader() after XMLReader::open(), so that that will still work, but before security is broken by SUBST_ENTITIES. Make SVGReader::read() protected so that nothing can call it without passing through the wrapper. Nothing else calls it anyway, and this is slightly more convenient than catching exceptions in read() itself.
> Note that it is the use of SUBST_ENTITIES which breaks PHP's "secure by default" policy. Actually, that isn't quite the case. If you inspect the libxml source code (grep for the word "absurd" in parser.c), you'll see that even if entity expansion is not requested, it will still fetch and expand external entities that are referenced within XML attributes in order to make sure that their content is well-formed. So, although SUBST_ENTITIES is what's causing you to leak the contents of files, even without that you'd have the RCE issue from expect:// URLs, as well as denial-of-service by reading endlessly from /dev/random.
(In reply to comment #4) > > Note that it is the use of SUBST_ENTITIES which breaks PHP's "secure by > default" policy. > > Actually, that isn't quite the case. If you inspect the libxml source code > (grep for the word "absurd" in parser.c), you'll see that even if entity > expansion is not requested, it will still fetch and expand external entities > that are referenced within XML attributes in order to make sure that their > content is well-formed. So, although SUBST_ENTITIES is what's causing you to > leak the contents of files, even without that you'd have the RCE issue from > expect:// URLs, as well as denial-of-service by reading endlessly from > /dev/random. For me, it just gives an error in that case: Warning: XMLReader::read(): /home/tstarling/src/libxml2/git/:1: parser error : Attribute references external entity 'foo' in php shell code on line 1 which comes from xmlParseEntityRef: else if ((ctxt->instate == XML_PARSER_ATTRIBUTE_VALUE) && (ent->etype == XML_EXTERNAL_GENERAL_PARSED_ENTITY)) { xmlFatalErrMsgStr(ctxt, XML_ERR_ENTITY_IS_EXTERNAL, "Attribute references external entity '%s'\n", name); } And note that the ctxt->instate is unconditionally set to XML_PARSER_ATTRIBUTE_VALUE before the "absurd" case you mentioned is reached. Maybe this is one of the "entity problems" they are trying to detect by calling xmlStringDecodeEntities() and throwing away the result. gdb confirms that the hook is not called. My test case is: <!DOCTYPE svg [ <!ENTITY foo SYSTEM "file:///etc/passwd"> ]> <elt attr="&foo;"/> Either way, I'm not sure how much can be done upstream. It might not be as simple as adding a protocol whitelist to php_libxml_streams_IO_open_wrapper(). That would break our own WikiImporter class: stream_wrapper_register( 'uploadsource', 'UploadSourceAdapter' ); ... $this->reader->open( "uploadsource://$id" ); I guess a big warning box on http://php.net/xmlreader.setparserproperty would be better than nothing.
Ah yes, you're correct. I've used the attribute well-formedness check in other attacks against libxml, but I was misremembering its exact behavior: it only expands internal entities, not external ones. I've used this trick, e.g., to exploit CVE-2011-3919, but it's not useful here.
(In reply to comment #5) > Either way, I'm not sure how much can be done upstream. It might not be as > simple as adding a protocol whitelist to > php_libxml_streams_IO_open_wrapper(). I've asked Rob Richards about this by email. He worked with me on a couple of previous ext/libxml issues.
Oh... but libxml *will* fetch external entities in the normal case where they appear as text nodes. $ cat foo.xml <!DOCTYPE a [<!ENTITY foo SYSTEM "/etc/passwd">]> <a>&foo;</a> Here no entity expansion is requested: $ xmllint foo.xml <?xml version="1.0"?> <!DOCTYPE a [ <!ENTITY foo SYSTEM "/etc/passwd"> ]> <a>&foo;</a> But let's see what strace says: dfranke@roostercomb:~$ strace xmllint foo.xml ... stat("/etc/passwd", {st_mode=S_IFREG|0644, st_size=1826, ...}) = 0 stat("/etc/passwd", {st_mode=S_IFREG|0644, st_size=1826, ...}) = 0 stat("/etc/passwd", {st_mode=S_IFREG|0644, st_size=1826, ...}) = 0 open("/etc/passwd", O_RDONLY) = 4 read(4, "root:x:0:0:root:/root:/bin/bash\n"..., 8192) = 1826 ...but anyhow, I'm convinced that your patch fixes the vulnerability that I originally reported, which is XXE in SVGMetadataExtractor. I'm happy to keep discussing the potential for XXE elsewhere in Mediawiki or in PHP or libxml generally, but we should take that discussion to email.
From stracing a few scenarios, it looks like even without SUBST_ENTITIES, the files do get opened when XMLReader::read() is called, but not before then. With Tim's patch, I'm confirming (with strace) that the files reference in the entities aren't being opened. SVG's without external entities are parsed without a problem as well. I think it's ready to deploy if we want to.
Just a quick update. The cluster was patched yesterday, and this patch is driving the release of 1.20.4. I think we may need to patch a couple more parts of our code, so I think we'll hold off on the 1.20.4 release until next week, and I'm fairly confident we've closed all the places this can be triggered.
Related URL: https://gerrit.wikimedia.org/r/59198 (Gerrit Change I0b2ef270f15c7b4da17edee680bf7e2410919915)
Related URL: https://gerrit.wikimedia.org/r/59340 (Gerrit Change I0b2ef270f15c7b4da17edee680bf7e2410919915)
Related URL: https://gerrit.wikimedia.org/r/59346 (Gerrit Change I0b2ef270f15c7b4da17edee680bf7e2410919915)
Related URL: https://gerrit.wikimedia.org/r/59356 (Gerrit Change I0b2ef270f15c7b4da17edee680bf7e2410919915)
Related URL: https://gerrit.wikimedia.org/r/59376 (Gerrit Change I0b2ef270f15c7b4da17edee680bf7e2410919915)