Last modified: 2014-09-23 23:08:58 UTC
When I try to upload a valid KML file I get "File extension does not match MIME type."
I imagine this is because KML mime types aren't included in includes/mime.info or includes/mime.types. See http://code.google.com/apis/kml/documentation/kml_tut.html#kml_server Care to add them?
Created attachment 8132: patch for kml support
I figured adding kml support would be a breeze, but I had not counted on the brain dead browser that is IE6. Unfortunately, kml contains the element <heading>, which triggers the protection in detectScript() that guards against uploads that IE6 might mistake for HTML. It triggers on "<head". Not sure if we can work around this, but Tim will know.
Note to self, mimetype sniffing of Safari:
oldest: http://trac.webkit.org/browser/trunk/WebKit/Misc.subproj/WebNSDataExtras.m?rev=9259
newest: http://trac.webkit.org/browser/trunk/Source/WebKit/mac/Misc/WebNSDataExtras.m?rev=75909
Apparently no issue with head for Safari. But I'm guessing IEContentAnalyser would still block the file as well.
Other note, website with mime signature detection of other browsers: http://webblaze.cs.berkeley.edu/2009/content-sniffing/
Applied at r82436. I just checked and realized you have commit access: is there a reason you didn't commit this yourself?
@Mark yeah, I didn't commit it yet, because it functionally doesn't work :D
Understood. Just trying to make sure the work that is done so far gets included.
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
Any progress on this?
No.
There are still times that you want to serve a KMZ from a MW but it never gets near a browser. Who cares what IE thinks? For instance, we use: http://maps.google.com/maps?q=http%3A%2F%2Fwiki.bogus.com%2Fimages%2F8%2F82%2FBogus.kmz It's Google that ends up downloading the KMZ from the wiki. All you need to do is add kmz to the application/zip line in includes/mime.types
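For anyone following along, the extension check behind the "File extension does not match MIME type" error can be pictured roughly like this. This is a minimal Python sketch, not MediaWiki's PHP code: it assumes the usual mime.types layout of one MIME type followed by its extensions per line, and SAMPLE, parse_mime_types and extension_matches are made-up names for illustration.

```python
# Illustrative sketch only; parse_mime_types and extension_matches are
# hypothetical names, and SAMPLE is made-up data in mime.types layout.
SAMPLE = """\
# MIME type followed by the extensions it may use
application/zip zip kmz
image/png png
"""


def parse_mime_types(text):
    """Map each MIME type to the set of extensions listed after it."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime, *extensions = line.split()
        table.setdefault(mime.lower(), set()).update(e.lower() for e in extensions)
    return table


def extension_matches(table, filename, detected_mime):
    """True if the file's extension is listed for the detected MIME type."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return extension in table.get(detected_mime.lower(), set())


table = parse_mime_types(SAMPLE)
print(extension_matches(table, "Bogus.kmz", "application/zip"))
print(extension_matches(table, "Bogus.kml", "application/zip"))
```

With kmz listed on the application/zip line, a file detected as a zip archive and named .kmz passes this kind of check, while .kml does not.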
r91109 was a partial revert. A finished patch would be appreciated, if someone has time and interest -- Renate?
I don't think many of the folks here understand the complexity of this problem. The IE6 filter is part of our security system. This is simply one file format which we cannot 'just add'. If it were as simple as that, I would have made my commit. Fixing this requires extensive knowledge about the way browsers do content sniffing and touches on the security aspects of MediaWiki. This makes the pool of developers that can actually fix this rather small. And that is BEFORE we start touching on the subject of whether supporting this file is even possible at all while keeping the same security context as we have at the moment.
Actually, with our new improved zip parser, kmz might actually be possible to add at this point in time. kml will still have the same issue as before. The files will trip the IE6 content sniffing filters that are in place to protect IE6 users.
Another P.S.: in my patch I added a new mediatype called DATA. If anyone wants to work on this again, we need to update the table creation scripts to be able to recognize that new constant value.
I'm guessing no additional attempts have gone into this since November. At the English Wikipedia a group of editors have discovered a very good use for KML files in representing linear features on google/bing maps. Currently the text of the KML is posted to a talk page and run from there. Can the software not be set to treat kml as a raw text file, rather than attempting to parse it as html? We don't need to run the file, we just need a more convenient way of adding them than copying and pasting the contents into a subpage.
(In reply to comment #15)
> I'm guessing no additional attempts have gone into this since November. At the
> English Wikipedia a group of editors have discovered a very good use for KML
> files in representing linear features on google/bing maps. Currently the text
> of the KML is posted to a talk page and run from there.
>
> Can the software not be set to treat kml as a raw text file, rather than
> attempting to parse it as html? We don't need to run the file, we just need a
> more convenient way of adding them than copying and pasting the contents into a
> subpage.

The problem is not whether MediaWiki interprets the KML file as HTML or not, but the fact that certain broken browsers will treat the KML files as HTML, opening a whole lot of security vulnerabilities. As DJ said, KMZ could be acceptable though.
Thank you Microsoft for continuing to poison the internet. Is there no way to get the browser to treat it as a comment or a piece of raw text instead of trying to parse it as html? Will this change as IE6 moves towards 1% of the browser market and websites move away from compatibility with it? I'm just worried that the kmz thing adds another step to what some are calling complicated as is, and takes away a few of the great aspects of kml (being able to extract the coords, manipulating the precision with a bot, etc)... But if it's the only solution we can pull off then at least it's a start.
Pardon my ignorance, but couldn't the IE6 filter just let '<heading>' pass and only block all other variants of '<head*' ?! Side note: some sites already see IE6 below one percent. When can we stop letting IE6 tie down progress?
(In reply to comment #18)
> Pardon my ignorance, but couldn't the IE6 filter just let '<heading>' pass and
> only block all other variants of '<head*' ?!

I'm quite sure Tim or I looked at that, but IE6 itself seems to specifically look for <head*, and that is the behavior that the filter needs to (and does) match in order to protect the IE6 users.
Just also a note, that Google Earth produced KMLs do not contain the <heading> element... I believe it is completely optional. Unfortunately I don't know how these filters work or what is triggering what precisely... Is it that IE6 users wouldn't be able to upload it, or that they wouldn't be able to view the File: page of a KML, or some other thing? http://www.ie6countdown.com/ has a good statistical overview of IE6 usage worldwide.
The problem is that IE6 will treat everything that has <head in the first several bytes as HTML. That means that if someone uploads a specifically crafted KML file and an unsuspecting IE6 user downloads it, that the machine of the IE6 user can be compromised. Our IE6 content filter protects against the uploading of any content that would trigger any of the 15 or so strings that will convince IE6 that something is HTML, even though it isn't. So uploading a JPEG with <head in the EXIF at the start of the file, would also trigger the filter and not allow you to upload that file. Unfortunately KML will always trigger the filter.
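To make the point about "<head" concrete, here is a rough Python sketch of a prefix-style sniff check. It is not the code in detectScript(): SNIFF_TAGS is an assumed, incomplete subset of the strings mentioned in this thread, the 1 KB window is an assumption, and looks_like_html is a hypothetical name.

```python
# Illustrative sketch only. SNIFF_TAGS is an assumed, incomplete subset of
# the strings discussed in this thread, and SNIFF_WINDOW assumes only the
# first 1 KB matters; neither is copied from MediaWiki.
SNIFF_TAGS = (b"<head", b"<html", b"<title", b"<script", b"<body")
SNIFF_WINDOW = 1024


def looks_like_html(data: bytes) -> bool:
    """True if the start of the file contains a string a sniffer may treat as HTML."""
    window = data[:SNIFF_WINDOW].lower()
    return any(tag in window for tag in SNIFF_TAGS)


kml = b'<?xml version="1.0"?><kml><Camera><heading>90</heading></Camera></kml>'
png = b"\x89PNG\r\n\x1a\n" + b"\x00" * 64

print(looks_like_html(kml))  # "<heading" begins with "<head", so it matches
print(looks_like_html(png))
```

Because the match is on the prefix "<head", a filter that mirrors the browser cannot let "<heading>" through without also leaving a gap for the browser's own sniffing.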
But what if there were no <heading> tags, or none in the first 1KB? To add to my confusion, there's already an optional exemption in UploadBase.php to allow SVG files containing "<title" (which is an IE sniff tag) - http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/upload/UploadBase.php?revision=111103&view=markup On the other hand, UploadBase.php does not test for other sniff tags (such as "<plaintext" for IE7, which presumably applies also to IE6) - http://webblaze.cs.berkeley.edu/2009/content-sniffing/ Does this mean that the Berkeley list is too broad, or that there is a potential vulnerability in MediaWiki?
According to w3s, Internet Explorer 6 is now below 1.0% of internet usage as of March 2012. Can this filter be removed, or reversed so that IE6 users are blocked from viewing certain file types? It's ridiculous to inconvenience ourselves and to halt progress for 0.9% of the internet, 25% of which comes from China and can't even view most of Wikipedia http://www.w3schools.com/browsers/browsers_explorer.asp
Is there no way to simply block IE6 users from viewing or downloading KML files, so security for them is not an issue? If we can't serve that 1% of users (or it's too tricky to do so safely), fine, what about the rest of us?
If a reply isn't posted from a developer within a week to move forward with this, I'm opening a new bug report until action is taken. This delay is retarded. It is absolutely backwards development to restrict the advancement of 99% of the internet for the 1% of laggers who would still use a text based browser if that was what came with their default Windows installation. Just block IE 6 and force people to upgrade; problem solved, KML can be enabled, and we can move forwards with this capability. If not, then who is the head honcho deciding that that group of thumb twiddlers deserves to be catered to?
Opening a new ticket until action is taken isn't going to help anything move faster (it may actually make it move slower). Someone was asking on IRC about this and so I'm adding Chris Steipp as well since it's a security issue and he might have an idea on the risks involved and what we can do. I do not know of any place where we've blocked a specific browser from using a specific type of file but that doesn't mean it's out of the bounds of possibility. I certainly think this is a file that we want to make possible if we can. That said, your comments here Jess are unacceptable. This is not a place to throw around insults. Yes, we try to support anything that has over 1% of our page hits. IE6 has just over 2% for actual html page requests right now. You're right, a large portion of IE6 users are still in China but you're wrong that most of them can't view Wikipedia. Yes we're blocked occasionally but usually they can get to most of the pages. Our mission is to spread this free knowledge to as many people as possible, it would be totally unacceptable to just say "you're not welcome here because of your browser" especially at levels that high. 2% of requests is still millions of page requests. That is especially true when most people who are using IE6 do not get to make that choice themselves or are in areas of the world where you especially want to get information. Pushing to try and get a bug resolved is completely ok, insulting specific groups of users or trying to flood the bug channels is not. Please tone your rhetoric down. (Said as a personal community member/admin and not as a staff member)
Thanks for adding me James. This is the first I had heard of it. Unfortunately, kml is a fairly complex and feature rich format, so it would really need its own special parsing, similar to the svg format. Beyond just IE6 sniffing and running javascript in the kml, there is also the issue that kmls can embed javascript that the plugin will execute, and link to external resources that could be used to track our users. So at minimum, a solution would need to do pretty extensive script/css filtering, and remove anything that looked like a link to an external resource.
Can I get a hip hip hurrah for google not checking mime types... </sarcasm>.
-----
By a brief look at template:Attached_KML, it seems that the templates only use a small portion of the KML standard. It may perhaps be less work to do a custom tag (easytimeline style) where we generate a safe kml file from a simpler language for specifying coordinates to highlight on the map. The downside is obviously that in the future people might want more features from their kml.
-----
>But what if there were no <heading> tags, or none in the first 1KB?
That would take care of the IE6 issue, but as Chris mentions there are other concerns, in particular allowing third parties to track the IPs of our users.
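As a rough illustration of the "generate a safe kml file from a simpler language" idea, here is a minimal Python sketch that builds a KML LineString from a list of (longitude, latitude) pairs. It is not a proposal for the actual tag syntax: build_line_kml is a hypothetical name, and the input format is an assumption. Since the only inputs are a plain-text name and numbers, nothing the editor types ends up as markup.

```python
import xml.etree.ElementTree as ET

# Namespace used by KML 2.2 documents.
KML_NS = "http://www.opengis.net/kml/2.2"


def build_line_kml(name, points):
    """Build a KML document with one LineString from (lon, lat) pairs.

    Hypothetical helper: the name is written as escaped text and every
    coordinate must parse as a float, so no markup can be injected.
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    line = ET.SubElement(placemark, f"{{{KML_NS}}}LineString")
    coordinates = ET.SubElement(line, f"{{{KML_NS}}}coordinates")
    # KML writes each point as "lon,lat", with points separated by whitespace.
    coordinates.text = " ".join(f"{float(lon)},{float(lat)}" for lon, lat in points)
    return ET.tostring(kml, encoding="unicode")


print(build_line_kml("Example route", [(-0.1, 51.5), (2.35, 48.85)]))
```

The trade-off is the one already mentioned: anything beyond names and coordinates would need the simpler language to grow.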
(In reply to comment #28)
> Can I get a hip hip hurrah for google not checking mime types... </sarcasm>.

As an aside, according to google maps docs, "HTML content is allowed but is sanitized to protect from cross-browser attacks", which makes google maps not checking mime types (and hence the Wikipedians' inline kml file hack) much less scary... :D
Isn't the new ContentHandler designed to handle non-wikitext article "paradata" such as KML/KMZ? Though discussion here had been dormant for a while, I had assumed that it was exactly the kind of case which the new handler would enable.

(In reply to comment #28)
> By a brief look at template:Attached_KML, it seems that the templates only use
> a small portion of the KML standard. It may perhaps be less work to do a custom
> tag (easytimeline style) where we generate a safe kml file from a simpler
> language for specifying coordinates to highlight on the map.
>
> The downside is obviously that in the future people might want more features
> from their kml.

A sanitised subset is exactly what is sought and required. Wikitext and SVG are already subject to tag whitelisting, which is what KML needs.

> >But what if there were no <heading> tags, or none in the first 1KB?
> That would take care of the IE6 issue, but as Chris mentions there are other
> concerns, in particular allowing third parties to track the ip's of our users.

Once external resource requests are filtered (as with SVG files), there is no more privacy leakage than there would be with a plain external URL in an article's wikitext. Google Maps just downloads the raw content of the specified subpage if a reader clicks the Attached KML link.
(As suggested by Bawolff at http://www.mediawiki.org/wiki/Talk:Mentorship_programs/Possible_projects#GSOC_2013_candidates_missing_one_thing_or_two_25493 ) Do you think the development of this feature is suitable for a Google Summer of Code project? If you think this makes sense then we would need a short description of the project published at http://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects and at least one mentor. From there we would publish it at https://www.mediawiki.org/wiki/Summer_of_Code_2013#Project_ideas
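For illustration, tag whitelisting for KML could look roughly like the Python sketch below. The ALLOWED_TAGS set is hypothetical and deliberately tiny, not a vetted list, and disallowed_tags is a made-up name. It leaves out description (which can carry HTML) and anything that fetches external resources, such as NetworkLink. A real implementation would also need a parser hardened against malicious XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical allowlist for illustration; a real one would need review.
ALLOWED_TAGS = {
    "kml", "Document", "Folder", "Placemark",
    "name", "LineString", "Point", "coordinates",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def disallowed_tags(kml_text):
    """Return the sorted element names in the document that are not allowlisted."""
    root = ET.fromstring(kml_text)
    return sorted(
        {local_name(el.tag) for el in root.iter()} - ALLOWED_TAGS
    )


tracking = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
    "<NetworkLink><Link><href>http://tracker.example/x.kml</href></Link></NetworkLink>"
    "</Document></kml>"
)
print(disallowed_tags(tracking))
```

An upload would be accepted only when the returned list is empty, in the same spirit as the existing SVG checks mentioned above.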
This proposal is now featured at https://www.mediawiki.org/wiki/Outreach_Program_for_Women/Round_7
Moved to Normal, because we do not view this as high priority at this time.
Well, Erik Moller has just sent out an email saying that JS on IE6 will be disabled completely with 1.24wm17. Does this mean that something could be done with this bug? Or would it be better to shift efforts to Wikidata? (or both, and just have Wikidata link to files on Commons?)
Well, something could always have been done; it's just that no one has been willing to spend the time to do it. The JS announcement doesn't affect the security issues mentioned above.
The use cases for which I opened the bug are not helped in any way by Wikidata. They also have nothing to do with the WMF. So having Wikidata does not help.
Jeroen De Dauw: see bug 55549, which would solve the big-picture problem with Wikidata.