Last modified: 2014-09-23 23:08:58 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T28059, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 26059 - Add support for KML/KMZ filetype
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Uploading
Version: 1.17.x
Hardware/OS: All / All
Importance: Normal enhancement with 4 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: patch, patch-reviewed
Depends on:
Blocks: multimedia
Reported: 2010-11-22 17:46 UTC by Jeroen De Dauw
Modified: 2014-09-23 23:08 UTC
CC List: 23 users

Attachments
patch for kml support (3.13 KB, patch)
2011-02-12 14:30 UTC, Derk-Jan Hartman

Description Jeroen De Dauw 2010-11-22 17:46:29 UTC
When I try to upload a valid KML file I get "File extension does not match MIME type."
Comment 1 Mark A. Hershberger 2011-02-12 04:28:14 UTC
I imagine this is because KML mime types aren't included in includes/mime.info or includes/mime.types.

See http://code.google.com/apis/kml/documentation/kml_tut.html#kml_server

Care to add them?
Comment 2 Derk-Jan Hartman 2011-02-12 14:30:46 UTC
Created attachment 8132
patch for kml support

I figured adding kml support would be a breeze, but I had not counted on the brain dead browser that is IE6.

Unfortunately, KML contains the element <heading>, which triggers the check in detectScript() that protects against uploads that IE6 might mistake for HTML. The check triggers on "<head". I'm not sure if we can work around this, but Tim will know.
Comment 3 Derk-Jan Hartman 2011-02-14 22:59:52 UTC
Note to self, mimetype sniffing of Safari:

oldest: http://trac.webkit.org/browser/trunk/WebKit/Misc.subproj/WebNSDataExtras.m?rev=9259
newest: http://trac.webkit.org/browser/trunk/Source/WebKit/mac/Misc/WebNSDataExtras.m?rev=75909

Apparently "<head" is not an issue for Safari, but I'm guessing IEContentAnalyser would still block the file.


Another note: a website covering the MIME signature detection of other browsers: http://webblaze.cs.berkeley.edu/2009/content-sniffing/
Comment 4 Mark A. Hershberger 2011-02-19 03:28:04 UTC
Applied at r82436.

I just checked and realized you have commit access: is there a reason you didn't commit this yourself?
Comment 5 Derk-Jan Hartman 2011-02-19 19:04:34 UTC
@Mark yeah i didn't commit it yet, because it functionally doesn't work :D
Comment 6 Mark A. Hershberger 2011-02-19 20:32:29 UTC
Understood.  Just trying to make sure the work that is done so far gets included.
Comment 7 p858snake 2011-04-30 00:09:12 UTC
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
Comment 8 Jeroen De Dauw 2011-05-15 11:59:30 UTC
Any progress on this?
Comment 9 Bryan Tong Minh 2011-05-15 12:27:22 UTC
No.
Comment 10 Renate 2011-11-01 02:13:53 UTC
There are still times that you want to serve a KMZ from a MW but it never gets near a browser. Who cares what IE thinks? For instance, we use:

http://maps.google.com/maps?q=http%3A%2F%2Fwiki.bogus.com%2Fimages%2F8%2F82%2FBogus.kmz

It's Google that ends up uploading the KMZ.

All you need to do is add kmz to the application/zip line in includes/mime.types
Comment 11 Sumana Harihareswara 2011-11-09 03:54:11 UTC
r91109 was a partial revert.  A finished patch would be appreciated, if someone has time and interest -- Renate?
Comment 12 Derk-Jan Hartman 2011-11-19 12:41:45 UTC
I don't think many of the folks here understand the complexity of this problem. The IE6 filter is part of our security system. This is simply one file format which we cannot 'just add'. If it were as simple as that, I would have made my commit.

Fixing this requires extensive knowledge about the way browsers do content sniffing and touches on the security aspects of MediaWiki. This makes the pool of developers that can actually fix this rather small. And that is BEFORE we start touching on the question of whether supporting this file type is even possible at all while keeping the same security context as we have at the moment.
Comment 13 Derk-Jan Hartman 2011-11-19 12:45:00 UTC
Actually, with our new improved zip parser, kmz might actually be possible to add at this point in time.

kml will still have the same issue as before. The files will trip the IE6 content sniffing filters that are in place to protect IE6 users.
Comment 14 Derk-Jan Hartman 2011-11-19 13:13:20 UTC
Another P.S.: in my patch I added a new mediatype called DATA. If anyone wants to work on this again, we need to update the table creation scripts to be able to recognize that new constant value.
Comment 15 Jess O'Neill 2012-02-23 14:03:16 UTC
I'm guessing no additional attempts have gone into this since November. At the English Wikipedia a group of editors have discovered a very good use for KML files in representing linear features on google/bing maps. Currently the text of the KML is posted to a talk page and run from there.

Can the software not be set to treat kml as a raw text file, rather than attempting to parse it as html? We don't need to run the file, we just need a more convenient way of adding them than copying and pasting the contents into a subpage.
Comment 16 Bryan Tong Minh 2012-02-24 12:03:06 UTC
(In reply to comment #15)
> I'm guessing no additional attempts have gone into this since November. At the
> English Wikipedia a group of editors have discovered a very good use for KML
> files in representing linear features on google/bing maps. Currently the text
> of the KML is posted to a talk page and run from there.
> 
> Can the software not be set to treat kml as a raw text file, rather than
> attempting to parse it as html? We don't need to run the file, we just need a
> more convenient way of adding them than copying and pasting the contents into a
> subpage.

The problem is not whether MediaWiki interprets the KML file as HTML or not, but the fact that certain broken browsers will treat the KML files as HTML, opening a whole lot of security vulnerabilities.

As DJ said, KMZ could be acceptable though.
Comment 17 Jess O'Neill 2012-02-24 15:01:10 UTC
Thank you Microsoft for continuing to poison the internet. There's no way to get the browser to treat it as a comment or a piece of raw text instead of it trying to parse it as html? Will this change as IE6 moves towards 1% of the browser market and websites move away from compatibility with it?

I'm just worried that the kmz thing adds another step to what some are calling complicated as is, and takes away a few of the great aspects of kml (being able to extract the coords, manipulating the precision with a bot, etc)... But if it's the only solution we can pull off then at least it's a start.
Comment 18 Daniel Schwen 2012-02-24 23:28:49 UTC
Pardon my ignorance, but couldn't the IE6 filter just let '<heading>' pass and only block all other variants of '<head*' ?!
Side note: some sites already see IE6 below one percent. When can we stop letting IE6 tie down progress?
Comment 19 Derk-Jan Hartman 2012-02-25 20:04:41 UTC
(In reply to comment #18)
> Pardon my ignorance, but couldn't the IE6 filter just let '<heading>' pass and
> only block all other variants of '<head*' ?!

I'm quite sure Tim or I looked at that, but IE6 itself seems to look specifically for <head* , and that is the behavior that the filter needs to (and does) match in order to protect the IE6 users.
Comment 20 Jess O'Neill 2012-02-25 20:29:43 UTC
Just also a note, that Google Earth produced KMLs do not contain the <heading> element... I believe it is completely optional. Unfortunately I don't know how these filters work or what is triggering what precisely... Is it that IE6 users wouldn't be able to upload it, or that they wouldn't be able to view the File: page of a KML, or some other thing? 


http://www.ie6countdown.com/ has a good statistical overview of IE6 usage worldwide.
Comment 21 Derk-Jan Hartman 2012-02-25 20:37:46 UTC
The problem is that IE6 will treat everything that has <head in the first several bytes as HTML. That means that if someone uploads a specially crafted KML file and an unsuspecting IE6 user downloads it, the IE6 user's machine can be compromised. Our IE6 content filter protects against the uploading of any content that would trigger any of the 15 or so strings that will convince IE6 that something is HTML, even though it isn't. So uploading a JPEG with <head in the EXIF at the start of the file would also trigger the filter and not allow you to upload that file. Unfortunately KML will always trigger the filter.
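
(Editorial sketch, not MediaWiki's actual code.) The kind of check described in this comment can be illustrated roughly as follows. The tag list is a hypothetical, abbreviated stand-in for the "15 or so strings", and the 1024-byte window is an assumption borrowed from the "first 1KB" question in comment 22, not a confirmed IE6 value.

```python
# Hypothetical, abbreviated list of byte strings that make IE6 guess "HTML".
# The real filter checks more strings than this; these are illustrative only.
SNIFF_TAGS = (b"<head", b"<html", b"<body", b"<script", b"<title")


def looks_like_html_to_ie6(data: bytes, window: int = 1024) -> bool:
    """Return True if any sniff tag occurs in the first `window` bytes.

    The match is a plain substring test, so "<heading>" matches "<head".
    """
    chunk = data[:window].lower()
    return any(tag in chunk for tag in SNIFF_TAGS)


kml_with_heading = (
    b'<?xml version="1.0"?><kml xmlns="http://www.opengis.net/kml/2.2">'
    b"<Placemark><Camera><heading>90</heading></Camera></Placemark></kml>"
)
kml_without_heading = b"<kml><Placemark><name>x</name></Placemark></kml>"

print(looks_like_html_to_ie6(kml_with_heading))     # rejected by the filter
print(looks_like_html_to_ie6(kml_without_heading))  # passes this check
```

Because the test is a substring match rather than an XML parse, exempting "<heading>" in the filter would not help: IE6 itself would still see "<head" and treat the file as HTML.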
Comment 22 Richard Guk 2012-02-25 21:52:04 UTC
But what if there were no <heading> tags, or none in the first 1KB?

To add to my confusion, there's already an optional exemption in UploadBase.php to allow SVG files containing "<title" (which is an IE sniff tag)
- http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/upload/UploadBase.php?revision=111103&view=markup

On the other hand, UploadBase.php does not test for other sniff tags (such as "<plaintext" for IE7, which presumably applies also to IE6).
- http://webblaze.cs.berkeley.edu/2009/content-sniffing/

Does this mean that the Berkeley list is too broad, or that there is a potential vulnerability in MediaWiki?
Comment 23 Jess O'Neill 2012-04-28 17:45:11 UTC
According to w3s, Internet Explorer 6 is now below 1.0% of internet usage as of March 2012. Can this filter be removed, or reversed so that IE6 users are blocked from viewing certain file types? It's ridiculous to inconvenience ourselves and to halt progress for 0.9% of the internet, 25% of which comes from China and can't even view most of Wikipedia.

http://www.w3schools.com/browsers/browsers_explorer.asp
Comment 24 Rd232 2012-08-23 15:25:18 UTC
Is there no way to simply block IE6 users from viewing or downloading KML files, so security for them is not an issue? If we can't serve that 1% of users (or it's too tricky to do so safely), fine, what about the rest of us?
Comment 25 Jess O'Neill 2012-10-02 20:00:36 UTC
If a reply isn't posted from a developer within a week to move forward with this, I'm opening a new bug report until action is taken. This delay is retarded. It is absolutely backwards development to restrict the advancement of 99% of the internet for the 1% of laggers who would still use a text based browser if that was what came with their default Windows installation. Just block IE 6 and force people to upgrade; problem solved, KML can be enabled, and we can move forwards with this capability.

If not, then who is the head honcho deciding that that group of thumb twiddlers deserves to be catered to?
Comment 26 James Alexander 2012-10-07 21:37:41 UTC
Opening a new ticket until action is taken isn't going to help anything move faster (it may actually make it move slower). Someone was asking on IRC about this and so I'm adding Chris Steipp as well since it's a security issue and he might have an idea on the risks involved and what we can do. I do not know of any place where we've blocked a specific browser from using a specific type of file but that doesn't mean it's out of the bounds of possibility. I certainly think this is a file that we want to make possible if we can.

That said, your comments here Jess are unacceptable. This is not a place to throw around insults.

Yes, we try to support anything that has over 1% of our page hits. IE6 has just over 2% for actual html page requests right now.  You're right, a large portion of IE6 users are still in China but you're wrong that most of them can't view Wikipedia. Yes we're blocked occasionally but usually they can get to most of the pages. Our mission is to spread this free knowledge to as many people as possible, so it would be totally unacceptable to just say "you're not welcome here because of your browser", especially at levels that high. 2% of requests is still millions of page requests. That is especially true when most people who are using IE6 do not get to make that choice themselves or are in areas of the world where you especially want to get information.

Pushing to try and get a bug resolved is completely ok, insulting specific groups of users or trying to flood the bug channels is not. Please tone your rhetoric down.

(Said as a personal community member/admin and not as a staff member)
Comment 27 Chris Steipp 2012-10-09 17:18:01 UTC
Thanks for adding me James. This is the first I had heard of it.

Unfortunately, kml is a fairly complex and feature-rich format, so it would really need its own special parsing, similar to the svg format. Beyond just IE6 sniffing and running javascript in the kml, there is also the issue that kmls can embed javascript that the plugin will execute, and link to external resources that could be used to track our users. So at minimum, a solution would need to do pretty extensive script/css filtering, and remove anything that looked like a link to an external resource.
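
(Editorial sketch, not an existing MediaWiki feature.) One way to picture the filtering described here is an element whitelist: keep only a handful of geometry elements and drop everything else, which removes descriptions (where HTML and scripts can hide), styles, and anything that links to an external resource. The whitelist below is a hypothetical minimum chosen for illustration, and the standard-library parser used here is not hardened against malicious XML (for example entity-expansion attacks), so a real implementation would need more than this.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # serialize with a default namespace

# Hypothetical whitelist: plain geometry only. No <description>, <Style>,
# <NetworkLink>, <Link>, <Icon> or <href>, so no HTML and no external fetches.
ALLOWED_TAGS = {
    f"{{{KML_NS}}}{name}"
    for name in (
        "Document", "Folder", "Placemark", "name", "Point", "LineString",
        "Polygon", "outerBoundaryIs", "LinearRing", "coordinates",
    )
}


def _strip(element: ET.Element) -> None:
    """Recursively remove every child element that is not whitelisted."""
    for child in list(element):
        if child.tag in ALLOWED_TAGS:
            child.attrib.clear()  # attributes are not needed for geometry
            _strip(child)
        else:
            element.remove(child)


def sanitize_kml(text: str) -> str:
    """Return a copy of the KML document containing only whitelisted elements."""
    root = ET.fromstring(text)
    if root.tag != f"{{{KML_NS}}}kml":
        raise ValueError("not a KML 2.2 document")
    _strip(root)
    return ET.tostring(root, encoding="unicode")


sample = (
    '<kml xmlns="http://www.opengis.net/kml/2.2"><Document><Placemark>'
    "<name>Route</name>"
    "<description><![CDATA[<script>alert(1)</script>]]></description>"
    "<LineString><coordinates>-122.0,37.0,0 -122.1,37.1,0</coordinates></LineString>"
    "</Placemark><NetworkLink><Link><href>http://tracker.example/x.kml</href>"
    "</Link></NetworkLink></Document></kml>"
)
print(sanitize_kml(sample))
```

A whitelist fails closed: an element the sketch does not know about is dropped, so new KML features are unavailable until someone reviews and adds them.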
Comment 28 Bawolff (Brian Wolff) 2012-11-05 21:26:16 UTC
Can I get a hip hip hurrah for google not checking mime types... </sarcasm>.


-----

By a brief look at template:Attached_KML, it seems that the templates only use a small portion of the KML standard. It may perhaps be less work to do a custom tag (easytimeline style) where we generate a safe kml file from a simpler language for specifying coordinates to highlight on the map.

The downside is obviously that in the future people might want more features from their kml.
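
(Editorial sketch of the idea above, not a proposed implementation.) Generating the KML from a simpler input means user-supplied markup never reaches the output. The function name `build_kml` and its input format, a title plus a list of (longitude, latitude) pairs, are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)  # serialize with a default namespace


def build_kml(name: str, points: list[tuple[float, float]]) -> str:
    """Build a one-line KML document from (longitude, latitude) pairs.

    Coordinates are forced through float(), and the name is escaped by the
    serializer, so the result contains only elements created here.
    """
    kml = ET.Element(f"{{{KML_NS}}}kml")
    placemark = ET.SubElement(kml, f"{{{KML_NS}}}Placemark")
    ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
    line = ET.SubElement(placemark, f"{{{KML_NS}}}LineString")
    coords = ET.SubElement(line, f"{{{KML_NS}}}coordinates")
    # KML orders each tuple as longitude,latitude, separated by spaces.
    coords.text = " ".join(f"{float(lon)},{float(lat)}" for lon, lat in points)
    return ET.tostring(kml, encoding="unicode")


print(build_kml("Example road", [(-122.0, 37.5), (-122.1, 37.6)]))
```

Note that even a generated file could still contain "<head" if the simpler language ever exposed the <heading> element, so the IE6 sniffing concern would have to be kept in mind when choosing which features to expose.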

----

>But what if there were no <heading> tags, or none in the first 1KB?

That would take care of the IE6 issue, but as Chris mentions there are other concerns, in particular allowing third parties to track the IP addresses of our users.
Comment 29 Bawolff (Brian Wolff) 2012-11-05 21:42:41 UTC
(In reply to comment #28)
> Can I get a hip hip hurrah for google not checking mime types... </sarcasm>.

As an aside, according to google maps docs, "HTML content is allowed but is sanitized to protect from cross-browser attacks", which makes google maps not checking mime types (and hence the Wikipedian's inline kml file hack) much less scary... :D
Comment 30 Richard Guk 2012-11-05 23:55:49 UTC
Isn't the new ContentHandler designed to handle non-wikitext article "paradata" such as KML/KMZ?

Though discussion here had been dormant for a while, I had assumed that it was exactly the kind of case which the new handler would enable.

(In reply to comment #28)
> By a brief look at template:Attached_KML, it seems that the templates only use
> a small portion of the KML standard. It may perhaps be less work to do a custom
> tag (easytimeline style) where we generate a safe kml file from a simpler
> language for specifying coordinates to highlight on the map.
> 
> The downside is obviously that in the future people might want more features
> from their kml.

A sanitised subset is exactly what is sought and required. Wikitext and SVG are already subject to tag whitelisting, which is what KML needs.

> >But what if there were no <heading> tags, or none in the first 1KB?
> That would take care of the IE6 issue, but as Chris mentions there are other
> concerns, in particular allowing third parties to track the ip's of our users.

Once external resource requests are filtered (as with SVG files), there is no more privacy leakage than there would be with a plain external URL in an article's wikitext. Google Maps just downloads the raw content of the specified subpage if a reader clicks the Attached KML link.
Comment 31 Quim Gil 2013-03-25 04:27:56 UTC
(As suggested by Bawolff at http://www.mediawiki.org/wiki/Talk:Mentorship_programs/Possible_projects#GSOC_2013_candidates_missing_one_thing_or_two_25493 )

Do you think the development of this feature is suitable for a Google Summer of Code project? If you think this makes sense then we would need a short description of the project published at http://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects and at least one mentor.

From there we would publish it at https://www.mediawiki.org/wiki/Summer_of_Code_2013#Project_ideas
Comment 32 Quim Gil 2013-10-30 21:55:39 UTC
This proposal is now featured at https://www.mediawiki.org/wiki/Outreach_Program_for_Women/Round_7
Comment 33 Fabrice Florin 2013-12-18 19:33:03 UTC
Moved to Normal, because we do not view this as high priority at this time.
Comment 34 rschen7754.wiki 2014-08-09 22:41:15 UTC
Well, Erik Moller has just sent out an email saying that JS on IE6 will be disabled completely with 1.24wm17.

Does this mean that something could be done with this bug? Or would it be better to shift efforts to Wikidata? (or both, and just have Wikidata link to files on Commons?)
Comment 35 Bawolff (Brian Wolff) 2014-08-10 07:57:14 UTC
Well, something could always have been done; it's just that no one has been willing to spend the time to do it.

The JS announcement doesn't affect the security issues mentioned above.
Comment 36 Jeroen De Dauw 2014-08-10 14:45:23 UTC
The use cases for which I opened the bug are not helped in any way by Wikidata. They also have nothing to do with the WMF. So having Wikidata does not help.
Comment 37 rschen7754.wiki 2014-08-10 16:24:01 UTC
Jeroen De Dauw: see bug 55549, which would solve the big-picture problem with Wikidata.
