Last modified: 2007-07-17 11:45:50 UTC
This is a follow up of bug report 3159 (I erroneously commited as a bug report to the SpecialImport functionnality). Indeed the solution is in 'SpecialExport' and the problem is not in 'SpecialImport'. In order to facilitate the saving ox XML files, exported from mw, in an appropriate way (which is not an immediate operation even on good browsers like firefox, and even for not-too-stupid users...), we should rely on the Content-Disposition HTTP header (at least, optionaly). Everything, including the comments, are provided in my new version of file SpecialExport.php. (Just watch all lines around those including the pattern "// !" in this file) Or, of course, in the SpecialExport.diff file. I'll immediately upload these 2 files.
Created attachment 795 [details] the full SpecialExport.php file proposal (a diff file will also be uploaded)
Created attachment 796 [details] the diff file (-Naurb) for the proposed SpecialExport.php patch
This sounds like it would break the ability to view the XML within Mozilla/Firefox (which is very, very helpful for debugging, experimenting, and exploring the system). I'd prefer a fix in for the corruption bug in the offending browser(s). Can you narrow it down to a particular application, version, or setting?
Created attachment 799 [details] partial re-design of Special Export my re-design of SpecialExport.php + to take care of the content-disposition problem + Brion's remarks to allow viewing in a browser + very small bug fixed (marked // !FIXED) + restructuring the file (modifs are easely found by a grep on "// !" comments)
Created attachment 800 [details] screenshot (english) screenshot of the new design output (in english, however the browser is in french : firefox let you choice between viewing or downloading) (GNU/Linux Ubuntu 5.04 , mediawiki 1.5.104.4 (patch of 1.5beta4) , firefox 1.0.6)
Created attachment 801 [details] screenshot in french (GNU/Linux Ubuntu 5.04 , mediawiki 1.5.104.4 (patch of 1.5beta4) , firefox 1.0.6)
Created attachment 802 [details] diff -Naurb this diff file is rather large, (thus I also uploaded the final .php file)
Created attachment 803 [details] partial re-design of SpecialExport.php (removing a very old (?) phpDoc comment)
Created attachment 804 [details] diif -Naurb quite large (see php file)
Created attachment 805 [details] haDoc comments haDoc comments help the reader to see where a file was hacked and/or how the file is structured. this is NOT a substitute for phpDoc, nor for diff ! I join my haDoc file so that you may easely retrieve what has happened with the file. See http://meta.wikimedia.org/wiki/HaDoc to eventually read more about haDoc comments.
Created attachment 807 [details] diff -Naurb
Created attachment 808 [details] hacker documentation
Created attachment 809 [details] the SpecialExport.php including eveything (but using internally my conventions)
Created attachment 810 [details] just FYI : the script producing the haDoc (hacker documentation) file
This patch contains a lot of extraneous comments, config variables don't match convention, code style doesn't match convention, comment doesn't match phpdoc convention. Use of a float to control behavior is very unusual and not recommended. Usually when one submits a patch to an existing project, one attempts to match the code style of that code. One in particular doesn't rewrite the entire file in some totally different style, renaming all the variables and functions and changing all the indentation and whitespace -- it makes it harder to review, and would make maintenance of the code much much harder as well. It will absolutely not be accepted in this state; please rewrite following the existing coding style if you'd like others to actually be able to review the patch behind all the noise. Note that the xmldispositionfilename parameter appears to be placed into an HTTP header and HTML output without escaping; this may be a security risk.
(Please add wikibugs-l@wikipedia.org to the CC list when assigning a bug.)
Created attachment 814 [details] the patched SpecialExport.php file this is the full file. standard conventions are now strictly respected. However, the file also includes a few comments to the attention of developpers (they are clearly indicated as such) : these comments are intended to explain the choices and should be removed before distribution. I have removed my haDoc comments (which were not 'noise' ;-( ...) and my own naming conventions (underscore for private, lowercase for instance methods, uppercase for static) as requested. Nevertheless, WITHIN the newly created functions and methods, local variables are named according to the hungarian prefix convention (yes, I am a very very old computer engineer from the old C time and the first UNIX systems - and it's probably much too late to change me...). Because, this is only used WITHIN functions, nothing forbid to write a boolean $zCurOnly instead of $curonly.
Created attachment 815 [details] diff -Naurb for Defines.php a few constants must be made available for DefaultSettings, LocalSettings, and SpecialExport. Mind that I replaced my fuzzy parameters (real between 0 and 1) by a few unsigned short values.
Created attachment 816 [details] diff -Naurb for Language.php 2 prompt messages ha
Created attachment 817 [details] diff -Naurb for LanguageFr.php 2 prompt messages have been added + the submit button needs a specific message (the reason is explained as a comment in the SpecialExport.php file) Mind the diff files may not be very accurate regarding line numbers because I just extracted the lines related to SpecialExport (my Language.php and LanguageFr.php already include many other changes which are not related to SpecialExport)
Created attachment 818 [details] diff -Naurb for SpecialExport.php The essential is the content disposition management. Mind that the patch also include a few corrections for very minor bugs. These fixes are documented in the file.
Created attachment 819 [details] diff -Naurb for DefaultSettings.php again, line numbering may have been distrurbed by other patches...
Brion wrote : ''Note that the xmldispositionfilename parameter appears to be placed into an HTTP header and HTML output without escaping; this may be a security risk.'' The filename is not escaped because it is NOT the filename provided by the user. Instead, the filename *suggested* by the user is passed thru a function to generate a valid Unix file name. Unless I made an error, this function should return a value without risk. If I made an error, the function the solution will be to correct this function (but not to escape). Mind that this function (and her sister wfRemoveAccents) may eventually be moved to GlobalFunctions.php if one needs them to generate filenames elsewhere.
Created attachment 820 [details] diff -Naurb for LanguageFr.php (+ correction of an existing typo)
New patch still does not follow coding conventions. wfRemoveAccents() looks unhelpful and far from complete if it were desireable. wfUnixFileName() looks extremely unhelpful, and will make completely illegible useless names for many languages. Note that failing to escape output because at the moment it comes from a function which will produce 'safe' output is an unsafe practice. Many valid Unix filenames would produce JavaScript injections if output directly into HTML. None of this addresses the original question, either: what browsers are affected by this problem to begin with and can they be fixed?
*** Bug 10603 has been marked as a duplicate of this bug. ***
Here's my patch: --- SpecialExport-0.php 2007-07-16 18:11:59.000000000 +0200 +++ SpecialExport.php 2007-07-16 18:10:22.000000000 +0200 @@ -151,6 +151,7 @@ function wfSpecialExport( $page = '' ) { // This should provide safer streaming for pages with history wfResetOutputBuffers(); header( "Content-type: application/xml; charset=utf-8"); + header("Content-Disposition: attachment" ); $pages = explode( "\n", $page ); $db = wfGetDB( DB_SLAVE );
Fixed in r24172. I didn't use the patches above.
verified. Thanks a lot !