Last modified: 2014-05-08 22:27:06 UTC
The Special:Export page generates an XML document without an XML prolog. The HTTP header Content-Type is set to "application/xml;charset=UTF-8", so the encoding is known at request time, but if the result is persisted, the HTTP header is lost and a prolog must be added to record the encoding. Duplicating the encoding information in the HTTP headers and the content isn't unique to XML; for HTML, it's a suggested practice to set the content type both in the HTTP header and in a meta element in the document. Additionally, adding the prolog will make the XML version explicit.
The prolog would just be as follows.
<?xml version="1.0" encoding="UTF-8"?>
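Until the export emits the prolog itself, a client that saves the response can add it. The following is only a minimal sketch under stated assumptions: the helper name with_prolog is hypothetical, the body is assumed to be the raw UTF-8 bytes of the response, and nothing here reflects how Special:Export is implemented.

```python
import re

# The declaration proposed in this report, followed by a newline.
XML_DECLARATION = b'<?xml version="1.0" encoding="UTF-8"?>\n'
UTF8_BOM = b"\xef\xbb\xbf"


def with_prolog(body: bytes) -> bytes:
    """Return body with an XML declaration prepended, unless it already has one.

    Hypothetical helper: assumes body is UTF-8 encoded XML.
    """
    # Drop a leading BOM and whitespace so the declaration is the very
    # first thing in the saved file.
    if body.startswith(UTF8_BOM):
        body = body[len(UTF8_BOM):]
    body = body.lstrip()
    # "<?xml" followed by whitespace is an XML declaration; other
    # processing instructions such as "<?xml-stylesheet" are not.
    if re.match(rb"<\?xml\s", body):
        return body
    return XML_DECLARATION + body


# Example with a made-up export body.
saved = with_prolog(b"<mediawiki><page><title>Example</title></page></mediawiki>")
```

Calling the helper on output that already carries a declaration returns it unchanged, so it would remain safe to use if the export started emitting the prolog.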
IIRC, a prologue is only required if the encoding is *not* UTF-8. Since the encoding *is* UTF-8, which is XML's default, it is not needed.
(In reply to comment #1)
> IIRC, a prologue is only required if the encoding is *not* UTF-8. Since the
> encoding *is* UTF-8, which is XML's default, it is not needed.
XML 1.0 says "SHOULD" - http://www.w3.org/TR/2006/REC-xml-20060816/#sec-prolog-dtd.
XML 1.1 says "MUST" - http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-prolog-dtd.
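For what it's worth, the XML 1.0 behaviour discussed above can be checked against one parser. This is a small sketch using Python's standard xml.etree.ElementTree with a made-up document; it only shows how that one parser behaves and says nothing about other consumers or about XML 1.1 parsers.

```python
import xml.etree.ElementTree as ET

# The same made-up document, without and with the XML declaration.
undeclared = "<page><title>Café</title></page>".encode("utf-8")
declared = b'<?xml version="1.0" encoding="UTF-8"?>\n' + undeclared

# Without a declaration and without external encoding information,
# the parser reads the bytes as UTF-8.
title_a = ET.fromstring(undeclared).findtext("title")
title_b = ET.fromstring(declared).findtext("title")

print(title_a == title_b)
```

Both forms parse to the same title here, which is consistent with the declaration being optional for UTF-8 in XML 1.0.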
*** Bug 65031 has been marked as a duplicate of this bug. ***
This bug also causes the API response to omit the XML prolog.
The impression that I get is that nobody really uses XML 1.1; our output is XML 1.0.
"The main changes are to enable the use of line-ending characters used on EBCDIC platforms, and the use of scripts and characters absent from Unicode 3.2. XML 1.1 is not very widely implemented and is recommended for use only by those who need its unique features."
"Everything you need to know about XML 1.1 can be summed up in two rules:
Don't use it.
(For experts only) If you speak Mongolian, Yi, Cambodian, Amharic, Dhivehi, Burmese or a very few other languages and you want to write your markup (not your text but your markup) in these languages, then you can set the version attribute of the XML declaration to 1.1. Otherwise, refer to rule 1."
So? Be explicit that it isn't being used. Then you comply with strict XML 1.1 parsers and with the robustness principle.
If the argument is that it could break existing consumers, that would be a valid argument and is something that could be explored.