Last modified: 2008-03-13 19:55:19 UTC
Quotes are not escaped properly in the YAML output, e.g.
http://en.wikipedia.org/w/api.php?action=query&prop=info&titles=%27N_Sync&format=yamlfm

This breaks the YAML parser in Ruby:

    require 'yaml'
    require 'open-uri'
    p YAML::load( open('http://en.wikipedia.org/w/api.php?action=query&prop=info&titles=%27N_Sync&format=yamlfm') )
I have also encountered the following pages, which cause errors in the Ruby YAML parser:

http://en.wikipedia.org/w/api.php?action=query&format=yamlfm&prop=info%7Crevisions%7Ccategories&titles=Lalo%20Schifrin&rvprop=timestamp%7Cids
http://en.wikipedia.org/w/api.php?action=query&format=yamlfm&prop=info%7Crevisions%7Ccategories&titles=Lisa%20Gerrard&rvprop=timestamp%7Cids

In both cases the problem seems to be the formatting of the categories in the YAML output (one category title contains a line break, the other contains a ':').
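For anyone who wants to reproduce the two kinds of failure without hitting the live API, here is a minimal offline sketch. The YAML fragments are hypothetical stand-ins written by hand to imitate unescaped plain-style output; they are not copied from the actual API responses. It also assumes a current Ruby, where the yaml library is backed by Psych and reports parse failures as Psych::SyntaxError.

```ruby
require 'yaml'

# Hand-written fragments imitating unescaped plain-style output.
# These are illustrative assumptions, not real API responses.
samples = {
  'leading quote' => "title: 'N Sync\nns: 0\n",             # quote is never closed
  'colon-space'   => "title: Mission: Impossible\nns: 0\n"  # ': ' inside a plain scalar
}

# Try to parse each fragment and record whether the parser accepted it.
results = samples.transform_values do |text|
  begin
    YAML.load(text)
    :parsed
  rescue Psych::SyntaxError
    :syntax_error
  end
end

results.each { |name, outcome| puts "#{name}: #{outcome}" }
```

Both fragments should be rejected, which matches the behaviour reported for the live pages.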
I took a look at this one, since I also found that titles containing ": " fail to parse in Ruby. According to the YAML 1.0 specification (http://yaml.org/spec/history/2004-01-29/2004-01-29.html#id2569840), " #" and ": " (as well as strings starting with "!!", "[" and some others) are forbidden in the so-called 'plain style' scalar syntax.

Looking at ApiFormatYaml_spyc.php, the function _dumpNode only supports plain style:

    // It's mapped
    $string = $spaces.$key.': '.$value."\n";

This approach is too simplistic to render YAML correctly in some situations. To solve this, _dumpNode needs to be extended with some kind of YAML escaping algorithm for the cases where plain style is not possible.
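To make the idea of an escaping step concrete, here is a rough sketch in Ruby (not PHP, and not the actual spyc code). It falls back to double-quoted style whenever plain style looks unsafe. The helper names needs_quoting?, quote and dump_quoted_pair are made up for this illustration, and the list of indicator characters is a conservative guess rather than a complete reading of the specification. Strings that merely look like other types (for example "true" or "123") are not handled.

```ruby
require 'yaml'

# Characters that should not start a plain scalar (conservative, assumed list).
INDICATORS = '-?:,[]{}#&*!|>\'"%@`'

# Escapes applied inside a double-quoted scalar.
ESCAPES = { '\\' => '\\\\', '"' => '\\"', "\n" => '\\n', "\r" => '\\r', "\t" => '\\t' }.freeze

# True when the value is unsafe to emit in plain style.
def needs_quoting?(value)
  return true if value.empty?
  return true if value != value.strip            # leading or trailing whitespace
  return true if INDICATORS.include?(value[0])   # starts with an indicator character
  return true if value.include?(': ') || value.include?(' #')
  return true if value.end_with?(':')
  return true if value =~ /[\n\r\t]/
  false
end

# Wrap the value in double quotes, escaping backslashes, quotes and control characters.
def quote(value)
  '"' + value.gsub(/[\\"\n\r\t]/, ESCAPES) + '"'
end

# Render one "key: value" line, quoting only when plain style is not safe.
def dump_quoted_pair(key, value, indent = 0)
  rendered = needs_quoting?(value) ? quote(value) : value
  "#{' ' * indent}#{key}: #{rendered}\n"
end

doc = dump_quoted_pair('title', "'N Sync") +
      dump_quoted_pair('subtitle', 'Mission: Impossible')
puts doc
p YAML.load(doc)
```

Double-quoted style is used here only because it keeps every value on a single line; literal or single-quoted style would be equally valid choices.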
I've committed a fix in r31927 which (hopefully, don't have a YAML parser handy) fixes this issue. Requesting api.php?action=query&prop=info&titles=Main_Page|Talk:Main_Page now results in what I hope is correct YAML (those with YAML parsers, please test!). Note the difference between:

    title: Main Page

and:

    title: |
      Talk:Main Page

The entire YAML output of the sample request is at the end of this message for completeness's sake. The criteria I used are:

* If the string contains newlines, use literal syntax (with the | character and all that) (was already present)
* If the string starts with : or # use literal syntax
* If the string starts with any of - ? , [ ] { } ! * & | > ' " % @ ` also use literal syntax
* In all other situations, use plain syntax (folded if the string is longer than 40 characters)

YAML CODE STARTS HERE
---
query:
  normalized:
    - from: Main_Page
      to: Main Page
    - from: |
        Talk:Main_Page
      to: |
        Talk:Main Page
  pages:
    - pageid: 54
      ns: 0
      title: Main Page
      touched: |
        2008-03-06T17:36:33Z
      lastrevid: 440
      counter: 86
      length: 76
    - pageid: 12
      ns: 1
      title: |
        Talk:Main Page
      touched: |
        2008-03-11T15:09:07Z
      lastrevid: 448
      counter: 64
      length: 173
YAML CODE ENDS HERE
(In reply to comment #3)
> * If the string starts with : or # use literal syntax

That should be: "If the string *contains* : or #" (good catch, Loek)
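For those testing with a parser, the criteria from comment #3 with the correction above can be sketched as follows. This is Ruby written for illustration, not the PHP that was committed, and the helper names literal? and dump_literal_pair are hypothetical. Two simplifications are assumptions of this sketch: folding of plain strings longer than 40 characters is left out, and the literal block uses |- (strip chomping) so that a value round-trips exactly, whereas a bare | keeps one trailing line break when parsed. Empty strings and values with leading whitespace are not handled.

```ruby
require 'yaml'

# Characters that trigger literal style when they start a string
# (the list given in comment #3).
LEADING = '-?,[]{}!*&|>\'"%@`'

# Literal style is chosen when the string contains a newline,
# contains : or # anywhere, or starts with one of the LEADING characters.
def literal?(value)
  value.include?("\n") ||
    value.include?(':') || value.include?('#') ||
    (!value.empty? && LEADING.include?(value[0]))
end

# Render one mapping entry in plain or literal style.
def dump_literal_pair(key, value, indent = 0)
  pad = ' ' * indent
  return "#{pad}#{key}: #{value}\n" unless literal?(value)

  # Indent every content line two spaces deeper than the key.
  body = value.lines.map { |line| "#{pad}  #{line.chomp}\n" }.join
  "#{pad}#{key}: |-\n#{body}"
end

doc = dump_literal_pair('title', 'Talk:Main Page') +
      dump_literal_pair('touched', '2008-03-06T17:36:33Z')
puts doc
p YAML.load(doc)
```

Because the timestamp contains ':', it is emitted in literal style and comes back from the parser as a string, which is consistent with the touched values in the sample output.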