Last modified: 2011-11-29 03:21:02 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the pages that display bug reports and their history, links might be broken. See T27753, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 25753 - Mediawiki XML file version 0.4 does not validate against its own DTD file
Status: RESOLVED FIXED
Product: Datasets
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal normal with 1 vote
Target Milestone: ---
Assigned To: Tomasz Finc
Keywords: shell
Depends on:
Blocks:
Reported: 2010-11-02 16:40 UTC by Rodrigo Sampaio Primo
Modified: 2011-11-29 03:21 UTC (History)
CC List: 7 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Patch to fix issues on Mediawiki DTD (548 bytes, patch)
2010-11-02 16:40 UTC, Rodrigo Sampaio Primo
Script used to test the validation (208 bytes, application/unknown)
2010-11-02 16:41 UTC, Rodrigo Sampaio Primo

Description Rodrigo Sampaio Primo 2010-11-02 16:40:46 UTC
Created attachment 7778
Patch to fix issues on Mediawiki DTD

Hi,

I'm trying to validate a MediaWiki XML file against its DTD file (actually an XML Schema, export-0.4.xsd), but the validation fails. I'm using PHP's DOMDocument and haven't tried other tools, so I can't be sure whether the problem is in PHP or in the MediaWiki XML file, but I guess it is more likely to be in MediaWiki.

I'm testing with the attached script (testMediawikiXml.php). When I try to validate the XML from http://en.wikipedia.org/wiki/Special:Export/Train I get the following error:

Element '{http://www.w3.org/2001/XMLSchema}element': The attribute 'name' is required but missing

This error can be fixed by commenting out line 119 of http://www.mediawiki.org/xml/export-0.4.xsd. The content of this line is:

<element minOccurs="0" maxOccurs="1" type="mw:DiscussionThreadingInfo" />

I guess the best solution is to add the "name" attribute, but I haven't investigated and I don't know enough about DTD to say what its value should be.
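
As a rough illustration of this first error, the sketch below (Python, standard library only) scans a schema for element declarations that carry neither a "name" nor a "ref" attribute, which is what the validator complains about. The sample schema fragment and the type name PageType are assumptions made up for the example; only the unnamed element line is taken from the report, and this is not the real export-0.4.xsd.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, cut-down schema for illustration only. "PageType" and the
# "title" element are made up; the unnamed element is the line quoted above.
SAMPLE_XSD = """<schema xmlns="http://www.w3.org/2001/XMLSchema"
        xmlns:mw="http://www.mediawiki.org/xml/export-0.4/"
        targetNamespace="http://www.mediawiki.org/xml/export-0.4/">
  <complexType name="PageType">
    <sequence>
      <element name="title" type="string" />
      <element minOccurs="0" maxOccurs="1" type="mw:DiscussionThreadingInfo" />
    </sequence>
  </complexType>
</schema>"""


def unnamed_elements(xsd_text):
    """Return the attributes of element declarations lacking both 'name' and 'ref'."""
    root = ET.fromstring(xsd_text)
    tag = "{%s}element" % XS
    return [
        dict(el.attrib)
        for el in root.iter(tag)
        if "name" not in el.attrib and "ref" not in el.attrib
    ]


bad = unnamed_elements(SAMPLE_XSD)
for attrs in bad:
    print("element without name/ref:", attrs)
```

Running the same function over the text of a downloaded schema file should list every declaration that would trigger the "attribute 'name' is required but missing" error.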

If I try to run the script again another error occurs:

Element '{http://www.mediawiki.org/xml/export-0.4/}namespace', attribute 'case': The attribute 'case' is not allowed.

To fix this one I have added the following line below line 92:

<attribute name="case" type="string" />
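
The second error can be sketched the same way: compare the attributes actually used on namespace elements in an export with the attributes the schema declares for them. This is only a sketch under assumptions: the type name NamespaceType, the schema fragment, and the sample export below are hypothetical stand-ins, not copies of the real files.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
MW = "{http://www.mediawiki.org/xml/export-0.4/}"

# Hypothetical schema fragment: declares only "key", as before the fix.
SCHEMA = """<schema xmlns="http://www.w3.org/2001/XMLSchema">
  <complexType name="NamespaceType">
    <simpleContent>
      <extension base="string">
        <attribute name="key" type="integer" />
      </extension>
    </simpleContent>
  </complexType>
</schema>"""

# Hypothetical, shortened export that uses the "case" attribute.
EXPORT = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.4/">
  <siteinfo>
    <namespaces>
      <namespace key="0" case="first-letter" />
      <namespace key="1" case="first-letter">Talk</namespace>
    </namespaces>
  </siteinfo>
</mediawiki>"""


def declared_attributes(schema_text, type_name):
    """Attribute names declared inside the named complexType."""
    root = ET.fromstring(schema_text)
    for ctype in root.iter(XS + "complexType"):
        if ctype.get("name") == type_name:
            return {a.get("name") for a in ctype.iter(XS + "attribute")}
    return set()


def used_attributes(export_text, local_name):
    """Attribute names that appear on the given element in the export."""
    root = ET.fromstring(export_text)
    used = set()
    for el in root.iter(MW + local_name):
        used.update(el.attrib)
    return used


undeclared = used_attributes(EXPORT, "namespace") - declared_attributes(SCHEMA, "NamespaceType")
print(sorted(undeclared))
```

With the extra attribute declaration added to the schema fragment, the set of undeclared attributes becomes empty.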

After those two changes to the DTD file I'm able to validate the XML file. I'm attaching the script I'm using to test and a patch with the changes I made to the DTD file. I guess the second change is OK, but the first issue needs to be properly fixed (instead of just commenting out the line).

Thanks, Rodrigo.
Comment 1 Rodrigo Sampaio Primo 2010-11-02 16:41:32 UTC
Created attachment 7779
Script used to test the validation
Comment 2 Roan Kattouw 2010-11-04 10:46:04 UTC
Tomasz, weren't you the one that last messed around with this?
Comment 3 Chad H. 2010-12-29 09:23:19 UTC
These are both fine in trunk and 1.16wmf4.

However, the XSD file in /usr/local/apache/common/docroot/mediawiki/xml needs updating and I'm not sure how to sync files from there to the cluster.

Tweaking to be a shell request bug.
Comment 4 Chad H. 2010-12-30 19:15:48 UTC
File has been sync'd to the servers and purged from squid. Should validate now.
