Last modified: 2010-08-30 10:00:16 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T26234, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 24234 - xml validation of generated rdf fails at !ENTITY declaration because of incorrect url coding in localized name of "Special"
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Semantic MediaWiki
Version: unspecified
Hardware: All All
Importance: Normal normal with 1 vote
Target Milestone: ---
Assigned To: Markus Krötzsch
Depends on:
Blocks:

Reported: 2010-07-02 14:32 UTC by Zoltán Baráti
Modified: 2010-08-30 10:00 UTC

Description Zoltán Baráti 2010-07-02 14:32:14 UTC
MediaWiki 1.15.4
PHP 5.2.6-1+lenny8 (apache2handler)
MySQL 5.0.51a-24

Semantic MediaWiki (version: 1.5.0_0)

Steps to reproduce:
1. Install MediaWiki and set Hungarian as the language of the wiki (this is the important part).
2. Install the Semantic MediaWiki extension.
3. Export the RDF of the main page (I do it with the "Semantic Radar" Firefox extension):

<link rel="alternate" type="application/rdf+xml" title="Speciális:Névjegy" href="/Z/index.php?title=Speci%C3%A1lis:ExportRDF/Speci%C3%A1lis:N%C3%A9vjegy&amp;xmlmime=rdf" />
=====================================
Result:
XML validation of the generated RDF fails.

Details:
------------------
Fatal error messages:
FatalError: The parameter entity reference "%C3;" must end with the ';' delimiter.[Line = 7, Column = 56]
-------------------------------------------------
The original RDF/XML document:

1: <?xml version="1.0" encoding="UTF-8"?>
2: <!DOCTYPE rdf:RDF[
3: 	<!ENTITY rdf 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'>
4: 	<!ENTITY rdfs 'http://www.w3.org/2000/01/rdf-schema#'>
5: 	<!ENTITY owl 'http://www.w3.org/2002/07/owl#'>
6: 	<!ENTITY swivt 'http://semantic-mediawiki.org/swivt/1.0#'>
7: 	<!ENTITY wiki 'http://zolta.homelinux.org/Z/w/Speci%C3%A1lis:URIResolver/'>
8: 	<!ENTITY property 'http://zolta.homelinux.org/Z/w/Speci%C3%A1lis:URIResolver/Property-3A'>
9: 	<!ENTITY wikiurl 'http://zolta.homelinux.org/Z/w/'>
10: ]>
11: 
12: <rdf:RDF
13: 	xmlns:rdf="&rdf;"
14: 	xmlns:rdfs="&rdfs;"
15: 	xmlns:owl ="&owl;"
16: 	xmlns:swivt="&swivt;"
17: 	xmlns:wiki="&wiki;"
18: 	xmlns:property="&property;">
19: 	<!-- Ontology header -->
20: 	<owl:Ontology rdf:about="&wikiurl;Speci%C3%A1lis:ExportRDF/KezdQlap">
21: 		<swivt:creationDate rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2010-07-02T16:18:23+02:00</swivt:creationDate>
22: 		<owl:imports rdf:resource="http://semantic-mediawiki.org/swivt/1.0" />
23: 	</owl:Ontology>
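
The failure can be reproduced outside the wiki. The following is a minimal sketch, not SMW code: it assumes PHP with the DOM extension, and the helper names isWellFormed and makeDocument as well as the reduced <root> document are made up for illustration. It checks a cut-down header containing only the wiki entity from line 7, once as exported and once with each % written as the character reference &#37;.

```php
<?php
// Hypothetical helper: true if $xml is well-formed according to libxml2.
function isWellFormed( string $xml ): bool {
	// Collect parse errors internally instead of emitting PHP warnings.
	$previous = libxml_use_internal_errors( true );
	$doc = new DOMDocument();
	$ok = $doc->loadXML( $xml );
	libxml_clear_errors();
	libxml_use_internal_errors( $previous );
	return $ok;
}

// Hypothetical reduced document: one entity declaration plus one reference to it.
function makeDocument( string $entityValue ): string {
	return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" .
		"<!DOCTYPE root [\n" .
		"\t<!ENTITY wiki '" . $entityValue . "'>\n" .
		"]>\n" .
		"<root about=\"&wiki;\"/>\n";
}

// Entity value as exported (raw "%") and with "%" written as "&#37;".
$asExported = makeDocument( 'http://zolta.homelinux.org/Z/w/Speci%C3%A1lis:URIResolver/' );
$escaped = makeDocument( 'http://zolta.homelinux.org/Z/w/Speci&#37;C3&#37;A1lis:URIResolver/' );

// The raw form should be rejected, because "%C3" starts a parameter entity
// reference that is never closed with ";". The escaped form should parse.
var_dump( isWellFormed( $asExported ) );
var_dump( isWellFormed( $escaped ) );
```

The exact error text depends on the parser; the message quoted above comes from the validator used for this report, and libxml2 words it differently.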
Comment 1 Zoltán Baráti 2010-07-02 15:29:17 UTC
EXPECTED

URLs in ENTITY declarations should be encoded with XML character references, e.g. á should be encoded as &#225;

Lines 7 and 8 should be:
------------------------
7:     <!ENTITY wiki 'http://zolta.homelinux.org/Z/w/Speci&#225;lis:URIResolver/'>
8:     <!ENTITY property 'http://zolta.homelinux.org/Z/w/Speci&#225;lis:URIResolver/Property-3A'>
===================================
Questions:
1. Where in the code jungle is the line we need to change?
2. Which PHP function does this encoding?
Comment 2 Zoltán Baráti 2010-07-02 15:40:32 UTC
I guess the problem is in

$IP/extensions/SemanticMediaWiki/includes/export/SMW_Exporter.php

-------------------------
Maybe this line:
SMWExporter::encodeURI(urlencode(str_replace(' ', '_', $wgContLang->getNsText(SMW_NS_PROPERTY) . ':')));
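
To see where the % characters come from, here is a small sketch using only PHP's built-in urlencode(). It does not call any SMW code, and whether the line quoted above is really what produces the exported namespace URI remains a guess that this sketch does not confirm. The string uses the \u{E1} escape (PHP 7 or later) for á so that the result does not depend on the file encoding.

```php
<?php
// "Speciális" is the Hungarian name of the "Special" namespace.
$localized = "Speci\u{E1}lis";

// urlencode() percent-encodes the two UTF-8 bytes of "á" (0xC3 0xA1).
$encoded = urlencode( $localized );

echo $encoded, "\n"; // prints: Speci%C3%A1lis
```

The result is a perfectly valid URL fragment; it only becomes a problem once it is placed inside an ENTITY declaration.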
Comment 3 Karima Rafes 2010-08-12 00:52:05 UTC
You can check your RDF export with this form:

http://www.w3.org/RDF/Validator/
Comment 4 Karima Rafes 2010-08-12 08:26:13 UTC
	
I found a quick fix.
I added str_replace('%','&#37;',URL) to the function printHeader in the file includes/export/SMW_OWLExport.php:

protected function printHeader( $ontologyuri = '' ) {
		global $wgContLang;

		$this->pre_ns_buffer .=
			"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" .
			"<!DOCTYPE rdf:RDF[\n" .
			"\t<!ENTITY rdf '"   . SMWExporter::expandURI( '&rdf;' )   .  "'>\n" .
			"\t<!ENTITY rdfs '"  . SMWExporter::expandURI( '&rdfs;' )  .  "'>\n" .
			"\t<!ENTITY owl '"   . SMWExporter::expandURI( '&owl;' )   .  "'>\n" .
			"\t<!ENTITY swivt '" . SMWExporter::expandURI( '&swivt;' ) .  "'>\n" .
			// A note on "wiki": this namespace is crucial as a fallback when it would be illegal to start e.g. with a number. In this case, one can always use wiki:... followed by "_" and possibly some namespace, since _ is legal as a first character.
			"\t<!ENTITY wiki '"  . str_replace('%','&#37;',SMWExporter::expandURI( '&wiki;' )) .  "'>\n" .
			"\t<!ENTITY property '" . str_replace('%','&#37;',SMWExporter::expandURI( '&property;' )) .  "'>\n" .
			"\t<!ENTITY wikiurl '" . str_replace('%','&#37;',SMWExporter::expandURI( '&wikiurl;' )) .  "'>\n" .
			"]>\n\n" .
			"<rdf:RDF\n" .
			"\txmlns:rdf=\"&rdf;\"\n" .
			"\txmlns:rdfs=\"&rdfs;\"\n" .
			"\txmlns:owl =\"&owl;\"\n" .
			"\txmlns:swivt=\"&swivt;\"\n" .
			"\txmlns:wiki=\"&wiki;\"\n" .
			"\txmlns:property=\"&property;\"";
		$this->global_namespaces = array( 'rdf' => true, 'rdfs' => true, 'owl' => true, 'swivt' => true, 'wiki' => true, 'property' => true );

		$this->post_ns_buffer .=
			">\n\t<!-- Ontology header -->\n" .
			"\t<owl:Ontology rdf:about=\"$ontologyuri\">\n" .
			"\t\t<swivt:creationDate rdf:datatype=\"http://www.w3.org/2001/XMLSchema#dateTime\">" . date( DATE_W3C ) . "</swivt:creationDate>\n" .
			"\t\t<owl:imports rdf:resource=\"http://semantic-mediawiki.org/swivt/1.0\" />\n" .
			"\t</owl:Ontology>\n" .
			"\t<!-- exported page data -->\n";
	}
Comment 5 Markus Krötzsch 2010-08-30 10:00:16 UTC
I have now implemented the above "quick fix" which I think is the proper way to solve the problem. The symbol % that is correctly used in the problematic URLs has a special meaning in XML ENTITY declarations and must be escaped in this way. The fix will be released with SMW 1.5.2.
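
The escaping can be looked at in isolation. This is only a sketch: the helper name escapeForEntityValue is hypothetical and not part of SMW, where the str_replace call is applied inline in printHeader as shown in comment 4.

```php
<?php
// Hypothetical helper, not part of SMW: write "%" as the character reference
// "&#37;" so that a percent-encoded URI can be used as the literal value of
// an <!ENTITY ...> declaration.
function escapeForEntityValue( string $uri ): string {
	return str_replace( '%', '&#37;', $uri );
}

$uri = 'http://zolta.homelinux.org/Z/w/Speci%C3%A1lis:URIResolver/';
$declaration = "<!ENTITY wiki '" . escapeForEntityValue( $uri ) . "'>";

echo $declaration, "\n";
// prints: <!ENTITY wiki 'http://zolta.homelinux.org/Z/w/Speci&#37;C3&#37;A1lis:URIResolver/'>
```

An XML parser turns each &#37; back into a literal % when it reads the declaration, so &wiki; still expands to the original percent-encoded URI. The sketch handles only %; an & or a quote character in the URI would need separate treatment.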
