Last modified: 2009-07-30 13:14:08 UTC

Bug 19062 - Escaped HTML-entities do not round-trip when using forms
Product: MediaWiki extensions
Classification: Unclassified
Component: SemanticForms
Hardware: All All
Importance: Normal normal
Assigned To: Yaron Koren
Reported: 2009-06-03 07:53 UTC by Markus Krötzsch
Modified: 2009-07-30 13:14 UTC

Description Markus Krötzsch 2009-06-03 07:53:51 UTC
The above example URL shows a wiki where XML-documents are managed using templates and forms. XML uses many characters that also have special meaning in MediaWiki, but most of those do not need any escaping, since they do not occur in contexts that MW recognizes. There are two exceptions:

* Comments in XML are written as in MediaWiki. To prevent MW from interpreting them, it is necessary to write "&lt;!--" on the wiki page instead of "<!--".

* XML entities are unescaped by MediaWiki during parsing. So to get an escaped entity like "&lt;" into a semantic property value, one needs to write "&amp;lt;" in the input.

Both encodings work in MW and SMW. (A more general but less user-friendly strategy would be to escape all <, >, ", ... in the input fields. This would be more systematic, but most cases work pretty well without such escapes.)

Now the problem is that, when editing with SF, the text that is loaded into the form does not contain the original entities: it shows "<!--" where the page included "&lt;!--", and it shows "&lt;" where the page included "&amp;lt;". Either SF does too much unescaping, or it simply passes the literal escaped text on to the browser, which unescapes it for display. In either case, "edit with form" changes the page contents even if the user does not modify the form contents at all.
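
The second suspected mechanism can be illustrated outside of MediaWiki. The following is only a minimal sketch in Python, assuming (this is not confirmed from the SF source) that the page text is copied into a textarea without HTML-escaping. Python's `html.unescape` stands in for the browser decoding character references once, and all function names are hypothetical.

```python
import html

OPEN, CLOSE = "<textarea>", "</textarea>"

def render_textarea(page_text: str, escape: bool) -> str:
    """Build the HTML for a form field holding page_text.

    escape=False models the suspected bug (raw wikitext copied into the
    markup); escape=True models the expected behaviour.
    """
    body = html.escape(page_text, quote=False) if escape else page_text
    return OPEN + body + CLOSE

def browser_value(markup: str) -> str:
    """Approximate the value a browser would display and submit.

    html.unescape is an assumption standing in for the single decoding
    pass a browser applies to textarea content.
    """
    inner = markup[len(OPEN):-len(CLOSE)]
    return html.unescape(inner)

def round_trip(page_text: str, escape: bool) -> str:
    """Load page_text into the form and read it back unchanged."""
    return browser_value(render_textarea(page_text, escape))

for original in ["&lt;!-- note --&gt;", "&amp;lt;"]:
    print(repr(original),
          "| unescaped field:", repr(round_trip(original, escape=False)),
          "| escaped field:", repr(round_trip(original, escape=True)))
```

In this model, only the escaped variant returns the text that was on the page; the unescaped variant loses exactly one level of entity encoding, which matches the symptom described above.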

Tested on FF 3.0.10, MW 1.14alpha, SMW 1.5e-SVN, SF 1.6.
Comment 1 Markus Krötzsch 2009-06-03 08:15:31 UTC
The problem also occurs for #-style escapes as required for working around Bug 19063. In other words, the workaround, after being implemented by a knowledgeable user, will be destroyed as soon as the page is edited by someone who is not aware of this, or who forgets to re-encode one occurrence of the escapes after editing.
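
The same loss can be shown for numeric character references. This is a small sketch that again uses Python's `html.unescape` as a stand-in for a single decoding pass; the sample string is made up for illustration and is not taken from Bug 19063.

```python
import html

# Hypothetical page text using a decimal and a hexadecimal character reference.
page_text = "a &#124; b &#x7C; c"

# One decoding pass, as would happen if the text reached the form unescaped.
shown_in_form = html.unescape(page_text)

# Saving the form unchanged would then write the decoded text back to the page.
print(repr(shown_in_form))
print("round-trips:", shown_in_form == page_text)
```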
Comment 2 Yaron Koren 2009-06-25 22:35:32 UTC
This is fixed in version 1.7.2.
Comment 3 Markus Krötzsch 2009-07-05 12:55:19 UTC
I cannot confirm the fix after upgrading to SF 1.7.3. Using "edit with form" on the example pages below (using HTML and hex entities, respectively) still shows the expanded characters, and round-tripping fails when saving.

Example pages:
Comment 4 Yaron Koren 2009-07-26 19:11:38 UTC
Okay, this time (in version 1.8) I think it's *really* fixed. Feel free to re-open again if not, though.
Comment 5 Markus Krötzsch 2009-07-30 13:14:08 UTC
Confirmed: the bug seems to be fixed now. Very nice!
