Last modified: 2013-09-04 12:33:20 UTC
The issue I am facing is that every bare ampersand "&" in page content is replaced by its HTML entity, &amp;. Even when raw HTML is allowed ($wgRawHtml=true) and the content is wrapped in <html></html> tags, the ampersands are replaced.

I would like to keep the bare &, because users should be able to add JavaScript to their pages. With this replacement, an & used as a logical operator is corrupted, and the JavaScript breaks.

Here is a small example that shows the problem. Just add this content to a page and check the source.

---------------------
<html>
ampersand : &amp;
pure ampersand: & (should not be replaced)
</html>
----------------------

Is there a solution to this problem, or will it be fixed in the next version?

Thank you very much!
Sure, &amp; should work correctly in JavaScript, just as it does in URLs. The XML parser is supposed to replace it with & before passing it on to the JavaScript parser or anything else. It really doesn't work? Try using <![CDATA[ ... ]]> around your JavaScript.
(In reply to comment #1)
> Sure, &amp; should work correctly in JavaScript, just as it does in URLs. The
> XML parser is supposed to replace it with & before passing it on to the
> JavaScript parser or anything else. It really doesn't work? Try using
> <![CDATA[ ... ]]> around your JavaScript.

I can confirm that this doesn't work correctly (tested on a private wiki with the <html> tag enabled, MediaWiki version 1.9.0).

Test code:

<html>
<script>
// <[CDATA[
alert("Testing the & sign");
// ]]>
</script>
</html>

Output in page's HTML:

<script>
// <[CDATA[
alert("Testing the &amp; sign");
// ]]>
</script>

and the script displays the message "Testing the &amp; sign" in the alert box that comes up.

I've actually written scripts inside HTML tags on that wiki, and it's been a pain having to express a&&b as !(!a||!b)...
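For anyone stuck with the same workaround, here is a minimal sketch of the De Morgan rewrite mentioned above, checked against the native operator. The helper name `and` and the `samples` list are made up for illustration and are not part of MediaWiki. The check itself uses `&&`, so it is meant to be run outside the wiki (for example in Node).

```javascript
// Hypothetical helper, not part of MediaWiki: logical AND written without "&".
// By De Morgan's law, !(!a || !b) has the same truthiness as (a && b).
function and(a, b) {
  return !(!a || !b);
}

// Compare against the native operator over a mix of truthy and falsy inputs.
const samples = [true, false, 0, 1, "", "x", null, undefined];
const mismatches = [];
for (const a of samples) {
  for (const b of samples) {
    if (and(a, b) !== Boolean(a && b)) {
      mismatches.push([a, b]);
    }
  }
}
console.log(mismatches.length === 0 ? "equivalent" : mismatches);
```

One caveat: the rewrite always yields a boolean, whereas `a && b` yields one of its operands, so it is only a drop-in replacement inside conditions.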
The <![CDATA[ ... ]]> would be used to allow a raw & in the source to pass the XML parser correctly. It would ensure that &amp; is interpreted as &amp; *instead of* the & it otherwise would be.

Note that in HTML 4, <script> contents are defined as CDATA already, which is why common browsers already handle it that way. In pure XHTML, this wouldn't be implied, which is why we make it explicit in our own output.

Since various processing is done on the output code even after the <html> sections are done, it may not currently be possible to get nice 'clean' output of this sort.
(mid-air collision) Right, right, of course <![CDATA[ will just muck things up further, since the MW parser doesn't recognize it. But it's an error to output a literal &, in <script> or elsewhere, that doesn't begin an entity. It should work correctly as &amp;, as far as I can tell. Unfortunately, testing in Firefox, it does not. <![CDATA[ seems to be the only way to get this to work, so for this to function correctly MediaWiki would have to either insert <![CDATA[ ... ]]> intelligently inside <script> and maybe <style> tags, and not HTML-escape those; or else just not clean them at all. Is it Tidy doing the cleaning, or the Sanitizer? Does the entity get replaced even with Tidy off?
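To make the "don't HTML-escape <script> and <style> bodies" option concrete, here is a rough sketch in JavaScript. It is not MediaWiki's Sanitizer or Tidy, and the names `TOKEN` and `escapeBareAmpersands` are invented for this example. It assumes script and style elements are not nested and that a regex is good enough to find them, which a real parser should not rely on.

```javascript
// Hypothetical sketch, NOT MediaWiki's Sanitizer or Tidy.
// Matches either a whole <script>/<style> element (left untouched) or a
// bare "&" that does not start a named or numeric entity.
const TOKEN =
  /<(script|style)\b[^>]*>[\s\S]*?<\/\1\s*>|&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#[0-9]+|#[xX][0-9a-fA-F]+);)/gi;

function escapeBareAmpersands(html) {
  // A match of exactly "&" is a bare ampersand; anything else is a whole
  // script/style element, which is passed through unchanged.
  return html.replace(TOKEN, (match) => (match === "&" ? "&amp;" : match));
}

const input = "a & b &amp; c <script>if (x && y) go();</script> d & e";
console.log(escapeBareAmpersands(input));
```

This only covers the "leave them alone" half of the suggestion. Output meant to be well-formed XHTML would additionally need the script body wrapped in a CDATA section.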
On http://en.wikipedia.org/wiki/Special:Watchlist, I get a javascript error every time I refresh due specifically to this bug. The following line is in the header section <script type="text/javascript" src="http://en.wikipedia.org/w/index.php?title=-&action=raw&gen=js&useskin=monobook"><!-- site js --></script> Because the ampersands are not handled correctly, that line returns an html text page instead of the expected javascript. This has just started happening in the last day or so.
(In reply to comment #5)
> On http://en.wikipedia.org/wiki/Special:Watchlist, I get a javascript error
> every time I refresh due specifically to this bug. The following line is in the
> header section
>
> <script type="text/javascript"
> src="http://en.wikipedia.org/w/index.php?title=-&action=raw&gen=js&useskin=monobook"><!--
> site js --></script>
>
> Because the ampersands are not handled correctly, that line returns an html
> text page instead of the expected javascript.
>
> This has just started happening in the last day or so.

No, that's normal. This bug is about & being replaced with &amp; in the <script> body, not in the src attribute value. For example:

<html><script type="text/javascript">if(skin && stylepath) alert('woo')</script></html>

will break, whereas

<html><script type="text/javascript" src="http://en.wikipedia.org/w/index.php?title=MediaWiki:Common.js/watchlist.js&action=raw&ctype=text/javascript"></script></html>

will correctly escape the & to &amp;, and the browser expects and understands this.

What error are you getting exactly? That gen=js script appears on every page load, not just watchlists, and is what loads MediaWiki:Common.js and MediaWiki:SKINNAME.js (probably Monobook). http://en.wikipedia.org/wiki/MediaWiki:Common.js/watchlist.js is loaded just on the watchlist page, so possibly an error there.
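A small sketch of why the escaped form is harmless in the src attribute: the HTML parser decodes &amp; back to & before the URL is requested, so the server sees the intended query parameters. Here `decodeAmp` is a made-up stand-in for that decoding step (it handles only `&amp;`), and the URL is the gen=js one quoted above.

```javascript
// Hypothetical stand-in for the HTML parser's attribute decoding;
// handles only "&amp;", which is all this example needs.
function decodeAmp(attributeValue) {
  return attributeValue.replace(/&amp;/g, "&");
}

// The src value as it appears in the page's HTML source.
const srcInMarkup =
  "http://en.wikipedia.org/w/index.php?title=-&amp;action=raw&amp;gen=js&amp;useskin=monobook";

// What the browser requests after decoding the attribute value.
const requested = new URL(decodeAmp(srcInMarkup));
// What would be requested if the attribute were taken literally.
const undecoded = new URL(srcInMarkup);

console.log(requested.searchParams.get("action"));
console.log(undecoded.searchParams.get("action"));
```

A <script> body gets no such decoding step in HTML 4, which is why the same escaping breaks inline code but not the src URL.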
The exact error is

Line: 8
Char: 2
Error: Expected identifier, string or number
Code: 0
URL: http://en.wikipedia.org/wiki/Special:Watchlist

I got to the ampersands by saving the html locally and debugging one line at a time. However, I checked (as suggested) and other pages which do not produce errors have the same line. I tried a more complete test (I left in all the code). Running from my hard drive, I first converted the relative links to absolute links. Now, there are 2 errors. Commenting out the line I indicated above stopped them both. However, the errors are different when running locally, so it appears that I was wrong.

Line: 2
Char: 1
Error: invalid character
Code: 0
URL: file://path to my test case

BTW, I am running IE 6.
(In reply to comment #7)
> The exact error is
> Line: 8
> Char: 2
> Error: Expected identifier, string or number
> Code: 0
> URL: http://en.wikipedia.org/wiki/Special:Watchlist
> I got to the ampersands by saving the html locally and debugging one line at a
> time. However, I checked (as suggested) and other pages which do not produce
> errors have the same line. I tried a more complete test (I left in all the
> code). Running from my hard drive, I first converted the relative links to
> absolute links. Now, there are 2 errors. Commenting out the line I indicated
> above stopped them both. However, the errors are different when running
> locally, so it appears that I was wrong.
> Line: 2
> Char: 1
> Error: invalid character
> Code: 0
> URL: file://path to my test case
> BTW, I am running IE 6.

The problem went away today at 2pm; I last saw it at 6am.
I have hit this bug on wikimediafoundation.org. I was trying to use logical and "&&" in my JavaScript, and the parser changed both ampersands, giving "&amp;&amp;".
Bump, this caused an issue again on wikimediafoundation.org. Makes code really annoying to write.