Last modified: 2014-02-08 16:43:52 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T8722, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 6722 - Keep original spacing when parsing <math> formulas
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Math
Version: unspecified
Hardware: PC Linux
Importance: Low normal with 2 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 16719
Blocks: 18912
Reported: 2006-07-17 16:38 UTC by Emil Jerabek
Modified: 2014-02-08 16:43 UTC (History)
CC List: 8 users


Description Emil Jerabek 2006-07-17 16:38:36 UTC
Operators like \sin, \log, ... should be separated from their argument by a
space. TeX does this automatically, but texvc puts no space there in HTML mode:
<math>\sin x</math> is rendered as <span class="texhtml">sin<i>x</i></span>,
which looks wrong. It is impossible to work around this problem by inserting an
explicit space, because <math>\sin\,x</math> forces PNG.
Comment 1 JeLuF 2007-06-02 07:13:18 UTC
The problem is that inserting a space in each case is not the right thing, either.
<math>\sin(x)</math> should not be rendered as "sin (x)". 

The fix would be to preserve the original spacing. 

=> Changing summary.
Comment 2 JeLuF 2007-06-02 07:13:35 UTC
*** Bug 9022 has been marked as a duplicate of this bug. ***
Comment 3 JeLuF 2007-06-02 07:15:02 UTC
Text of Bug 9022:
----------------------------------------------------
I'm trying to get the mhchem package working on my mediawiki install, but am
running into a problem with whitespace. 
Mhchem is set up correctly, and I can access it from the command line, but
mediawiki doesn't render anything correctly. After 
further investigation, I found that texvc is stripping the whitespace from each
equation before rendering it.

So when I type:
<math>\ce{H+ + OH- <=>> H2O}</math>
on the wiki

The .tex file is:
\ce {H-+OH-<=>>H2O}

How can I preserve the whitespace so that my equations render correctly?
Comment 4 Emil Jerabek 2007-06-06 14:42:03 UTC
(In reply to comment #1)
> The problem is that inserting a space in each case is not the right thing,
> either.
> <math>\sin(x)</math> should not be rendered as "sin (x)". 
> 
> The fix would be to preserve the original spacing. 
> 
> => Changing summary.
> 

Is it wise? The problem is an incompatibility of texvc HTML rendering with TeX PNG rendering of the same formula. Making the texvc translation sensitive to whitespace would only create *another* incompatibility, and a serious one I suspect: it is likely that loads of <math> tags in Wikipedia rely on the usual TeX rules for ignoring spaces. Arguably, <math>\sin(x)</math> *should* be rendered in HTML as "sin (x)", because that's what already happens in PNG. (Actually, thin spaces would be more appropriate in both cases.)

I do not understand in what sense this bug is related to bug 9022. AFAICS the issue there is that texvc strips whitespace even when the equation is passed to TeX (which is usually harmless, but here it makes a difference because of some macro which uses spaces to split its argument or some such). The presence or absence of spaces on input has no effect whatsoever on TeX processing of $\sin(x)$.
Comment 5 JeLuF 2007-06-07 01:41:58 UTC
<math>\sin(x)</math> is rendered as 

<span class="texhtml">sin(<i>x</i>)</span>

There's no space in front of the (. That's how it should be.

<math>\sin x</math> is rendered as 

<span class="texhtml">sin<i>x</i></span>

with no space between sin and x, which is wrong.

When the original spacing is kept, TeX produces the right output, as would HTML. 
Comment 6 Emil Jerabek 2007-06-07 10:36:04 UTC
(In reply to comment #5)
> When the original spacing is kept, TeX produces the right output, as would
> HTML. 
> 

TeX produces the right output whether the original spacing is kept or not (with a few exceptions), because the TeX typesetting engine ignores spaces in math mode. The fact that there is an ASCII space character in <math>\sin x</math> is absolutely irrelevant as to whether there should be a space in the output. You can find plenty of cases where the expected output is opposite to the \sin situation, e.g. <math>\forall x</math> should be (and is) rendered in HTML with no space, whereas <math>\alpha\le\beta</math> should be (but is not) rendered with two spaces.

People know that TeX behaves like this, hence spaces in source <math> tags are not correlated to expected spaces in the output, and making the HTML translation suddenly preserve the spacing would produce a lot of bogus spaces in existing WP pages and vice versa. Let alone the fact that the space in <math>\forall x</math> above is *required* for syntactical reasons, as <math>\forallx</math> is unparseable.

What really happens is this. The HTML translation tries to emulate TeX as far as possible. It ignores space characters, because TeX ignores space characters. Then it inserts spaces in some places based on the type of the elements, because TeX inserts spaces there: e.g., <math>x=y</math> is rendered as <i>x</i> = <i>y</i>. However, this part of the translation mechanism is *incomplete*: it misses some cases such as <math>x\le y</math> or <math>\sin x</math>. This is the bug, and making the HTML translation sensitive to input space characters is not going to solve it.
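
The mechanism described above can be sketched as a toy translator. This is only an illustration in Python, not texvc's actual code; the token tables and the names tokenize and to_html are hypothetical, and subscripts, superscripts and most commands are left out. Spaces in the input are discarded, and spaces in the output are decided purely by the class of each token. The rule "no space between an operator name and an opening delimiter" follows the expected output given in comment 5.

```python
import re

# Hypothetical token classes, loosely modelled on TeX's atom types.
ORDINARY = {"\\forall": "\u2200", "\\alpha": "\u03b1", "\\beta": "\u03b2"}
OPERATORS = {"\\sin": "sin", "\\cos": "cos", "\\log": "log"}
RELATIONS = {"=": "=", "\\le": "\u2264"}
OPENERS = {"(", "["}

# A control word (backslash + letters) or any single non-space character.
TOKEN = re.compile(r"\\[A-Za-z]+|\S")


def tokenize(tex):
    """Split the formula into tokens; input whitespace is dropped.

    Grouping braces are discarded too, which is a simplification
    made for this sketch only.
    """
    return [t for t in TOKEN.findall(tex) if t not in {"{", "}"}]


def to_html(tex):
    """Translate a tiny subset of TeX math to HTML with class-based spacing."""
    tokens = tokenize(tex)
    out = []
    for i, tok in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok in OPERATORS:
            out.append(OPERATORS[tok])
            # Space after an operator name, except before "(" or "[".
            if nxt is not None and nxt not in OPENERS:
                out.append(" ")
        elif tok in RELATIONS:
            # Relations get a space on both sides.
            out.append(" " + RELATIONS[tok] + " ")
        elif tok in ORDINARY:
            out.append(ORDINARY[tok])
        elif tok.startswith("\\"):
            # E.g. \forallx is a single unknown control word.
            raise ValueError("unknown control word: " + tok)
        elif tok.isalpha():
            out.append("<i>" + tok + "</i>")
        else:
            out.append(tok)
    return "".join(out)


print(to_html("x=y"))                # <i>x</i> = <i>y</i>
print(to_html("\\sin x"))            # sin <i>x</i>
print(to_html("\\sin(x)"))           # sin(<i>x</i>)
print(to_html("\\alpha\\le\\beta"))  # α ≤ β
```

In this sketch <math>\sin x</math> and <math>\sin    x</math> give the same output, and <math>\forallx</math> raises an error instead of being rendered.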

(Caveat: what I say about TeX are facts, whereas what I say about any part of mediawiki is pure speculation based on its observed behaviour.)
Comment 7 Christopher Yeleighton 2007-07-01 19:56:09 UTC
My vote should not be regarded as support for preserving spaces; I fully agree with Emil.
Test case:
<math> A \times B </math>
Got:
''A''&times;''B'' (unreadable)
Should get:
''A''&nbsp;&times;&nbsp;''B''
Operators and predicate symbols in HTML output should get non-breaking spaces on both sides.  Knuth gave a detailed spacing table for cases where various entities meet; such precision is not needed with HTML output but that table should be regarded as a guideline.  
Exceptions: 
\cdot => &sdot;
\suchthat => :&nbsp;
but
\colon => &nbsp;:&nbsp;
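
The rule and its exceptions could be expressed roughly as follows. This is a sketch in Python under my own assumptions, not a patch for texvc: the tables and the function name render are hypothetical, only the commands named in this comment are covered, and unknown commands are simply passed through unchanged.

```python
import re

NBSP = "&nbsp;"

# Operators that get a non-breaking space on both sides by default.
DEFAULT_OPERATORS = {"\\times": "&times;"}

# Exceptions listed above, with their spacing spelled out explicitly.
EXCEPTIONS = {
    "\\cdot": "&sdot;",                 # no surrounding spaces
    "\\suchthat": ":" + NBSP,           # space after only
    "\\colon": NBSP + ":" + NBSP,       # space on both sides
}


def render(tex):
    """Render a tiny subset of TeX math as wiki/HTML text."""
    out = []
    # Input whitespace is ignored; tokens are control words or single characters.
    for tok in re.findall(r"\\[A-Za-z]+|\S", tex):
        if tok in EXCEPTIONS:
            out.append(EXCEPTIONS[tok])
        elif tok in DEFAULT_OPERATORS:
            out.append(NBSP + DEFAULT_OPERATORS[tok] + NBSP)
        elif tok.isalpha():
            out.append("''" + tok + "''")  # wiki italics for variables
        else:
            out.append(tok)  # digits, punctuation, unknown commands
    return "".join(out)


print(render(" A \\times B "))  # ''A''&nbsp;&times;&nbsp;''B''
print(render("a \\cdot b"))     # ''a''&sdot;''b''
```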

Comment 8 Nicholas Longo 2011-05-02 18:24:36 UTC
Some issues raised in this bug, in particular the way texvc handles the HTML spacing of \sin x and \sin(x), have been corrected in r86962.
Comment 9 Brion Vibber 2011-09-13 19:25:34 UTC
r86962 has been provisionally reverted as there are no tests or even ad-hoc examples of what needs to be tested along with the commit.

Spacing in the HTML output *should* be testable in the parser test cases (mathParserTests.txt) but it might need a tweak to adjust the user math rendering preferences for HTML tests.

Looks like this only describes HTML spacing, not the tex output spacing (which is of course harder to compare in an automated way).
Comment 10 Brion Vibber 2011-09-13 22:58:30 UTC
See bug 18912 comment 12: the patches for this seem to break existing usages such as \sin{x}. Needs more thorough testing.
Comment 11 Brion Vibber 2011-09-14 00:56:43 UTC
That's resolved with all the followups correctly applied -- reapplied with tests on trunk in r97034.
Comment 12 Mario 2012-02-12 21:05:22 UTC
I experience a related problem: when converting the formula 
<math>\log_2 N</math> 
to HTML, the result is 
"log&#160;<sub>2</sub><i>N</i>"
i.e. there is a space between the "log" and the subscript 2 that should not be there, while the space between the 2 and the N is missing.
Comment 13 physikerwelt 2014-02-08 16:23:34 UTC
<math>\log_2 N</math> looks fine; HTML rendering is not used for that example.
