Last modified: 2013-07-16 22:05:27 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in and, except for displaying bug reports and their history, links might be broken. See T53381, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 51381 - Allow semantic description of <math> formula
Status: RESOLVED INVALID
Product: MediaWiki extensions
Classification: Unclassified
Component: Math
Version: unspecified
Hardware: All All
Importance: Unprioritized enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Reported: 2013-07-15 18:14 UTC by Richard Morris
Modified: 2013-07-16 22:05 UTC (History)

Description Richard Morris 2013-07-15 18:14:51 UTC
On bug 36496 physikerwelt writes:

I think it is really important to have a stable, secure, and long term
supported way of math rendering.
While working on integrating LaTeXML, a rendering engine that converts tex to
MathML, I ran into a couple of issues.
First, Wikipedia uses texvc and not tex, so I had to create a list of
special user-defined commands. MathJax has such a list as well.
To my mind this is a suboptimal solution, especially with regard to long-term
support. I propose to come up with a grammar in a format that can be used by a
wide audience (e.g. antlr) and to convert it into native php code. This grammar
would convert texvc to tex and eliminate all commands that are not allowed.
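
A minimal sketch of the "convert texvc to tex and eliminate disallowed commands"
idea, in Python rather than php for brevity. The regular expression is only a
stand-in for the grammar proposed above, and both tables (TEXVC_REWRITES and
ALLOWED) are small illustrative assumptions, not the real texvc command list.

```python
import re

# Illustrative subset only: texvc-specific macros rewritten to standard LaTeX.
TEXVC_REWRITES = {
    r"\R": r"\mathbb{R}",
    r"\N": r"\mathbb{N}",
    r"\Z": r"\mathbb{Z}",
    r"\empty": r"\emptyset",
}

# Illustrative whitelist of commands that pass through unchanged.
ALLOWED = {r"\frac", r"\sqrt", r"\alpha", r"\in", r"\sum"}

# A command is a backslash followed by letters; control symbols are ignored here.
COMMAND = re.compile(r"\\[A-Za-z]+")


def texvc_to_tex(source: str) -> str:
    """Rewrite texvc macros and reject any command that is not whitelisted."""

    def convert(match: re.Match) -> str:
        name = match.group(0)
        if name in TEXVC_REWRITES:
            return TEXVC_REWRITES[name]
        if name in ALLOWED:
            return name
        raise ValueError(f"command not allowed: {name}")

    return COMMAND.sub(convert, source)
```

With these tables, texvc_to_tex(r"x \in \R") gives x \in \mathbb{R}, while an
input containing \input raises ValueError instead of reaching the renderer.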
Second, there are some security aspects, i.e. someone could inject code that
is a potential security risk for visitors. Texvc eliminates this risk by
returning pictures only. On the other hand, texvc is a potential security risk
for the server, since the rendering must take place on the same machine that
runs the core server.
LaTeXML can use a separate server, but if there is an attack on the network, the
traffic could be redirected to another server. Therefore the output of LaTeXML
must be checked again before returning it to the user's browser.
If the user's browser supports MathML, only bugs in the browser's MathML
implementation can be a security risk. If not, MathJax is needed to convert
MathML to whatever the browser can display, which comes along with all the
JavaScript issues.
As a result, I think it would be good to separate the tasks somehow.
The Wikimedia Math extension php code should convert texvc to tex and ensure
that only valid tex is passed to the rendering engine.
Then a standard rendering engine can be used and the final result can be
validated with standard methods, e.g. by validating the MathML output against
the W3C MathML schema.
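
A rough sketch of such an output check, using only the Python standard library.
It is an allow-list filter, not real validation against the W3C MathML schema
(which would need a schema-aware validator), and the element and attribute
lists are illustrative assumptions. A production check would also want a parser
hardened against hostile XML.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Illustrative allow-lists; the real schema defines many more names.
ALLOWED_ELEMENTS = {"math", "mrow", "mi", "mn", "mo", "msup", "msub", "mfrac", "msqrt"}
ALLOWED_ATTRIBUTES = {"display", "mathvariant"}


def is_safe_mathml(markup: str) -> bool:
    """Return True only for well-formed markup built from allow-listed MathML."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        return False
    for element in root.iter():
        # Namespaced tags look like "{namespace}local"; others have no "}".
        namespace, _, local = element.tag.rpartition("}")
        if namespace != "{" + MATHML_NS or local not in ALLOWED_ELEMENTS:
            return False
        if any(name not in ALLOWED_ATTRIBUTES for name in element.attrib):
            return False
    return True
```

Renderer output that contains, say, a script element or an onclick attribute
fails this check and would not be passed on to the browser.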
In summer 2012 I proposed the LaTeXML renderer at CICM 2012; now it's available
as opt-in[1].

[1] http://arxiv.org/abs/1304.5475

PS: There is a demo at 
http://demo.formulasearchengine.com

you can click on 
http://demo.formulasearchengine.com/index.php/Special:Random

to get a random page of the English Wikipedia... the pages are not cached, so it
might take about 20 seconds, especially if images have to be loaded from Commons.
Comment 1 Richard Morris 2013-07-15 18:37:03 UTC
This is a nice idea. There is some mileage in the ability to search mathematical formulas, but I suspect there are many problems. Have you looked at http://www.dessci.com/en/reference/searching/ ?
Comment 2 physikerwelt 2013-07-15 21:30:39 UTC
Thanks for the link.
I started to collect formula search engines at

http://www.formulasearchengine.com/list_of_current_formula_search_engines

There was a task about math search at NTCIR (research.nii.ac.jp/ntcir/workshop/OnlineProceedings10/NTCIR/toc_ntcir.html) this year. 

I was lobbying for adding the Wikipedia dataset to the test collection for the next round. We'll see if that works out. At least there is now a MathDump script that augments the Wikimedia dumps with MathML ;)

By the way, I should write about that, since for most people that script is more useful as a demo of the plugin features of the dumpBackup maintenance script than for its original purpose.
Comment 3 Matthew Flaschen 2013-07-15 22:41:16 UTC
What exactly is the scope of this bug (how do we know when it's fixed)?
Comment 4 physikerwelt 2013-07-16 09:56:16 UTC
I'm not sure that this is a bug.
I think the actual details are covered by other bugs, e.g. the security aspect is mentioned in bug 49169.
Comment 5 Matthew Flaschen 2013-07-16 22:05:27 UTC
I agree.  I'm going to close this.  Feel free to file additional Math bugs, but please try to keep them focused.  We can use wikitech-l or RFCs (https://www.mediawiki.org/wiki/Requests_for_comment) for broader discussion.
