Last modified: 2013-07-16 22:05:27 UTC
On bug 36496 physikerwelt writes: I think it is really important to have a stable, secure, and long-term supported way of rendering math. While working on integrating LaTeXML, a rendering engine that converts TeX to MathML, I ran into a couple of issues. First, Wikipedia uses texvc and not TeX, so I had to create a list of special user-defined commands. MathJax has this list as well. To my mind that is a suboptimal solution, especially with regard to long-term support. I propose to come up with a grammar that can be used by a wide audience (e.g. ANTLR) and convert that into native PHP code. This grammar would convert texvc to TeX and eliminate all commands that are not allowed. Second, there are some security aspects, i.e. someone could inject code that is a potential security risk for visitors. texvc eliminates this risk by returning only pictures. On the other hand, texvc is a potential security risk for the server, since rendering must take place on the same machine that runs the core server. LaTeXML can use a separate server, but if the network is attacked, the traffic could be redirected to another server. Therefore the output of LaTeXML must be checked again before it is returned to the user's browser. If the user's browser supports MathML, only bugs in the browser's MathML implementation can be a security risk. If not, MathJax is needed to convert MathML to whatever the browser can render, which comes with all the JavaScript issues. As a result, I think it would be good to separate the tasks. The Wikimedia Math extension PHP code should convert texvc to TeX and ensure that only valid TeX is passed to the rendering engine. Then a standard rendering engine can be used, and the final result can be validated with standard methods, e.g. by validating the MathML output against the W3C MathML schema. In summer 2012 I proposed the LaTeXML renderer at CICM 2012; now it's available as opt-in [1].
[1] http://arxiv.org/abs/1304.5475 PS: There is a demo at http://demo.formulasearchengine.com. You can click on http://demo.formulasearchengine.com/index.php/Special:Random to get a random page of the English Wikipedia. The pages are not cached, so it might take about 20 seconds, especially if images have to be loaded from Commons.
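A minimal sketch of the split proposed in the comment above: translate texvc to plain TeX while rejecting anything not explicitly allowed, then check the renderer's MathML before it reaches the browser. Everything here is an assumption made for illustration. The macro table and both allow-lists are tiny samples, not the real texvc command set; a regular expression stands in for the proposed grammar; and the element allow-list stands in for validation against the W3C MathML schema. Python is used only for brevity, where the proposal targets PHP.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample of texvc-only macros and plain TeX equivalents.
TEXVC_TO_TEX = {
    r"\R": r"\mathbb{R}",
    r"\N": r"\mathbb{N}",
    r"\empty": r"\emptyset",
}

# Hypothetical allow-list of TeX control words passed through unchanged.
ALLOWED_TEX = {r"\frac", r"\sqrt", r"\alpha", r"\in", r"\mathbb", r"\emptyset"}

# Matches alphabetic control words only; control symbols such as \{ are
# not inspected by this sketch.
CONTROL_WORD = re.compile(r"\\[A-Za-z]+")


class DisallowedCommand(ValueError):
    """Raised when the input uses a command outside both tables."""


def texvc_to_tex(source):
    """Rewrite texvc macros to TeX and reject unknown control words."""
    def translate(match):
        name = match.group(0)
        if name in TEXVC_TO_TEX:
            return TEXVC_TO_TEX[name]
        if name in ALLOWED_TEX:
            return name
        raise DisallowedCommand(name)
    return CONTROL_WORD.sub(translate, source)


MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical allow-list of MathML elements; attributes are not checked.
ALLOWED_ELEMENTS = {"math", "mrow", "mi", "mn", "mo",
                    "mfrac", "msup", "msub", "msqrt"}


def mathml_is_acceptable(xml_text):
    """Return True only for well-formed XML made of allowed MathML elements."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    for element in root.iter():
        namespace, _, local_name = element.tag.rpartition("}")
        if namespace != "{" + MATHML_NS or local_name not in ALLOWED_ELEMENTS:
            return False
    return True
```

With these sample tables, `texvc_to_tex(r"x \in \R")` returns `x \in \mathbb{R}`, while `texvc_to_tex(r"\input{/etc/passwd}")` raises `DisallowedCommand`. The point of the design is that both stages fail closed: a command or element that nobody listed is rejected rather than passed on to the renderer or the browser.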
This is a nice idea. There is some mileage in the ability to search mathematical formulas, but I expect many problems. Have you looked at http://www.dessci.com/en/reference/searching/?
Thanks for the link. I started to collect formula search engines at http://www.formulasearchengine.com/list_of_current_formula_search_engines There was a task about math search at NTCIR (research.nii.ac.jp/ntcir/workshop/OnlineProceedings10/NTCIR/toc_ntcir.html) this year. I was lobbying for adding the Wikipedia dataset to the test collection for the next round. We'll see if that works out. At least there is now a MathDump script that augments the Wikimedia dumps with MathML ;) By the way, I should write about that, since for most people that script is more useful as a demo of the plugin features of the dumpBackup maintenance script than for its original purpose.
What exactly is the scope of this bug (how do we know when it's fixed)?
I'm not sure this is a bug. I think the actual details are covered by other bugs, e.g. the security aspect is mentioned in bug 49169.
I agree. I'm going to close this. Feel free to file additional Math bugs, but please try to keep them focused. We can use wikitech-l or RFCs (https://www.mediawiki.org/wiki/Requests_for_comment) for broader discussion.