Last modified: 2013-11-02 13:35:43 UTC
Many aspects of TeX rendering depend on local traditions. The most essential are:
* National characters in formulae and indices cannot be entered.
* The [[:en:decimal separator]] is a comma in many countries (ru, fr, de and others). TeX by default treats a comma as a list separator, adding a very large skip after it: compare <math>3.14\,\!</math> and <math>3,14\,\!</math>. Actually, we can write <math>3{,}14</math> but it's a dirty hack.
* Some functions have national variants of designation, for example, '''tg''' instead of '''tan'''. To avoid the clumsy <code>\mathop</code> construction, we need a method for entering such names.
A reasonable way to deal with all these inconveniences is to allow national versions to have their own TeX macros in the preamble. LaTeX has "babel"; why not texvc?
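For illustration, here is what such a per-language preamble could look like in a standalone LaTeX document. This is only a sketch: texvc does not accept a preamble, and the choice of the <code>icomma</code> and <code>amsmath</code> packages is an assumption made for this example, not something proposed in the report.

```latex
\documentclass{article}
\usepackage{amsmath} % provides \DeclareMathOperator
\usepackage{icomma}  % a comma not followed by a space is set as a decimal separator
\DeclareMathOperator{\tg}{tg} % national name for the tangent
\begin{document}
Decimal comma: $3,14$. List separator: $f(x, y)$. Operator: $\tg x$.
\end{document}
```

With this preamble, <code>3,14</code> needs no <code>{,}</code> workaround and <code>\tg</code> needs no explicit <code>\mathop</code>.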
(In reply to comment #0)
> * National characters in formulae and indices cannot be entered.
Please confirm that this is indeed the case. The math package supports that if you configure it correctly, but it hasn't been set up like that on Wikimedia sites. And please submit bugs that describe one and only one issue: full UTF-8 support in the math package and some kind of template support are totally different issues. Marking this as INVALID.
It is a bug of the wikipedia.org installation and possibly of other installations that claim to support more than one language. Converting to UTF-8 does not resolve all localization problems, even if the TeX engine is moved to a fully Unicode version (Omega?). The point is not to try fitting all possible localization problems in one basket but to put language-dependent things in language-dependent places.
It depends on bug 798, but is not the same
Example: the Spanish word for "limit" is "límite", which is abbreviated "lím". There is no way to write this with the current Texvc implementation - \acute{i} is not rendered as í, but as an acute accent over an 'i' with a dot. Besides, <math>\lim_{x\to0}\frac{\sin(x)}{x}</math> is correctly rendered with the <math>x\to0</math> part under the "lim", while <math>\mbox{l}\mbox{\acute{i}}\mbox{m}_{x\to0}\frac{\sin(x)}{x}</math> is not, and it looks awful. I know from experience that loading the "babel" package with the appropriate option solves almost all of the problems, at least in Spanish, and probably for the other languages too.
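A hand-written version of the operator, as a sketch in standalone LaTeX (not texvc). The macro name <code>\limes</code> is hypothetical and chosen only for this example. The accent is typeset in text mode over a dotless i, and <code>\mathop</code> makes the subscript go underneath in display style.

```latex
\documentclass{article}
\usepackage{amsmath}
% Hypothetical macro name. \textnormal switches to text mode so that
% \'{\i} puts the acute accent on a dotless i.
\newcommand{\limes}{\mathop{\textnormal{l\'{\i}m}}}
\begin{document}
\[ \limes_{x\to0}\frac{\sin(x)}{x} \]
\end{document}
```

Loading babel with the Spanish option is the packaged way to get the same result for <code>\lim</code> and related operators without defining each one by hand.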
(In reply to comment #1)
> And please submit bugs that describe one and only one issue, full UTF-8 support
> in the math package and some kind of template support are totally different
> issues, marking this as INVALID,
It's likely that this describes one and only one issue. The bug may not be in Unicode parsing, but in misinterpretation by texvc of non-ASCII characters, which is in all likelihood triggered by lack of locale support. The latter is usually ensured by the babel LaTeX package, which also localizes abbreviations for mathematical functions.
I would strongly advise against using different setups on different Wikipedias. This is simply asking for all kinds of weird problems. If you need \tg etc., just add it to texvc as an alias for \mathop{tg}, with something like this in the big case statement in texutil.ml:
    | "\\tg" -> LITERAL (HTMLABLEC(FONT_UFH,"\\mathop{tg} ","tg"))
The tests seemed to indicate that one cannot write accented characters directly in math mode, so I'm not sure how you want to get \mathop{lím} there.
Why should localization of texvc through the use of a well-tested and widely-used LaTeX package cause "weird problems"?
(In reply to comment #6)
> I would strongly advise against using different setups on different Wikipedias.
> This is simply asking for all kinds of weird problems.
Echo Alon Lischinsky.
> If you need \tg etc., just add it to texvc as alias for \mathop{tg},
> something like that in the big case statement in texutil.ml:
> | "\\tg" -> LITERAL (HTMLABLEC(FONT_UFH,"\\mathop{tg} ","tg"))
And how, exactly, do I "just add it"?
> The tests seemed to indicate that one cannot write accented characters directly
> in math mode, so I'm not sure how you want to get \mathop{lím} there.
However it can be done. I don't really care how, as long as it works.
Alright, let's keep the priorities sensible. And drop the attitudes, please. Thanks.
(In reply to comment #6)
> I would strongly advise against using different setups on different Wikipedias.
> This is simply asking for all kinds of weird problems.
Well, the Wikimedia Foundation charter states that Wikimedia is a multilanguage project, so localization of resources is something MediaWiki should attempt to do, IMO. The babel package is supposed to work with both LaTeX and Plain TeX, so we have a tested package as a tool. I would like to hear a technical, rather than a political, opinion on why babel (or any other type of regionalization) is not practical or feasible for Wikimedia's texvc implementation.
If the same source renders differently on each wiki, it'll complicate the shared storage of rendered images. More generally, a multilingual solution rather than monolingual is preferred, so all languages can be used on all sites.
(In reply to comment #11)
> More generally, a multilingual solution rather than
> monolingual is preferred, so all languages can be used on all sites.
But until such a solution is found, it would be better to use babel instead of waiting. Back in the days when article titles in the English WP could only use letters from Latin-1, WPs in languages whose alphabet wasn't included in Latin-1 (such as Polish or Esperanto) were allowed to name their articles in their own character encoding (or was it Unicode? I don't remember), instead of waiting for a multilingual solution to be implemented on all the Wikimedia projects. Well, why not do the same here?
Please stop manipulating the priority tags; it doesn't do you any good and just annoys people, making it less likely anyone will want to do whatever you are asking for.
(In reply to comment #11)
> If the same source renders differently on each wiki, it'll complicate the shared
> storage of rendered images. More generally, a multilingual solution rather than
> monolingual is preferred, so all languages can be used on all sites.
But, anyhow, mathematical conventions are different in different languages, so it is likely that either the formulas use different sources (some with workarounds), which makes the argument of shared storage void, or they just share the source, making the article/book look like a shiftcoded work. This is, IMO, still a political rather than a technical argument: you are not saying it is impossible/too complicated/prone to unseen failures, but rather that this is inconvenient from the point of view of shared resources, when shared resources are, in this context, either something that is not happening or an imposition of language conventions.
No, it's a technical argument. If simply turned on, this would cause different renderings to stomp on each other and make rendered display inconsistent.
(In reply to comment #15)
> No, it's a technical argument. If simply turned on, this would cause different
> renderings to stomp on each other and make rendered display inconsistent.
Not inconsistent, but rather consistent with the language setting for the local wiki (or, if it could be implemented, for the user viewing the page). You're placing technical ease-of-maintenance (or worse, technical resource optimization) above usability, a flawed line of reasoning which, if pursued, could lead to producing an extremely efficient product that doesn't fulfill the users' needs. I don't see why non-English speakers should be forced to employ English, a language they may not even understand, for their formulae.
No, *inconsistent* because it would change back and forth depending on who rendered it last.
(In reply to comment #17)
> No, *inconsistent* because it would change back and forth depending on who
> rendered it last.
Only if PNG caching is done independently of user language setting. Which need not be the case: just like thumbnailed images are rendered as per the user's thumbnail size setting selection, LaTeX rendering can be linked to interface language. And it should, unless your particular criterion is, as I said before, that technical optimization should not take end-user satisfaction into account.
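The caching point can be sketched as follows, assuming rendered images are looked up by a hash of the formula source. The function name, the use of SHA-256 and the key layout are illustrative assumptions, not a description of how the Math extension actually stores images. Mixing the language code into the key gives each language its own cache entry, so renderings no longer overwrite each other.

```python
import hashlib


def render_cache_key(tex_source: str, lang: str) -> str:
    """Return a cache key that depends on both the formula and the language.

    Hypothetical helper: `lang` may be the wiki's content language or the
    user's interface language, depending on which policy is chosen.
    """
    # The NUL byte separates the two fields so that ("a", "bc") and
    # ("ab", "c") cannot produce the same payload.
    payload = lang.encode("utf-8") + b"\x00" + tex_source.encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    formula = r"\lim_{x\to0}\frac{\sin(x)}{x}"
    print(render_cache_key(formula, "en"))
    print(render_cache_key(formula, "es"))
```

The cost is the one raised earlier in the thread: the same source is rendered and stored once per language instead of once overall.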
Why was this tagged as fixed? I've just checked, and \lim renders without an accent on es:.
This is quite an important bug since it affects all languages that do not use English characters. These problems have been long-standing and it seems no one bothers to fix them. It would be great if globalization were paid attention to. This issue particularly affects Thai Wikipedia, which has problems writing equations with words in them.
Did you try to use \text if you want to insert text or other symbols that have no specific math rendering? I think it makes little sense to support characters that are not supported by LaTeX, since it's not defined what they look like. So I'd recommend using \text{special words characters or symbols} to get the browser's rendering for those parts of the equation. There are some cases where the text method does not work. If you discover such situations that are not covered by https://bugzilla.wikimedia.org/show_bug.cgi?id=48032, please open a new bug with a specific example.
*** This bug has been marked as a duplicate of bug 48032 ***
Moritz, I don't think this is a duplicate. In TeX/LaTeX terms, this is about the babel package. E.g., babel set to Spanish will render \lim, \max, \min etc. as lím, máx, mín etc. At least for MathJax, this can be solved by simply writing sets of macros. (Not sure how the math extension / database side would work.)
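As a rough sketch of what "sets of macros" per language could amount to, the following rewrites operator control words in the formula source before it is handed to a renderer. Everything here is an assumption for illustration: the table, the function name and the choice of `\operatorname` as the target are not part of MathJax or the Math extension, and whether a given renderer accepts accented letters inside `\operatorname` would need to be checked.

```python
import re

# Hypothetical per-language tables. The Spanish names are the ones quoted
# above (lím, máx, mín); the "ru" entry is illustrative, using the "tg"
# variant mentioned in the original report.
LOCAL_OPERATORS = {
    "es": {"lim": "lím", "max": "máx", "min": "mín"},
    "ru": {"tan": "tg"},
}

# Operators whose subscripts go underneath in display style.
LIMIT_OPERATORS = {"lim", "max", "min"}

# A control sequence is a backslash followed by a run of letters or by a
# single other character. Matching whole control words leaves \limsup
# alone, and consuming "\\" keeps a line break from being misread.
CONTROL_SEQUENCE = re.compile(r"\\([A-Za-z]+|.)")


def localize(tex: str, lang: str) -> str:
    """Replace known operator macros with their localized names."""
    names = LOCAL_OPERATORS.get(lang, {})

    def swap(match):
        name = match.group(1)
        if name not in names:
            return match.group(0)  # unknown macro: keep as written
        star = "*" if name in LIMIT_OPERATORS else ""
        return "\\operatorname" + star + "{" + names[name] + "}"

    return CONTROL_SEQUENCE.sub(swap, tex)


if __name__ == "__main__":
    print(localize(r"\lim_{x\to0}\frac{\sin(x)}{x}", "es"))
```

A real macro set would live in the renderer's own configuration rather than in a preprocessing step. The sketch only shows that the mapping itself is a small per-language table.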
Peter, thanks for the hint. Obviously, I misunderstood that. I would see that as a new feature for a future version. Hopefully (Reopened) is the right category for that. I also added it to the Roadmap. https://www.mediawiki.org/wiki/Extension:Math/Roadmap It's also a nice example to demonstrate the usefulness of Content MathML.
Physikerwelt, thank you for looking into this. It's been a long 7 years coming, so long that I had actually forgotten about it.