Last modified: 2014-10-17 11:43:52 UTC
The XeLaTeX backend for the new OCG PDF renderer does not fall back to another font if the font selected for a given language does not contain a given codepoint. Many candidate fonts for a given language do not include the Latin code pages. This makes page numbers, dates, citation numbers, and even bullets in lists render as tofu (blank square boxes). The LaTeX renderer should keep track of which code pages are present in a font, and add explicit font-switch commands to the output when needed. (This includes redefining the command used for bullets in lists to ensure it is rendered in a Latin font.) Ideally we could automatically generate coverage tables from a font. But at *least* we should treat the Latin codepoints as a special case, and fall back to the default Latin font when Latin codepoints are used in a font without Latin code page coverage. (This has been an issue for Russian, Indic languages, the Google Noto fonts, etc., and accounts for most of the present difficulty in choosing an appropriate font for a particular wiki language.)
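A minimal sketch of the Latin special case described above, not the renderer's actual code. It assumes a macro named `\LatinFont` exists in the LaTeX preamble and that the caller already knows whether the selected font covers Latin; both the macro name and the `wrapLatinRuns` helper are hypothetical. "Latin" is approximated here as U+0000..U+024F, and LaTeX escaping of the text is left out.

```javascript
// Hypothetical helper: treat U+0000..U+024F (Basic Latin through
// Latin Extended-B) as "Latin". This range is an assumption for the sketch.
function isLatin(cp) {
  return cp <= 0x024f;
}

// Wrap each run of Latin codepoints in \LatinFont{...} (assumed macro name)
// when the language font has no Latin coverage.
function wrapLatinRuns(text, fontHasLatin) {
  if (fontHasLatin) { return text; }
  let out = '';
  let buf = '';
  let latin = false;
  const flush = () => {
    out += (latin && buf) ? '\\LatinFont{' + buf + '}' : buf;
    buf = '';
  };
  for (const ch of text) {
    // Whitespace stays with whatever run is current, so spaces between
    // words of the main script do not trigger a font switch.
    if (!/\s/.test(ch)) {
      const next = isLatin(ch.codePointAt(0));
      if (next !== latin) { flush(); latin = next; }
    }
    buf += ch;
  }
  flush();
  return out;
}
```

For example, `wrapLatinRuns('Глава 12', false)` leaves the Cyrillic word alone and wraps only the digits. Characters such as U+2022 BULLET fall outside the assumed Latin range, so list bullets would still need the separate redefinition mentioned above.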
Change 151360 had a related patch set uploaded by Cscott: Use Lohit fonts when possible. https://gerrit.wikimedia.org/r/151360
Change 151360 merged by jenkins-bot: Use Lohit fonts when possible. https://gerrit.wikimedia.org/r/151360
The above patches partially fix the problem -- they switch to the default Latin font for Latin code pages. But we should really enumerate the full set of code points mapped by a font. That's part two of fixing this bug.
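As a rough illustration of "part two", here is one way a per-font coverage table could drive font selection. This is a sketch under assumptions: each font is described by a hypothetical `{ name, ranges }` object whose `ranges` are sorted, non-overlapping `[start, end]` codepoint pairs, and how those ranges would be extracted from the font file is not shown.

```javascript
// True if the sorted, non-overlapping [start, end] ranges contain cp.
function covers(ranges, cp) {
  let lo = 0;
  let hi = ranges.length - 1;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    const [start, end] = ranges[mid];
    if (cp < start) { hi = mid - 1; }
    else if (cp > end) { lo = mid + 1; }
    else { return true; }
  }
  return false;
}

// Split text into segments, each tagged with the first font in the
// fallback chain that covers it (or null if no font does).
function segmentByFont(text, fonts) {
  const segments = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    const font = fonts.find((f) => covers(f.ranges, cp));
    const name = font ? font.name : null;
    const last = segments[segments.length - 1];
    if (last && last.font === name) { last.text += ch; }
    else { segments.push({ font: name, text: ch }); }
  }
  return segments;
}
```

Each segment could then be emitted with an explicit font-switch command, and segments with a `null` font would flag codepoints that no configured font can render.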
Note that the default Latin font doesn't cover the ~ character (!), which is used in https://en.wikipedia.org/wiki/Moon#Internal_structure in the sentence, "this is only ~20% the size of the Moon, in contrast to the ~50% of most other terrestrial bodies".
This sort of font hardcoding really doesn't scale... https://gerrit.wikimedia.org/r/#/c/151360/1/lib/index.js,cm Is it really impossible to load the appropriate fonts as installed on the server? We already install them for EasyTimeline etc. (e.g. bug 20825). See also the ULS fontrepo: https://git.wikimedia.org/tree/mediawiki%2Fextensions%2FUniversalLanguageSelector/HEAD/data%2Ffontrepo