Last modified: 2009-09-03 08:28:42 UTC
Created attachment 6480: Rendered PDF book from a short page in vi.wikisource

I suggest that this extension use Unicode fonts for the content of exported PDF books. With the current non-Unicode font, it is useless for some projects such as Vietnamese Wikisource. Note the font in the main content: Vietnamese-specific Unicode characters (ố, ẳ, etc.) are rendered in a different font. In the header of the License page, the renderer does not even seem to recognize them.
This issue is a little tricky. We are using Unicode fonts; unfortunately, there is no single font that covers the whole Unicode range. PDFs are therefore currently rendered using up to 19 different fonts. In the case of Vietnamese the result is not perfect: many glyphs can be rendered with the regular font for the Latin alphabet (DejaVuSerif), but quite a few glyphs are missing from that font. For those glyphs the writer switches to DejaVuSans, which has them.

Possible solutions:
* Switch to a different main font with better glyph coverage. A prerequisite is a "font family" with a serif, a sans-serif and a monospace font, each of which needs better glyph coverage than the currently used DejaVu fonts.
* Select different main fonts depending on the language of the wiki article. As a consequence, a PDF containing articles from multiple wikis would have different main fonts. If, for example, Vietnamese and English articles were mixed, this might look rather odd.
* ...

Any thoughts on that, anyone? Until there is a solution that really improves the current situation, I would like to keep things as they are. I think PDFs from the Vietnamese wiki are still usable, even though the font switches are not ideal and look a little odd.
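To make the font switching described above concrete, here is a minimal sketch of per-character fallback. It is not the extension's actual code: the font names `MainSerif` and `FallbackSans`, the coverage sets, and the helpers `pick_font` and `split_into_runs` are all hypothetical, and the coverage data is invented for illustration rather than taken from DejaVu.

```python
from itertools import groupby

# Hypothetical coverage tables (NOT real DejaVu data): the main font lacks
# some Vietnamese letters that the fallback font provides.
BASIC = set("abcdefghijklmnopqrstuvwxyz ")
COVERAGE = {
    "MainSerif": BASIC | {"\u00f4", "\u0111"},                          # ô, đ
    "FallbackSans": BASIC | {"\u00f4", "\u0111", "\u1ed1", "\u1eb3"},  # + ố, ẳ
}
# Fonts are tried in this order; the first one that has the glyph wins.
FONT_ORDER = ["MainSerif", "FallbackSans"]


def pick_font(ch):
    """Return the first font in FONT_ORDER that covers ch, or None."""
    for name in FONT_ORDER:
        if ch in COVERAGE[name]:
            return name
    return None


def split_into_runs(text):
    """Group consecutive characters that end up in the same font."""
    return [(font, "".join(chars)) for font, chars in groupby(text, key=pick_font)]


if __name__ == "__main__":
    # A single word is split into two runs, which is the visible "font switch".
    for font, run in split_into_runs("ba s\u1ed1"):
        print(font, repr(run))
```

With tables like these, one missing glyph is enough to break a word into runs set in different typefaces, which is why better coverage in the main font removes the switches.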
I did some searching and found that:
* Vietnamese characters are scattered across the Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B and Latin Extended Additional blocks of Unicode.
* The problem is that the DejaVu fonts do not fully cover *Latin Extended Additional*. This is from the unicover.txt of the latest DejaVu (2.30, 200-08-27):

                                  Sans           Serif          Sans Mono
U+1e00 Latin Extended Additional  96% (248/256)  76% (196/256)  71% (182/256)

So I suggest two possible solutions:
* Switch to another open typeface, such as [http://en.wikipedia.org/wiki/Free_UCS_Outline_Fonts Free UCS Outline Fonts], which has better coverage and is released under GPLv3?
* Keep things as they are and wait for DejaVu to update their fonts. But it took them 6 months to get from 2.29 to 2.30, so unless some volunteer developer creates those characters, we would wait a long time.

I prefer the first suggestion, and I hope it does not conflict with the extension's license or your preferred fonts for a book. But Vietnamese readers, who are used to seeing complete (commercial) fonts all over the Internet, would be put off by the odd-looking characters in the documents and would stop rendering PDF books from the wiki (I am a good example, I think).
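The spread of Vietnamese letters over several Unicode blocks can be checked with a short script. This is only a sketch: the helper names `block_of` and `blocks_used` are made up for this example, and the range table lists just the five blocks mentioned above ("Latin" being Basic Latin), with everything else reported as "Other". Input is normalized to NFC first so that decomposed letters are counted as their precomposed code points.

```python
import unicodedata

# Code point ranges of the five Unicode blocks relevant to Vietnamese.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
]


def block_of(ch):
    """Return the block name for a single character, or "Other"."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


def blocks_used(text):
    """Return the set of block names used by text after NFC normalization."""
    return {block_of(ch) for ch in unicodedata.normalize("NFC", text)}


if __name__ == "__main__":
    # Title of the test article, "Ai làm được", written with escapes.
    title = "Ai l\u00e0m \u0111\u01b0\u1ee3c"
    for name in sorted(blocks_used(title)):
        print(name)
```

The three-word title of the test article already draws on all five blocks, so a font that is weak in any one of them will force a fallback somewhere in ordinary Vietnamese text.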
Thanks for the hints, I will take a look at the UCS Outline Fonts next week.
Thanks again for the hint about the above font. After some testing, I have just switched the default font from DejaVu to FreeFont. The problems with Vietnamese articles should now be solved. The article referenced above, for example, now renders correctly: http://vi.wikisource.org/wiki/Ai_l%C3%A0m_%C4%91%C6%B0%E1%BB%A3c I will probably use FreeFont for more scripts in the future.