Last modified: 2014-09-23 19:49:26 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T37668, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 35668 - PDF on gu.wikisource only shows squares instead of characters
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: Collection
Version: REL1_19-branch
Hardware: All All
Importance: Normal normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: i18n
Depends on:
Blocks: 40760
Reported: 2012-04-03 14:21 UTC by Dhaval
Modified: 2014-09-23 19:49 UTC (History)
CC List: 11 users



Attachments
PDF generated from gu.wikisource (50.66 KB, application/pdf)
2012-04-03 14:21 UTC, Dhaval
Comparison of Gurajati rendered in the browser and as PDF with mwlib (693.57 KB, image/png)
2012-05-15 14:24 UTC, Volker Haas

Description Dhaval 2012-04-03 14:21:00 UTC
Created attachment 10371 [details]
PDF generated from gu.wikisource

When exporting to PDF on gu.wikisource, the fonts are not rendered; instead of characters, only boxes are displayed in the PDF (see attached). I checked another Indic Wikisource to find out whether there is a general issue with Indic fonts in PDF output, but Devanagari text is displayed correctly in PDFs from Marathi Wikisource.
Comment 1 Srikanth Logic 2012-05-09 18:09:42 UTC
I guess this has something to do with fonts not being present on the PDF server. Bug 28206 might still affect the creation of proper Indic books, but I think this bug is even more fundamental than that. I tried printing a Gujarati page on my wiki through the PediaPress mw-serve, with $wgCollectionMWServeURL = "http://tools.pediapress.com/mw-serve/";, and got a similar PDF with squares. I am still figuring out how to set up Collection locally.
Comment 2 Srikanth Logic 2012-05-09 19:21:29 UTC
So I created a symlink as described in the message below and was able to get Gujarati rendered by mw-render.

http://www.mail-archive.com/mwlib@googlegroups.com/msg01073.html

The fontconfig.py in mwlib.rl contains a reference to the font below, so please check whether this font exists.
    {'name': 'Sarai',
     'code_points': ['Gujarati', 'Devanagari'],
     'file_names': ['ttf-devanagari-fonts/Sarai_07.ttf'],
     },

If the above font is unavailable, Lohit Gujarati can be used instead; it will be present anyway as part of ttf-indic-fonts. In that case the config below needs to be added to fontconfig.py in mwlib.rl, which then has to be rebuilt.

    {'name': 'Lohit Gujarati',
     'code_points': ['Gujarati'],
     'file_names': ['ttf-indic-fonts-core/lohit_gu.ttf'],
     },
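
A minimal sketch of the "check if this font exists" step, in case it helps whoever looks at the servers. It assumes the two entries above are collected in a plain Python list (the name FONTS and the helper missing_font_files are hypothetical, not part of mwlib.rl) and that the font files live under /usr/share/fonts/truetype, which is only a guess at the server layout.

```python
import os

# Hypothetical stand-in for the font list in fontconfig.py; only the two
# entries quoted in this comment are included.
FONTS = [
    {'name': 'Sarai',
     'code_points': ['Gujarati', 'Devanagari'],
     'file_names': ['ttf-devanagari-fonts/Sarai_07.ttf'],
     },
    {'name': 'Lohit Gujarati',
     'code_points': ['Gujarati'],
     'file_names': ['ttf-indic-fonts-core/lohit_gu.ttf'],
     },
]


def missing_font_files(fonts, font_root):
    """Return (font name, relative path) pairs whose file is absent under font_root."""
    missing = []
    for font in fonts:
        for rel_path in font['file_names']:
            if not os.path.isfile(os.path.join(font_root, rel_path)):
                missing.append((font['name'], rel_path))
    return missing


if __name__ == '__main__':
    # Assumed font directory; adjust to wherever the TrueType files are installed.
    for name, rel_path in missing_font_files(FONTS, '/usr/share/fonts/truetype'):
        print('missing: %s (%s)' % (name, rel_path))
```

An empty result only means the files are on disk; it says nothing about whether the glyphs are shaped correctly in the PDF.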
Comment 3 Dhaval 2012-05-09 20:47:19 UTC
That's very good news, Srikanthlogic; it seems things are moving at least.

One thing to note: please try not to use the Lohit fonts, as they are far from the natural Gujarati script and look more like Devanagari; in books, that would be the last thing we would like to use. I don't know which fonts are the basic ones, but they represent the original Gujarati script much more closely.
Comment 4 Mark A. Hershberger 2012-05-10 15:36:16 UTC
https://rt.wikimedia.org/Ticket/Display.html?id=2939 for the font installation.
Comment 5 Daniel Zahn 2012-05-11 13:00:44 UTC
The following fonts are now installed on pdf1-3:


ii  ttf-bengali-fonts                     1:0.5.0-0ubuntu1              Free TrueType fonts for the Bengali language
ii  ttf-dejavu                            2.23-1                        Metapackage to pull in ttf-dejavu-core and ttf-dejavu-extra
ii  ttf-dejavu-core                       2.23-1                        Vera font family derivate with additional characters
ii  ttf-dejavu-extra                      2.23-1                        Vera font family derivate with additional characters
ii  ttf-devanagari-fonts                  1:0.5.0-0ubuntu1              Free TrueType fonts for languages using the Devanagari script
ii  ttf-gujarati-fonts                    1:0.5.0-0ubuntu1              Free TrueType fonts for the Gujarati language
ii  ttf-indic-fonts                       1:0.5.0-0ubuntu1              Metapackage for free Indian language fonts
ii  ttf-indic-fonts-core                  1:0.5.0-0ubuntu1              Core collection of free Indian language fonts
ii  ttf-kannada-fonts                     1:0.5.0-0ubuntu1              Free TrueType fonts for the Kannada language
ii  ttf-malayalam-fonts                   1:0.5.0-0ubuntu1              Free TrueType fonts for the Malayalam language
ii  ttf-oriya-fonts                       1:0.5.0-0ubuntu1              Free TrueType fonts for the Oriya language
ii  ttf-punjabi-fonts                     1:0.5.0-0ubuntu1              Free TrueType fonts for the Punjabi language
ii  ttf-tamil-fonts                       1:0.5.0-0ubuntu1              Free TrueType fonts for the Tamil language
ii  ttf-telugu-fonts                      1:0.5.0-0ubuntu1              Free TrueType fonts for the Telugu language

Implemented in https://gerrit.wikimedia.org/r/#/c/7282/
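
For anyone who wants to re-check a listing like the one above against a set of expected packages, here is a rough sketch. It assumes the listing is dpkg -l style text with the status in the first column and the package name in the second; the function name installed_packages, the two-line SAMPLE, and the required set are illustrative choices, not anything taken from the server configuration.

```python
def installed_packages(dpkg_output):
    """Return the names of packages whose status column is 'ii' (installed)."""
    names = set()
    for line in dpkg_output.splitlines():
        fields = line.split()
        # Expect at least: status, package name, version.
        if len(fields) >= 3 and fields[0] == 'ii':
            names.add(fields[1])
    return names


# Two lines copied from the listing above, used here as sample input.
SAMPLE = """\
ii  ttf-gujarati-fonts                    1:0.5.0-0ubuntu1              Free TrueType fonts for the Gujarati language
ii  ttf-indic-fonts-core                  1:0.5.0-0ubuntu1              Core collection of free Indian language fonts
"""

# Illustrative set of packages one might want to see installed.
required = {'ttf-gujarati-fonts', 'ttf-indic-fonts-core', 'ttf-devanagari-fonts'}

if __name__ == '__main__':
    # Packages from `required` that do not appear as installed in SAMPLE.
    print(sorted(required - installed_packages(SAMPLE)))
```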
Comment 6 Volker Haas 2012-05-14 08:33:09 UTC
The problem with the Gujarati script is two-fold:

a) The current configuration uses an unsuitable font for Gujarati (Sarai_07.ttf)

I have fixed this issue with https://github.com/pediapress/mwlib.rl/commit/ecbaa8b871621a08dc4136fd55d2387925039e95

Please note that I haven't updated the software on the servers because of the second issue.

b) The rendering engine mwlib uses to produce the PDFs is not capable of handling the complex character shaping and ligatures that Indic scripts require. Therefore the final PDF is still broken (see the screenshot I'll attach).

Fixing b) is unfortunately a very complex and time consuming task which involves a couple of unsolved technical problems and is therefore currently not on my agenda. One of the biggest problems is that I haven't found a PDF back-end that would meet all requirements.
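
To illustrate what "complex character shaping" means in b), the following sketch lists the Unicode code points behind a single Gujarati conjunct, the cluster "ષ્ટ્ર" from the title of the page used in this report. It uses only Python 3's standard unicodedata module; the helper name describe is made up for this example. The point is that a renderer has to combine several stored characters into one visual cluster, and a backend that draws them one by one cannot produce the expected shape.

```python
import unicodedata


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [('U+%04X' % ord(ch), unicodedata.name(ch)) for ch in text]


# The conjunct written with escapes: SSA, VIRAMA, TTA, VIRAMA, RA.
conjunct = '\u0AB7\u0ACD\u0A9F\u0ACD\u0AB0'

if __name__ == '__main__':
    for code, name in describe(conjunct):
        print(code, name)
```

The two VIRAMA characters are what signal that the consonants around them should be joined; selecting the joined glyph is the job of a shaping engine, not of the font file alone.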
Comment 7 Sumana Harihareswara 2012-05-14 13:33:49 UTC
Volker, I think you haven't attached the sample screenshot of a broken PDF yet?
Comment 8 Srikanth Logic 2012-05-14 18:49:52 UTC
Volker, thanks for the update. I agree that complex rendering would still be a dependency and might take some time to fix. But as far as the font is concerned, Dhaval points out that the Lohit font is not good for reading in a PDF. Maybe he could suggest alternatives from ttf-gujarati-fonts (http://packages.debian.org/lenny/all/ttf-gujarati-fonts/filelist) or any freely licensed font that can be used in mwlib.rl.
Comment 9 Dhaval 2012-05-15 14:19:38 UTC
(In reply to comment #8)
> .....Dhaval points out Lohit font is not good for reading on pdf. May be
> he could suggest alternatives from ttf-gujarati-fonts
> (http://packages.debian.org/lenny/all/ttf-gujarati-fonts/filelist) or any free
> licensed font which can be used in mwlib.rl

I would suggest Raghu as the best font to use; aesthetically it is the most natural-looking font. However, when we tested it in Firefox, there was a rendering issue (see https://bugzilla.wikimedia.org/show_bug.cgi?id=33932 and http://crossbrowsertesting.com/users/34057/screenshots/zc6a1910ebcefa7d4d1c/public). If that's not going to affect us, Raghu is the best.

Btw, what is the other font that's currently used for Gujarati wikis, apart from Lohit? The default font? It would be best to use that font, and to consider Raghu only if that is not possible. Most of the fonts in the Debian package are too artistic; they are good for headings etc., but not for a whole book or page.
Comment 10 Volker Haas 2012-05-15 14:24:15 UTC
Created attachment 10599 [details]
Comparison of Gurajati rendered in the browser and as PDF with mwlib
Comment 11 Dhaval 2012-05-15 19:55:46 UTC
(In reply to comment #10)
> Created attachment 10599 [details]
> Comparison of Gurajati rendered in the browser and as PDF with mwlib

This is the same font rendering issue we faced in Firefox. I think the font used in the PDF is Lohit; would it make a difference if we chose a different font?

See http://www.jainlibrary.org/elib_master/jlib/004501_book_gujarati_21/Narsimha_Mahetana_Pado_004610_TOC.pdf for an example of Gujarati being correctly rendered in PDF.
Comment 12 Daniel Zahn 2012-10-04 18:49:59 UTC
If you know of any additional Debian/Ubuntu font packages that should be installed on the PDF servers, feel free to tell us.

P.S. The link to github.com above gives me a 404
Comment 13 Ralf Schmitt 2012-10-04 19:16:37 UTC
Bugzilla is messing up the GitHub links.
Comment 14 Andre Klapper 2012-10-04 22:22:02 UTC
(In reply to comment #13)
> bugzilla is messing up the github links.

Please file a report against product=Wikimedia / component=Bugzilla separately.
Comment 15 Andre Klapper 2013-01-11 14:51:15 UTC
(In reply to comment #12 by Daniel Zahn)
> P.S. The link to github.com above gives me a 404

Link works for me, now that Bugzilla bug 40344 is fixed.
Comment 16 Andre Klapper 2013-07-25 18:49:52 UTC
As per comment 6 b), I currently don't see anything that could be solved by ops. Removing keyword.

Steps to reproduce the problem:
1. Go to https://gu.wikisource.org/wiki/સૌરાષ્ટ્રના_ખંડેરોમાં
2. Click પુસ્તક નિર્માતા નિષ્ક્રિય કરો in side pane
3. Click પુસ્તક બતાવો (૧ પાનું)
4. Choose ડાઉનલોડ: તમારું પુસ્તક ડાઉનલોડ કરવા શૈલી પસંદ કરો અને બટન પર ક્લિક કરો.
   શૈલી: e-book (PDF), 
5. Click ફાઇલ ડાઉનલોડ કરો

Just for the record, other Gujarati font packages included in Fedora:
- kalapi-fonts
- lohit-gujarati-fonts
- samyak-gujarati-fonts
