Last modified: 2012-10-15 13:52:21 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T36540, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 34540 - Text edit box encoding problem with PDF
Status: RESOLVED DUPLICATE of bug 35122
Product: MediaWiki extensions
Classification: Unclassified
Component: PdfHandler
Hardware: All All
Importance: High normal
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: ops
Depends on:
Blocks: Wikisource 41037
Reported: 2012-02-20 09:38 UTC by Raul Kern
Modified: 2012-10-15 13:52 UTC
CC List: 5 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Description Raul Kern 2012-02-20 09:38:43 UTC
The recognized 'üõöä' characters aren't displayed in the text edit box for PDF files.
see: and
Comment 1 Brion Vibber 2012-02-21 19:41:24 UTC
Possibly the text needs to be transcoded during extraction; not sure how this is presently handled.
Comment 2 Brion Vibber 2012-02-21 19:41:38 UTC
(in which case this may belong to PdfHandler ext)
Comment 3 Beau 2012-04-27 17:47:59 UTC
That is definitely not a Proofread extension bug. I am changing the component field to PdfHandler.

On my local installation it works fine - the text is correctly encoded. I suspect this is a configuration issue on WMF servers.

The pdftotext command on my system is provided by poppler 0.18.4.
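
For anyone trying to reproduce this outside the wiki, the following is a minimal sketch of how the extraction step could be checked by hand. It is not how PdfHandler invokes pdftotext; the function names (build_pdftotext_cmd, decode_extracted, extract_text) and the latin-1 fallback are assumptions made for illustration. It relies only on pdftotext's -enc option and on "-" as the output file meaning stdout.

```python
import subprocess


def build_pdftotext_cmd(pdf_path, encoding="UTF-8"):
    """Return a pdftotext command line that writes the text to stdout."""
    # "-enc" selects the output text encoding; "-" as the output file means stdout.
    return ["pdftotext", "-enc", encoding, pdf_path, "-"]


def decode_extracted(raw, fallback="latin-1"):
    """Decode extracted bytes as UTF-8, falling back to a legacy encoding.

    The fallback encoding is an assumption for this sketch; a real setup
    would need to know what the extraction tool actually emitted.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode(fallback)


def extract_text(pdf_path):
    """Run pdftotext (which must be on PATH) and return the decoded text."""
    result = subprocess.run(
        build_pdftotext_cmd(pdf_path), capture_output=True, check=True
    )
    return decode_extracted(result.stdout)
```

If text returned by extract_text for an affected PDF contains 'üõöä' intact, extraction on that machine is probably fine and the problem more likely lies in server configuration, as suspected above.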
Comment 4 Beau 2012-05-05 08:38:30 UTC

*** This bug has been marked as a duplicate of bug 35122 ***
