Last modified: 2013-08-19 07:47:08 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T54881, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 52881 - Provide better PDF viewer based on pdf.js project
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: File management
Version: unspecified
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-08-15 12:58 UTC by Tomer Cohen
Modified: 2013-08-19 07:47 UTC (History)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---



Description Tomer Cohen 2013-08-15 12:58:13 UTC
pdf.js is a PDF viewer implemented in JavaScript that is included by default in Mozilla Firefox (and Chromium, as far as I know).

The viewer can be embedded in any web page, and I suggest replacing the default PDF viewer with it, as it doesn't require additional processing on the server to convert PDF pages to images, and it provides a user interface similar to desktop PDF viewers.
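
As an illustration of the embedding idea, here is a minimal sketch that builds the address of a self-hosted copy of the pdf.js viewer page for a given PDF file. It assumes the stock `viewer.html` reads the document address from a `file` query parameter; the function name `buildViewerUrl` and the paths in the example call are hypothetical and are not part of MediaWiki or pdf.js.

```typescript
// Sketch only: build a link to a self-hosted pdf.js viewer page.
// Assumption: the viewer reads the PDF address from the `file` query parameter.
function buildViewerUrl(viewerBase: string, pdfUrl: string): string {
  // Use "&" if the viewer address already carries a query string.
  const separator = viewerBase.indexOf("?") === -1 ? "?" : "&";
  // The PDF address must be percent-encoded so its own "?" and "&" survive.
  return viewerBase + separator + "file=" + encodeURIComponent(pdfUrl);
}

// Hypothetical paths, for illustration only.
console.log(buildViewerUrl("/pdfjs/web/viewer.html", "/images/Example.pdf"));
```

The resulting address could serve as the `src` of an `<iframe>` on the file description page. The PDF itself still has to be readable by the page that hosts the viewer (same origin, or served with suitable cross-origin headers).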

Demo: http://mozilla.github.io/pdf.js/web/viewer.html

Source code: https://github.com/mozilla/pdf.js/
Comment 1 Andre Klapper 2013-08-15 13:17:56 UTC
pdf.js might be a solution to some problem that I don't see yet.
Could you elaborate, please?
Comment 2 Sam Reed (reedy) 2013-08-15 19:03:37 UTC
What do you mean by the default pdf viewer?

WMF wikis have http://www.mediawiki.org/wiki/Extension:PdfHandler which vastly improves default MediaWiki handling of PDF files
Comment 3 Tomer Cohen 2013-08-15 20:07:14 UTC
(In reply to comment #2)
> What do you mean by the default pdf viewer?
> 
> WMF wikis have http://www.mediawiki.org/wiki/Extension:PdfHandler which
> vastly improves default MediaWiki handling of PDF files

The extension above converts PDF files to a set of images: each page in the document is represented by a page image linked to the pages before and after it, and viewing each page requires loading the web page over and over.

With a desktop PDF viewer (or a browser plugin), the user sees the whole document, which makes it easier to navigate between pages. pdf.js tries to do the same in the browser, so users get a better user interface while reading PDF documents online.


While the current implementation requires some server-side processing, pdf.js loads the original PDF file and renders it in a browser canvas, so implementing it won't require additional changes on the server, and it can work side by side with the current PDF extension, which I feel most users dislike.
Comment 4 Greg Grossmeier 2013-08-15 20:11:08 UTC
Personal opinion: this might be good as a "both" type solution. PdfHandler for the current situation that some like (and can be incrementally improved upon, UI-wise) in addition to a "view full PDF in-browser" or somesuch.

Really, since I run Fx, it's all the same to me. When I click on a .pdf "the right thing" happens (pdf.js loads it in-browser).
Comment 5 Bartosz Dziewoński 2013-08-15 20:39:00 UTC
This is a bad idea. pdf.js takes ages to load (it's simply a lot of code) and even longer to actually render anything (even optimized JS is not fast enough).

Not everybody in the world has access to the same technology, and what is fine in San Francisco might not be appropriate in Eastern Europe or Africa.

(Although I admit, once it finally loaded after a few minutes I was pleasantly surprised by its responsivity in the demo you linked.)
Comment 6 Tomer Cohen 2013-08-15 20:52:06 UTC
(In reply to comment #5)
> This is a bad idea. pdf.js takes ages to load (it's simply a lot of code) and
> even longer to actually render anything (even optimized JS is not fast
> enough).
I'm not sure what you're talking about. Here it loads quite well, and note that if Wikipedia has better caching directives than the limited caching available on GitHub, the viewer can load its own resources from the local browser cache, so only the first document will be delayed by downloading the viewer code.

> Not everybody in the world has access to the same technology, and what is
> fine in San Francisco might not be appropriate in Eastern Europe or Africa.
Only if they have an old, outdated browser or very limited Internet access. Also note that pdf.js should work well on mobile devices, which don't always have a native PDF viewer installed, making reading PDF documents very challenging (and Google's online PDF viewer, which they link from their own applications, sucks).
 
> (Although I admit, once it finally loaded after a few minutes I was
> pleasantly surprised by its responsivity in the demo you linked.)

Please also remember that since it is HTML-based, it should be possible to customize the viewer's user interface so that its look and feel matches the MediaWiki theme.

(In reply to comment #4)
> Personal opinion: this might be good as a "both" type solution. PdfHandler
> for the current situation that some like (and can be incrementally improved
> upon, UI-wise) in addition to a "view full PDF in-browser" or somesuch.
> 
> Really, since I run Fx, it's all the same to me. When I click on a .pdf "the
> right thing" happens (pdf.js loads it in-browser).

Demo of the current PDF extension implementation on Wikimedia Commons: https://commons.wikimedia.org/w/index.php?title=File%3AMultilingual-commons.pdf

The download link is located below the page image and labeled "Full resolution" instead of something more meaningful. 



Steps to reproduce:
a. Click on the following link: https://commons.wikimedia.org/w/index.php?title=File%3AMultilingual-commons.pdf
b. Try to read the few slides in the presentation. 


Actual result:
Note that you have to load another page in order to read the next slide, and each slide contains only a few words.


Expected result:
Read the content of all slides without page reloads, and view it in full-screen/presentation mode for best results.

Try to open the PDF file in Firefox. Make sure that Firefox is set to preview PDF files in the browser (Options/Preferences → Applications → Search: PDF → Preview in Firefox).
Comment 7 Bartosz Dziewoński 2013-08-19 07:47:08 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > This is a bad idea. pdf.js takes ages to load (it's simply a lot of code) and
> > even longer to actually render anything (even optimized JS is not fast
> > enough).
> I'm not sure what you're talking about. Here it loads quite well, and note
> that if Wikipedia will have better caching directives than the limited
> caching available on Github - you can get it loads its own resources from
> the local browser cache so only the first document will be delayed because
> of downloading the viewer code.

Define "here"? I live in Poland and use a 2005 laptop on a 1 Mbps connection. The performance of the demo was not awful, but not smooth either, and I'm certainly not at the very end of the spectrum.
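
To put a 1 Mbps connection in perspective, here is a rough back-of-the-envelope sketch of how long a first, uncached download of viewer code would take. The helper `transferSeconds` is hypothetical, and the 1,000,000-byte payload is a made-up round number, not a measurement of pdf.js. The estimate ignores latency, compression, and the time the browser needs to parse the script and render pages.

```typescript
// Sketch only: idealized transfer time for a payload over a given link speed.
function transferSeconds(payloadBytes: number, linkMegabitsPerSecond: number): number {
  if (linkMegabitsPerSecond <= 0) {
    throw new RangeError("link speed must be positive");
  }
  const payloadBits = payloadBytes * 8;               // 8 bits per byte
  const bitsPerSecond = linkMegabitsPerSecond * 1e6;  // 1 Mbps = 1,000,000 bit/s
  return payloadBits / bitsPerSecond;
}

// Hypothetical 1,000,000-byte payload on the 1 Mbps link mentioned above.
console.log(transferSeconds(1000000, 1)); // 8
```

With browser caching, this cost would be paid only on the first visit, which is the point made in comment 6. Rendering speed on older hardware is a separate cost that caching does not remove.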


> Demo of current PDF extension implementation over Wikipedia:
> https://commons.wikimedia.org/w/index.php?title=File%3AMultilingual-commons.pdf
> 
> The download link is located below the page image and labeled "Full
> resolution" instead of something more meaningful.

Yeah, good point. I filed separate bug 53017 about this.


> Steps to reproduce:
> a. Click on the following link:
> https://commons.wikimedia.org/w/index.php?title=File%3AMultilingual-commons.pdf
> b. Try to read the few slides in the presentation.
> 
> 
> Actual result:
> Note that you have to load another page in order to read the next slide,
> which contain only few words each.
> 
> 
> Expected result:
> Read the whole slides content without page reloads, see it in a full
> screen/presentation mode for best results. 

This has actually just been fixed (and will be deployed soon). Page reload is no longer necessary, but the interface isn't perfect yet. See bug 40207.
