Last modified: 2011-01-25 01:05:28 UTC
For the past few days, attempts to purge large DjVu files have failed, so the text layer of those files is not accessible. After a successful purge, creating a page should show the text layer here: http://fr.wikisource.org/wiki/Livre:Croiset_-_Histoire_de_la_litt%C3%A9rature_grecque,_t4.djvu
The bug can be seen with the following files: http://commons.wikimedia.org/wiki/File:Burnouf_-_Lotus_de_la_bonne_loi.djvu http://commons.wikimedia.org/wiki/File:Croiset_-_Histoire_de_la_litt%C3%A9rature_grecque,_t4.djvu http://commons.wikimedia.org/wiki/File:Michaud_-_Biographie_universelle_ancienne_et_moderne_-_1843_-_Tome_10.djvu The first two were uploaded recently, and their DjVu text layer has not been extracted because of this bug; or perhaps it is the text layer extraction itself that causes the bug. The last file was uploaded a long time ago, when it could still be purged, so its DjVu text was extracted successfully and is still available in the metadata. I tested the first file on my machine with a recent MediaWiki install and it worked fine: the file can be purged and the text layer is extracted correctly.
This also happened with http://commons.wikimedia.org/wiki/File:Uppslagsbok_f%C3%B6r_alla_1910.djvu
Probably the same issue: http://en.wikisource.org/wiki/User:Billinghurst
When I try to "purge" the large djvu file from Commons, I get an HTTP 500 internal server error response after exactly 30 seconds. Why is purge taking so long? It should just remove old stuff (supposedly a quick operation), and then schedule a queued job for reindexing (a slower operation, depending on the job queue length).
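To make the expected behaviour concrete, here is a minimal sketch in Python of the two-step purge described above. It is not MediaWiki's actual code: the class `PurgeSketch`, its attributes, and the `extract_text` callback are all hypothetical names invented for illustration. It only shows the idea that the request drops stale data and enqueues the slow work instead of running it inline.

```python
from collections import deque


class PurgeSketch:
    """Hypothetical model of a purge that defers slow work to a job queue."""

    def __init__(self):
        self.metadata_cache = {}  # file name -> cached metadata (incl. text layer)
        self.job_queue = deque()  # file names waiting for text extraction

    def purge(self, name):
        """Fast path, run inside the web request: drop stale data, enqueue a job."""
        self.metadata_cache.pop(name, None)
        self.job_queue.append(name)

    def run_one_job(self, extract_text):
        """Slow path, run by a job runner outside the request.

        extract_text is an assumed callback that returns the text layer
        for a file name. Returns the processed name, or None if idle.
        """
        if not self.job_queue:
            return None
        name = self.job_queue.popleft()
        self.metadata_cache[name] = {"text": extract_text(name)}
        return name
```

With this split, the time spent in `purge` does not depend on the size of the file; only `run_one_job` does, and it is not subject to the request timeout.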
Purge takes time because of the djvu text layer extraction. The bug should be fixed in r61258.
Reopening this bug because the fix is not live.
Fix has been deployed, but purging still doesn't work.
Purging works fine now. Bryan, which file fails to purge for you?
I got a 403 "Wikimedia has an error" page when trying to purge <http://commons.wikimedia.org/wiki/File:Uppslagsbok_f%C3%B6r_alla_1910.djvu>. Presumably it times out, because the page takes a long while to load.
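For anyone who wants to reproduce this and report the status code together with the elapsed time, a small Python helper along these lines may be useful. It assumes the standard `index.php?title=...&action=purge` entry point under `/w/`; the function names `build_purge_url` and `time_purge` are made up for this sketch, and depending on login state the server may answer with a confirmation form rather than purging directly.

```python
import time
import urllib.error
import urllib.parse
import urllib.request


def build_purge_url(base, title):
    """Build an action=purge URL for a page title (assumes /w/index.php)."""
    query = urllib.parse.urlencode({"title": title, "action": "purge"})
    return f"{base}/w/index.php?{query}"


def time_purge(url, timeout=60.0):
    """Request the URL and return (HTTP status, elapsed seconds).

    HTTP error responses (e.g. 403 or 500) are reported as a status
    instead of raising; a client-side timeout still raises.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            status = resp.status
    except urllib.error.HTTPError as err:
        status = err.code
    return status, time.monotonic() - start
```

Calling `time_purge(build_purge_url("http://commons.wikimedia.org", "File:Uppslagsbok_för_alla_1910.djvu"))` would then give a status and a duration that can be pasted into the report.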
It works for me.