wildcard.pdfpal 0.7b6

PDF Thumbnail generation, OCR indexing and extra views integrated with


This package provides integrations for PDF-heavy web sites.

  • Generates thumbnails from PDFs
  • Adds a folder view for PDFs so it can use the generated thumbnails
  • Adds OCR for PDF indexing
  • Everything is configurable, so you can choose not to use thumbnail generation or OCR
  • Ability to create searchable PDFs with HOCR
  • Use the @@async-monitor URL to monitor asynchronous jobs that have yet to run


OCR requires Ghostscript and Tesseract to be installed. Just use your package manager to install these packages:

# sudo apt-get install ghostscript tesseract-ocr

This will install Tesseract 2, not Tesseract 3.
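As a quick sanity check after installing, the following minimal sketch reports which of the required executables are missing from PATH. It is not part of wildcard.pdfpal. The function name missing_tools is hypothetical, and the sketch assumes the packages above provide executables named gs and tesseract, as they usually do. It needs Python 3.3 or later for shutil.which.

```python
import shutil


def missing_tools(tools=("gs", "tesseract")):
    """Return the names in `tools` that cannot be found on PATH.

    The default names are an assumption: Ghostscript normally installs
    an executable called `gs`, and Tesseract one called `tesseract`.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing executables: " + ", ".join(missing))
    else:
        print("Ghostscript and Tesseract were found on PATH.")
```

This only confirms that the executables exist. It does not tell you which Tesseract version was installed.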

Searchable PDFs

Requires an svn checkout of Tesseract version 3.01 or 3.00 with the hocr configuration in place. Take a look at this thread to find out how to configure hocr.

In addition, you’ll need exactimage, pdftk and libtiff-tools installed:

# sudo apt-get install exactimage pdftk libtiff-tools

If you do not want to use the latest Tesseract version, you will have to add this to your instance's declaration:

environment-vars += AUTHORIZE_OLD_TESSERACT_VERSION true

Plone 3

  • Requires hashlib


You can convert all PDF files at once by calling the URL @@queue-up-all.
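A view such as @@queue-up-all is called by appending it to the URL of an object in the site. The sketch below only builds that URL. The helper name view_url and the address http://localhost:8080/Plone are hypothetical, and it is an assumption that the view is called on the site root; this README does not say which object it should be called on.

```python
def view_url(context_url, view_name):
    """Join a context URL and a browser view name into '<context>/@@<view>'.

    Leading '@' characters on the view name are accepted and normalised,
    so both 'queue-up-all' and '@@queue-up-all' give the same result.
    """
    return context_url.rstrip("/") + "/@@" + view_name.lstrip("@")


if __name__ == "__main__":
    site = "http://localhost:8080/Plone"  # hypothetical site address
    print(view_url(site, "queue-up-all"))
    print(view_url(site, "@@async-monitor"))
```

Opening the first URL in a browser queues up the conversion. The second is the @@async-monitor view mentioned in the feature list, which shows jobs that have yet to run.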


0.7b6 ~ 2012-04-20

  • fix uninstall

0.7b5 ~ 2012-04-19

  • do not run conversion if documentviewer is installed [vangheem]
  • add better uninstall support [vangheem]

0.7b4 ~ 2012-04-09

  • fix image url for album view. [vangheem]

0.7b3 ~ 2012-04-05

  • fix content type spec for thumbnail response [vangheem]
  • display image thumb urls in album view [vangheem]

0.7b2 ~ 2011-04-12

  • more checks on reading files [vangheem]
  • provide button to manually index document [vangheem]
  • add ability to split pdf up into multiple PDFs [vangheem]

0.7b1 ~ 2011-01-06

  • fixes for quality and size issues [vangheem]

0.6b2 ~ 2011-01-04

  • fix async monitor view to work with = 1.0. It changed the order of some args in the job. [vangheem]

0.6b1 ~ 2011-01-04

  • added ability to make PDFs searchable and make it work seamlessly if wc.pageturner is installed so flex paper is created with the searchable PDF version.

0.5b5 ~ 2010-12-07

  • did not conditionally import

0.5b4 ~ 2010-12-06

  • better info on async monitor
  • only reindex searchabletext when doing OCR so the modification date on the object does not get set.
  • make sure to catch exceptions so it doesn’t leave around files after a bad conversion
  • add colorbox for pdf folder view

0.5b3 ~ 2010-12-02

  • add ability to queue up all pdf files

0.5b2 ~ 2010-12-02

  • fix async monitor view

0.5b1 ~ 2010-12-02

  • Initial release