[reportlab-users] Writing smaller image-only PDFs
nickw at deakin.edu.au
Wed Feb 8 23:09:22 EST 2006
I am trying to scan a number of pages, then write the output to a PDF
using reportlab. The size of the PDFs generated is much larger than
would seem necessary, but I'm not sure why. I've tried to reduce the
file size, but it doesn't seem to work.
I am scanning the images with the Python TWAIN module, which returns them
in BMP format. I use the Python Imaging Library (PIL) to open each BMP
and write the PIL object to the PDF using drawInlineImage(). I tried
switching to drawImage(), which required me to wrap the image in an
ImageReader object, but this did not decrease the output PDF file size.
The produced PDF file size was approx 2MB.
I tried to reduce this by using JPEGs: I saved my BMPs as JPEGs
(using a StringIO object), then reopened the JPEGs using PIL and
wrote the PIL object to the PDF using drawInlineImage().
The resulting PDF file size was 7.8MB.
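
The JPEG round trip described above can be sketched as follows. This assumes Python 3 with Pillow, where io.BytesIO plays the role that StringIO played in the original setup. The helper name to_jpeg and the quality value are hypothetical, not taken from the original code.

```python
from io import BytesIO

from PIL import Image


def to_jpeg(page, quality=75):
    """Re-encode a PIL image as JPEG in memory and reopen it with PIL."""
    # JPEG cannot store every PIL mode (palette or alpha, for example),
    # so anything unusual is converted to RGB first.
    if page.mode not in ("L", "RGB", "CMYK"):
        page = page.convert("RGB")
    buf = BytesIO()
    page.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    # The reopened object is again a PIL image, now backed by JPEG data.
    return Image.open(buf)


# Blank placeholder page instead of a real scan (assumption for illustration).
jpeg_page = to_jpeg(Image.new("RGB", (64, 64), "white"))
print(jpeg_page.format, jpeg_page.size)
```

The returned object is what would then be handed to drawInlineImage(), as in the description above.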
When I turned on page compression, the file size was reduced to 6.8MB.
The PDF I am generating only has 11 pages (11 images).
When I do the same thing with a commercial tool (Omnipage), using it for
both the scanning and the production of the image-only PDF, the resulting
file size is 0.4MB.
While I realise that an open source tool may not be able to achieve the
same reduction level as a commercial tool, the file sizes I am getting
using Python seem too large, particularly as I am getting larger output
for JPEGs than for BMPs.
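
To put the figures above side by side, here is a small sketch that works out the size per page and the ratio to the Omnipage result. It assumes 1MB = 1024KB and treats the reported sizes as exact, although at least the 2MB figure is approximate; the function names are hypothetical.

```python
PAGES = 11  # pages (images) in the generated PDF

# Reported file sizes in MB, taken from the description above.
SIZES_MB = {
    "BMP, drawInlineImage": 2.0,
    "JPEG, drawInlineImage": 7.8,
    "JPEG, page compression": 6.8,
    "Omnipage": 0.4,
}


def per_page_kb(total_mb, pages=PAGES):
    """Average size of one page in KB, assuming 1MB = 1024KB."""
    return total_mb * 1024 / pages


def size_ratio(total_mb, baseline_mb=SIZES_MB["Omnipage"]):
    """How many times larger a file is than the baseline."""
    return total_mb / baseline_mb


for label, mb in SIZES_MB.items():
    print(f"{label}: {per_page_kb(mb):.0f}KB per page, "
          f"{size_ratio(mb):.1f}x the Omnipage size")
```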
Does anyone know how I can reduce the file size of my produced PDFs? I
suspect I may be doing something wrong with the JPEGs, but I'm not really sure.