[reportlab-users] Writing smaller image-only PDFs
Robin Becker
robin at reportlab.com
Thu Feb 9 05:11:58 EST 2006
Nicholas Watmough wrote:
> Each JPEG is about 0.5MB, so the combined size would be about 5.5MB.
>
> However, I tried saving the JPEGs individually through the commercial
> tool (Omnipage), and the JPEGs were the same size as when saved through
> Python. But the image-only PDF produced by Omnipage was 0.4MB, and the
> one produced through reportlab was 7.8MB.
>
> Maybe there is some way to reduce the JPEG file size?
>
> Nick
>
>
Could it be that your docs are only black and white? A clever tool might
recognize that and convert the images accordingly. I'm fairly sure we try to
respect the image properties, i.e. we check for gray/RGB/CMYK, so we don't do
any such conversion.
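
One rough way to test the black/white idea is to measure how many samples of
a page sit near pure black or pure white. The sketch below is only an
illustration, not anything reportlab does: the function names, the margin and
the threshold are all made up for the example, and it works on raw 8-bit
grayscale samples (with PIL, Image.open(path).convert("L").tobytes() would be
one way to obtain them). A page that passes could be stored at 1 bit per pixel
instead of 8 or 24, before any compression is applied.

```python
def bilevel_fraction(gray: bytes, margin: int = 32) -> float:
    """Fraction of 8-bit gray samples within `margin` of black (0) or white (255)."""
    if not gray:
        raise ValueError("no pixel data")
    near_extremes = sum(1 for v in gray if v <= margin or v >= 255 - margin)
    return near_extremes / len(gray)


def looks_bilevel(gray: bytes, margin: int = 32, threshold: float = 0.98) -> bool:
    """Heuristic only: True when nearly all samples are close to black or white."""
    return bilevel_fraction(gray, margin) >= threshold


# Synthetic stand-ins for real scans (hypothetical data, not measurements).
text_page = bytes([250] * 9500 + [5] * 500)   # light paper with dark ink
photo_page = bytes(range(256)) * 40           # full tonal ramp

print(looks_bilevel(text_page), looks_bilevel(photo_page))
```

Real scans are noisier than this, so the margin and threshold would need
tuning against actual pages.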
Since JPEG is native to PDF, we apply only ASCII85 encoding, which turns the
stream contents into printable ASCII. I think we could save a bit by not doing
that, but not a huge amount. JPEGs are already compressed, and we have to
specify DCTDecode as well in the image filters.
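
To put a number on the ASCII85 cost: the encoding writes 5 ASCII characters
for every 4 input bytes, so it adds roughly 25%. The sketch below checks that
with the Python standard library's base64.a85encode, not reportlab's own
encoder; the payload is synthetic and the variable names are only for
illustration.

```python
import base64

# Synthetic payload standing in for JPEG data. It contains no zero bytes, so
# the 'z' shortcut that ASCII85 uses for all-zero groups never applies, and
# its length is a multiple of 4, so no padding is involved.
payload = bytes(range(1, 256)) * 1600          # 408,000 bytes

encoded = base64.a85encode(payload)
ratio = len(encoded) / len(payload)

print(len(payload), len(encoded), ratio)
```

Applied to the figures in the quoted message, about 5.5MB of JPEG data would
grow to about 6.9MB after ASCII85 encoding.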
Perhaps they're tweaking the jpeg parameters to allow something smaller.
Alternatively a smart scanner tool could actually do OCR, but I suspect they
don't unless you ask for it.
Have you tried extracting the images from the Omnipage output to see how they
compare with the inputs?
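
Short of a full extraction, a quick first look is to count which stream
filter names appear in each PDF. The sketch below is a crude byte scan, not a
PDF parser: count_filters is a made-up helper, the counts include non-image
streams, and an encrypted or unusually written file could hide names from it.
Seeing /CCITTFaxDecode or /JBIG2Decode in the Omnipage file would suggest
bilevel compression rather than JPEG.

```python
from collections import Counter

# Standard PDF stream filter names.
FILTER_NAMES = (
    b"DCTDecode", b"CCITTFaxDecode", b"JBIG2Decode", b"JPXDecode",
    b"FlateDecode", b"LZWDecode", b"RunLengthDecode",
    b"ASCII85Decode", b"ASCIIHexDecode",
)


def count_filters(pdf_bytes: bytes) -> Counter:
    """Count occurrences of each filter name in the raw bytes of a PDF."""
    counts = Counter()
    for name in FILTER_NAMES:
        n = pdf_bytes.count(b"/" + name)
        if n:
            counts[name.decode("ascii")] = n
    return counts


# Hand-written dictionary fragments for illustration, not real tool output.
jpeg_style = b"<< /Subtype /Image /Filter [ /ASCII85Decode /DCTDecode ] >>"
fax_style = b"<< /Subtype /Image /Filter /CCITTFaxDecode >>"

print(count_filters(jpeg_style))
print(count_filters(fax_style))

# On a real file (the filename here is hypothetical):
# with open("omnipage_output.pdf", "rb") as f:
#     print(count_filters(f.read()))
```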
--
Robin Becker