[reportlab-users] Writing smaller image-only PDFs

Nicholas Watmough nickw at deakin.edu.au
Fri Feb 10 02:52:56 EST 2006


I tried creating the PDF using PNGs. When saved, the PNGs were 
approximately 100KB or less each. But when put into a PDF, the PDF was 
approximately 2MB. GIFs were about 800KB each, but the PDF was still 2MB.

Not sure why approx. 1.1MB of PNGs create a 2MB PDF. Seems something is 
going on.

Not sure what else I can try.

Nathan wrote:
> On 2/9/06, Chris Jerdonek <jerdonek at gmail.com> wrote:
>   
>>> Date: Thu, 09 Feb 2006 17:27:50 +1100
>>> From: Nicholas Watmough <nickw at deakin.edu.au>
>>> Subject: Re: [reportlab-users] Writing smaller image-only PDFs
>>> To: Support list for users of Reportlab software
>>>       <reportlab-users at reportlab.com>
>>> Message-ID: <43EAE0E6.1010804 at deakin.edu.au>
>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>
>>> Each JPEG is about 0.5MB, so the combined size would be about 5.5MB.
>>>
>>> However, I tried saving the JPEGs individually through the commercial
>>> tool (Omnipage), and the JPEGs were the same size as when saved through
>>> Python. But the image-only PDF produced by Omnipage was 0.4MB, and the
>>> one produced through reportlab was 7.8MB.
>>>       
>> How big is the Omnipage file if you first save them as JPEGs and then
>> create the Omnipage PDF from that?  I bet it will also be big.  It
>> sounds like all the compression is happening in the scanning process.
>>
>> --Chris
>>     
>
> You guys are driving me nuts, beating around the issues.  Class is now
> in session:
>
> JPEGs store continuous-tone (full-color or grayscale) information -
> always; there is no 2-color mode.  The lossiness vs.
> quality of a JPEG file is configurable, but the resulting file is
> about as compressed as the file can get while preserving the chosen
> amount of full-color information.  You can't take something that's
> already been compressed, and compress it much further with a lossless
> format [compressed formats look like random bits, and the more random
> a bit stream is, the less you can compress it without losing info]. 
> The JPEG format is optimized for full-color photographs, and stores the
> full-color information in a highly compressed format.  It's really,
> really bad [file-size-wise] for images of mostly one color.
>
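The point about recompressing already-compressed data can be checked with a short sketch. It uses Python's standard zlib module as a stand-in for "a lossless format"; the synthetic four-gray-level data and the fixed seed are assumptions made only so the run is repeatable, not anything taken from the scans discussed in this thread.

```python
import random
import zlib

# Synthetic stand-in for image data: 200,000 bytes drawn from four gray levels.
rng = random.Random(0)  # fixed seed so the run is repeatable
raw = bytes(rng.choice((0, 85, 170, 255)) for _ in range(200_000))

once = zlib.compress(raw, 9)    # first lossless pass
twice = zlib.compress(once, 9)  # second pass over already-compressed bytes

print(f"raw:   {len(raw)} bytes")
print(f"once:  {len(once)} bytes")
print(f"twice: {len(twice)} bytes")
```

The first pass should shrink the data considerably, while the second pass should leave the size essentially unchanged, since the output of the first pass already looks close to random.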
> Bitmaps (BMPs) are a simple, lossless format.  You can set bitmaps to
> accept only grayscale or only black & white, which will affect file
> size a bit.  Bitmap files aren't usually compressed at all!  The
> format is basically....
> "WHITEPIXEL WHITEPIXEL WHITEPIXEL WHITEPIXEL BLACKPIXEL WHITEPIXEL
> WHITEPIXEL WHITEPIXEL BLUEPIXEL etc."
> External compression programs can compress bitmaps significantly. Even
> compressed, the bitmaps are much larger than some other image formats
> since they preserve all information for every pixel separately.
>
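To put rough numbers on the uncompressed bitmap case, here is a small arithmetic sketch. The page dimensions (an A4 sheet at 300 dpi, about 2480 x 3508 pixels) are an assumption for illustration, and the helper name raw_bitmap_bytes is hypothetical. It counts pixel data only, with each row padded to a multiple of 4 bytes as BMP does, and ignores the file header and palette.

```python
def raw_bitmap_bytes(width_px, height_px, bits_per_pixel):
    """Size of uncompressed pixel data, with each row padded to 4 bytes as in BMP."""
    row_bytes = (width_px * bits_per_pixel + 31) // 32 * 4
    return row_bytes * height_px

# Assumed page: A4 scanned at 300 dpi, roughly 2480 x 3508 pixels.
WIDTH, HEIGHT = 2480, 3508

for label, bpp in (("24-bit color", 24), ("8-bit grayscale", 8), ("1-bit black & white", 1)):
    size = raw_bitmap_bytes(WIDTH, HEIGHT, bpp)
    print(f"{label:20s} {size / 1_000_000:6.2f} MB")
```

Under these assumptions the pixel data comes to about 26.10 MB in 24-bit color, 8.70 MB in grayscale, and 1.09 MB in black & white, so the choice of color depth alone changes the size by a factor of up to 24 before any compression is applied.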
> PDFs (I know less about PDF internals than image format internals), as
> far as I understand, just embed whatever you give them, be it text,
> image files, or vectors--although the format supports a basic lossless
> compression algorithm that it can use internally.  As far as I know,
> any specific "image compression", especially lossy image compression,
> will have to be done before you give the image to the PDF generator. 
> JPEGs will hardly compress at all with a lossless algorithm, because
> they are already compressed.  BMPs will benefit hugely from lossless
> compression, because they are hardly compressed at all to begin
> with.
>
> In the case originally mentioned in this thread, it's extremely likely
> that Omnipage is _not_ just a PDF generator.  First, Omnipage is
> converting the image to a different image format (2-color GIF, for
> example), which throws out the color information, and is highly
> compressed.  The small images from the image conversion subcomponent
> are then given to whatever subcomponent generates the PDFs, and voila!
>  Small PDF.
>
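The effect of throwing out color information before compressing can be sketched with the standard library alone. Everything here is an assumption for illustration: the page is synthetic (a hypothetical striped "text" layout with uniform sensor noise), the 128 cutoff is arbitrary, and zlib stands in for the lossless compression a PDF can apply internally. It is not a reconstruction of what Omnipage actually does.

```python
import random
import zlib

WIDTH, HEIGHT = 800, 1000  # assumed page size in pixels; width is a multiple of 8
rng = random.Random(1)     # fixed seed so the run is repeatable

def is_ink(x, y):
    """Hypothetical layout: striped 'text' in every fourth band of 10 rows."""
    return (y // 10) % 4 == 0 and (x // 6) % 3 == 0

# 8-bit grayscale "scan": dark ink and light paper, each with sensor noise.
gray = bytes(
    (30 if is_ink(x, y) else 225) + rng.randint(-15, 15)
    for y in range(HEIGHT)
    for x in range(WIDTH)
)

def threshold_and_pack(pixels, cutoff=128):
    """Reduce to two colors and pack 8 pixels per byte (1 = white)."""
    out = bytearray()
    for i in range(0, len(pixels), 8):
        byte = 0
        for value in pixels[i:i + 8]:
            byte = (byte << 1) | (1 if value >= cutoff else 0)
        out.append(byte)
    return bytes(out)

bilevel = threshold_and_pack(gray)

z_gray = zlib.compress(gray, 9)
z_bilevel = zlib.compress(bilevel, 9)
print(f"grayscale: {len(gray)} raw, {len(z_gray)} compressed")
print(f"two-color: {len(bilevel)} raw, {len(z_bilevel)} compressed")
```

The noisy grayscale version should stay large even after compression, because the noise is information the compressor has to keep, while the thresholded two-color version should shrink to a tiny fraction of that. Real scans will behave less extremely than this synthetic page.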
> Convert your own images to something nice and small, and reportlab
> ought to generate a small PDF for you.  End of story.
>
> Yes, this is a high-level view.  Yes, I could sure be wrong on some
> detail specifics (feel free to correct me if you _know_).  No, I'm not
> trying to offend anybody.  Your mileage may vary.  Void in the
> following states: denial, insanity, police.
>
> ~ Nathan
> _______________________________________________
> reportlab-users mailing list
> reportlab-users at reportlab.com
> http://two.pairlist.net/mailman/listinfo/reportlab-users
>
>
>   

