[reportlab-users] Writing smaller image-only PDFs
Nicholas Watmough
nickw at deakin.edu.au
Fri Feb 10 02:52:56 EST 2006
I tried creating the PDF using PNGs. When saved, the PNGs were
approximately 100KB or less each. But when put into a PDF, the PDF was
approximately 2MB. GIFs were about 800KB each, but the PDF was still 2MB.
Not sure why approx. 1.1MB of PNGs create a 2MB PDF. Seems something is
going on.
Not sure what else I can try.
Nathan wrote:
> On 2/9/06, Chris Jerdonek <jerdonek at gmail.com> wrote:
>
>>> Date: Thu, 09 Feb 2006 17:27:50 +1100
>>> From: Nicholas Watmough <nickw at deakin.edu.au>
>>> Subject: Re: [reportlab-users] Writing smaller image-only PDFs
>>> To: Support list for users of Reportlab software
>>> <reportlab-users at reportlab.com>
>>> Message-ID: <43EAE0E6.1010804 at deakin.edu.au>
>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>
>>> Each JPEG is about 0.5MB, so the combined size would be about 5.5MB.
>>>
>>> However, I tried saving the JPEGs individually through the commercial
>>> tool (Omnipage), and the JPEGs were the same size as when saved through
>>> Python. But the image-only PDF produced by Omnipage was 0.4MB, and the
>>> one produced through reportlab was 7.8MB.
>>>
>> How big is the Omnipage file if you first save them as JPEGs and then
>> create the Omnipage PDF from that? I bet it will also be big. It
>> sounds like all the compression is happening in the scanning process.
>>
>> --Chris
>>
>
> You guys are driving me nuts, beating around the issues. Class is now
> in session:
>
> JPEGs store full color information - always. The lossiness vs.
> quality of a JPEG file is configurable, but the resulting file is
> about as compressed as the file can get while preserving the chosen
> amount of full-color information. You can't take something that's
> already been compressed, and compress it much further with a lossless
> format [compressed formats look like random bits, and the more random
> a bit stream is, the less you can compress it without losing info].
> The JPEG format is optimized for full-color photographs, and stores the
> full-color information in a highly compressed format. It's really,
> really bad [file-size-wise] for images of mostly one color.
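A minimal sketch of the "already compressed data won't compress much further" point above, using only Python's standard zlib module. The pixel data is a made-up stand-in for a scanned page (mostly white, some black), not anything from the files discussed in this thread. The first lossless pass should shrink it a lot; a second pass over the compressed bytes should save little or nothing.

```python
import random
import zlib

# Hypothetical stand-in for a scanned page: 200,000 one-byte pixels,
# about 95% white (255) and 5% black (0), scattered at random.
rng = random.Random(0)
raw = bytes(255 if rng.random() < 0.95 else 0 for _ in range(200_000))

once = zlib.compress(raw, 9)    # first lossless pass over the raw pixels
twice = zlib.compress(once, 9)  # second pass over already-compressed bytes

print("raw:  ", len(raw), "bytes")
print("once: ", len(once), "bytes")
print("twice:", len(twice), "bytes")
```

The same reasoning would apply to a JPEG placed in a PDF whose streams are then deflated: the deflate step has very little redundancy left to work with.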
>
> Bitmaps (BMP's) are a simple, lossless format. You can set bitmaps to
> accept only grayscale or only black & white, which will affect file
> size a bit. Bitmap files aren't usually compressed at all! The
> format is basically....
> "WHITEPIXEL WHITEPIXEL WHITEPIXEL WHITEPIXEL BLACKPIXEL WHITEPIXEL
> WHITEPIXEL WHITEPIXEL BLUEPIXEL etc."
> External compression programs can compress bitmaps significantly. Even
> compressed, the bitmaps are much larger than some other image formats
> since they preserve all information for every pixel separately.
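To put rough numbers on the uncompressed bitmap case, here is a small sketch of the size arithmetic. The page dimensions are an assumption (US Letter at 300 dpi), not a figure from this thread, and the helper name bmp_pixel_bytes is made up for illustration. It counts only the pixel array, not the file headers or palette.

```python
def bmp_pixel_bytes(width, height, bits_per_pixel):
    """Size in bytes of an uncompressed BMP pixel array (headers excluded).

    BMP pads every row of pixels to a multiple of 4 bytes.
    """
    row_bytes = (width * bits_per_pixel + 31) // 32 * 4
    return row_bytes * height


# Assumed page size: 8.5 x 11 inches at 300 dpi.
WIDTH, HEIGHT = 2550, 3300

for label, bpp in [("24-bit color", 24),
                   ("8-bit grayscale", 8),
                   ("1-bit black & white", 1)]:
    size = bmp_pixel_bytes(WIDTH, HEIGHT, bpp)
    print(f"{label:>20}: {size:>10,} bytes")
```

Under these assumptions, dropping from 24-bit color to 1-bit black & white cuts the raw pixel data by roughly a factor of 24, before any compression is applied.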
>
> PDFs (I know less about PDF internals than image format internals), as
> far as I understand, just embed whatever you give it, be it text,
> image files, or vectors--although it supports a basic lossless
> compression algorithm that it can use internally. As far as I know,
> any specific "image compression", especially lossy image compression,
> will have to be done before you give the image to the PDF-generator.
> JPEGs will hardly compress at all with a lossless algorithm, because
> they are already compressed. BMPs will benefit hugely from lossless
> compression, because they are hardly compressed at all to begin
> with.
>
> In the case originally mentioned in this thread, it's extremely likely
> that Omnipage is _not_ just a PDF generator. First, Omnipage is
> converting the image to a different image format (2-color GIF, for
> example), which throws out the color information, and is highly
> compressed. The small images from the image conversion subcomponent
> are then given to whatever subcomponent generates the PDFs, and voila!
> Small PDF.
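The effect of throwing out color information before compressing can be sketched with the standard library alone. Everything here is an assumption made for illustration: the synthetic "scan" (near-white paper with scanner noise and a few dark bars), the threshold of 128, and the helper names. It is not a description of what Omnipage actually does internally.

```python
import random
import zlib

WIDTH, HEIGHT = 800, 100  # hypothetical scan size; WIDTH is a multiple of 8


def make_noisy_scan(seed=0):
    """Grayscale page: near-white paper with noise, plus dark horizontal bars."""
    rng = random.Random(seed)
    rows = []
    for y in range(HEIGHT):
        if y % 20 < 4:  # dark bar standing in for a line of text
            rows.append(bytes(rng.randrange(0, 16) for _ in range(WIDTH)))
        else:           # paper: light gray values that vary from pixel to pixel
            rows.append(bytes(rng.randrange(240, 256) for _ in range(WIDTH)))
    return b"".join(rows)


def pack_bilevel(gray, threshold=128):
    """Reduce 8-bit grayscale to 1 bit per pixel (1 = white), 8 pixels per byte."""
    packed = bytearray()
    for i in range(0, len(gray), 8):
        chunk = gray[i:i + 8]
        byte = 0
        for value in chunk:
            byte = (byte << 1) | (1 if value >= threshold else 0)
        byte <<= 8 - len(chunk)  # left-align a short final chunk
        packed.append(byte)
    return bytes(packed)


gray = make_noisy_scan()
bilevel = pack_bilevel(gray)

print("grayscale raw:       ", len(gray))
print("grayscale + zlib:    ", len(zlib.compress(gray, 9)))
print("black & white raw:   ", len(bilevel))
print("black & white + zlib:", len(zlib.compress(bilevel, 9)))
```

In this sketch the grayscale data compresses poorly because the noise differs from pixel to pixel, while the thresholded version is both smaller to begin with and far more repetitive, so the lossless step has much more to work with.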
>
> Convert your own images to something nice and small, and reportlab
> ought to generate a small PDF for you. End of story.
>
> Yes, this is a high-level view. Yes, I could sure be wrong on some
> detail specifics (feel free to correct me if you _know_). No, I'm not
> trying to offend anybody. Your mileage may vary. Void in the
> following states: denial, insanity, police.
>
> ~ Nathan