[reportlab-team] [reportlab-users] reportlab and CMYK images, take 2

Sun, 5 Oct 2003 21:27:54 +0100

> Ok, I've done some research and testing, and I think I'd like to do the
> following:
>
> - rely on PIL as much as possible. That means for instance not to use
> pdfutils.readJPEGInfo().

There are two problems with this.  First, a C compiler is often
not available at a customer site, so PIL cannot always be built
there. We at least have the ability to handle JPEGs natively at
the moment.

Second, we're adding Java support.  The current CVS code can
import images in Jython, wrapping up java.awt.image instead
of PIL.  Whatever we add, we ought to make sure we add
in the Java wrapper too.

In other words, PIL should be completely encapsulated.
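
To make the encapsulation idea concrete, here is a minimal sketch, not the
current CVS code. It assumes a hypothetical ImageSource facade and a
hypothetical backend object with a size(data) method (a PIL wrapper on
CPython, a java.awt.image wrapper on Jython). The pure-Python JPEG header
reader is written from the JPEG marker layout and only stands in for what
pdfutils.readJPEGInfo() does today.

```python
import struct

# Start-of-frame markers; each is followed by precision, height, width, components.
_SOF_MARKERS = frozenset(
    [0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7, 0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF]
)


def read_jpeg_info(data):
    """Return (width, height, components) from JPEG bytes without any C extension."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected a marker at offset %d" % pos)
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before the real marker
            pos += 1
            continue
        if marker == 0x01 or 0xD0 <= marker <= 0xD7:  # markers with no payload
            pos += 2
            continue
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in _SOF_MARKERS:
            if pos + 10 > len(data):
                raise ValueError("truncated start-of-frame segment")
            _precision, height, width, components = struct.unpack(
                ">BHHB", data[pos + 4:pos + 10]
            )
            return width, height, components
        pos += 2 + length  # skip this segment (length includes its own two bytes)
    raise ValueError("no start-of-frame marker found")


class ImageSource:
    """Hypothetical facade: callers never import PIL or java.awt.image directly."""

    def __init__(self, data, backend=None):
        self._data = data
        self._backend = backend  # assumed to expose size(data) -> (width, height)

    def size(self):
        try:
            width, height, _components = read_jpeg_info(self._data)
            return width, height
        except ValueError:
            if self._backend is None:
                raise
            return self._backend.size(self._data)
```

With this shape, a site without PIL still gets JPEG support, and anything
added for PIL has an obvious place to be mirrored in the Java wrapper.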

> When testing the current code, I found out that some CMYK jpgs
> would come out inverted in the pdf. Funnily, this is a well-known,
> long-standing Acrobat quirk, even documented by the JPEG group, and the
> PIL guys have a workaround for that, but readJPEGInfo isn't capable of
> extracting the needed information.
>
> - use PIL to analyze a given image to find out what format it has (i.e.
> jpg, tif, gif, etc.), and code specialized "converter" methods, where
> possible, which return, uhm, something.
> At first I think I'll settle with the tuple (imagedata, imgwidth,
> imgheight), like jpg_imagedata() and PIL_imagedata are doing now. But I
> feel that should be more generalized, i.e. the pdf object header should
> be built later in the process.

> In that context, what is PDFImage.format() about?

This writes out the image object in the format PDF requires.
Not just the stream of compressed pixels, but also the dictionary
beforehand describing the color space, bits per pixel,
width, height etc.  All our objects have a format() method
which writes out a stream of stuff to go in the PDF file.

(A PDFDocument object has to be passed as an argument so
that objects involving cross-references can be resolved,
but it makes no difference to images).
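
Roughly, the output of format() for an image has the shape sketched below.
This is an illustration only, not the actual PDFImage code:
format_image_xobject and its parameters are made-up names, and the
PDFDocument argument is left out since images do not need it. The optional
/Decode array is one PDF-level way to flip the sample values of an inverted
CMYK JPEG; it is shown as an assumption, not as the workaround PIL uses.

```python
_COMPONENTS = {"DeviceGray": 1, "DeviceRGB": 3, "DeviceCMYK": 4}


def format_image_xobject(width, height, color_space, bits, data,
                         filter_name=None, invert=False):
    """Return the bytes of an image XObject: its dictionary, then its stream."""
    n_components = _COMPONENTS[color_space]
    entries = [
        "/Type /XObject",
        "/Subtype /Image",
        "/Width %d" % width,
        "/Height %d" % height,
        "/ColorSpace /%s" % color_space,
        "/BitsPerComponent %d" % bits,
    ]
    if filter_name:
        # e.g. DCTDecode for JPEG data passed through unchanged
        entries.append("/Filter /%s" % filter_name)
    if invert:
        # Map each component's range from [0 1] to [1 0]
        entries.append("/Decode [%s]" % " ".join(["1 0"] * n_components))
    entries.append("/Length %d" % len(data))
    header = "<< " + " ".join(entries) + " >>"
    return header.encode("ascii") + b"\nstream\n" + data + b"\nendstream"
```

Building the dictionary at this late stage is also what would let a
converter return only pixel data plus a few attributes, as you suggest.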

>
> - In addition to the specialized "converter" methods, offer a generic
> converter method, which uses PIL to convert the image to "raw" format,
> but preserves the separation (i.e. never convert from CMYK->RGB or RGB
> ->CMYK). This conflicts with what you wrote above, but I don't think
> it's wise to offer that conversion.
> RGB->CMYK is dependent on the output medium, and Acrobat Reader is
> capable of displaying CMYK images, so I don't see the need for conversion.

I understand that you want to read in CMYK images and display
them as such, which makes a lot of sense.  But I think it's
also strange to let people mix RGB and CMYK models in
one document.  If we moved to proper support for professional
printing, I think it might be better to 'declare a color palette'
somehow.  So if you say you are doing a CMYK document,
all images and colors get converted to CMYK, or you are only
permitted to use those colors.
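
As a sketch of what 'declaring a color palette' might mean, assuming a
hypothetical DocumentColorModel object that every color and image passes
through: in strict mode foreign colors are rejected, otherwise they are
converted. The RGB->CMYK formula here is the naive device-independent one,
used only as a placeholder; as you point out, a real conversion depends on
the output medium.

```python
def naive_rgb_to_cmyk(r, g, b):
    """Naive RGB -> CMYK for components in 0..1; ignores the output device."""
    k = 1.0 - max(r, g, b)
    if k >= 1.0:
        return (0.0, 0.0, 0.0, 1.0)  # pure black
    scale = 1.0 - k
    return ((1.0 - r - k) / scale,
            (1.0 - g - k) / scale,
            (1.0 - b - k) / scale,
            k)


class DocumentColorModel:
    """Hypothetical document-wide policy: one color model per document."""

    def __init__(self, model="CMYK", strict=False):
        if model not in ("RGB", "CMYK"):
            raise ValueError("unknown color model %r" % model)
        self.model = model
        self.strict = strict

    def admit(self, space, values):
        """Return values in the document's model, converting or rejecting."""
        if space == self.model:
            return tuple(values)
        if self.strict:
            raise ValueError("%s color not permitted in a %s document"
                             % (space, self.model))
        if space == "RGB" and self.model == "CMYK":
            return naive_rgb_to_cmyk(*values)
        raise ValueError("no conversion from %s to %s in this sketch"
                         % (space, self.model))
```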

> - Mid term goal: Unify image XObjects and inline images.

What do you mean?  One method, with an argument to say if
it goes inline or externally?  That would make sense although
a lot of the code to produce them is shared already.
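
If that is the idea, it might look something like the sketch below, assuming
a hypothetical image_operators() helper (not an existing canvas method). An
inline image goes straight into the content stream between BI, ID and EI
with abbreviated keys; an external one is referenced by name with Do and its
description is handed back to be written out as a separate object.

```python
_INLINE_CS = {"DeviceGray": "G", "DeviceRGB": "RGB", "DeviceCMYK": "CMYK"}


def image_operators(name, width, height, color_space, bits, data, inline=False):
    """Return (content_stream_bytes, xobject_description_or_None)."""
    if inline:
        header = "BI /W %d /H %d /CS /%s /BPC %d ID " % (
            width, height, _INLINE_CS[color_space], bits)
        return header.encode("ascii") + data + b" EI", None
    xobject = {
        "Width": width,
        "Height": height,
        "ColorSpace": color_space,
        "BitsPerComponent": bits,
        "data": data,
    }
    # The caller would register xobject under `name` in the page resources.
    return ("/%s Do" % name).encode("ascii"), xobject
```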

Thanks,

Andy