[reportlab-users] Patch: PIL images with ReportLab
robin at reportlab.com
Thu Apr 28 05:53:09 EDT 2005
Sam Hunter wrote:
> Yeah, I think it should be okay.. but I'm not a Java kid by any means :)
> What I'd really like to do is fix the canvas.drawImage() function so
> that it would just accept PIL images.
> It looks like parts of canvas were originally designed to handle PILs,
> but that functionality was broken when the drawImage() function was
> written because it does MD5 sums, and needs the rawdata to do the MD5
> sum with.
> I fixed that problem (canvas.py lines 564-575), but that has exposed
> another issue.
> The canvas.drawImage() function calls the
> PDFImageXObject.loadImageFromSRC() function, which seems to be built for
> loading PIL Image objects, but it is a little messed up because it tries
> to access their format information with
> Normal PIL Image objects don't have a "_image" member, and that
> ".format" is only valid if you haven't changed the PIL image in any way
> (a resize sets it to None).
> I am currently working on a way to have the PIL image convert itself to
> something so that the line "self.loadImageFromJPEG(fp)" (pdfdoc.py line
> 1827ish) doesn't even need to be run. As far as I can tell, all of the
> self.* members that loadImageFromJPEG() sets are available from various
> PIL information functions, and I would imagine that the
> self.streamContent could probably be generated with some kind of PIL
> function as well. If not, the data in imageFile could.
> Does anyone know if these functions are used for things other than PIL
> image types? Has this path through the code been broken at the
> beginning for so long that it is just way out of date, or was it broken
> to make something else work?
> thanks :)
The ImageReader object is the only non-filename image we're supposed to be
passing into the back end. We should not be passing raw images around and
handling them with special-case code everywhere. That's why we have the _image
thing inside it; it is supposed to be private. The reason for the separate
loadImageFrom* functions is that we have potentially different sources.
First off, JPEG is native to PDF and is used even when PIL is not available.
Thus we have a fake attempt to split files on the extension and use
loadImageFromJPEG; converting a JPEG to RGB or CMYK is non-trivial.
For historical reasons we still attempt to support prebuilt .a85 files, which
have been built into a PDF stream format in some way and are available when PIL
isn't; that's via loadImageFromA85.
Otherwise the path is via loadImageFromSRC (which should really be called
loadImageFromImageReader). If any encapsulation is to be done I prefer that it
be restricted to the ImageReader class.
I strongly oppose removing the special JPEG handling. PIL doesn't exist
everywhere and we would be foolish to rely on it. The same is true of attempts
to get PIL to do the formatting into PDF.
Allowing the ImageReader class to accept some specific image object instances is
So far as I know we are using
I'm pretty sure the latter two are only used in a hackish attempt to use a
PIL-opened JPEG in native form. The reason for that is as follows: if we don't
use the native version we have to do a conversion to either RGB or CMYK etc. Of
course, if what you really want to do is read a PIL image, do conversions, and
then use the result via RGB conversion, then this hack is wrong.
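
To make the trade-off concrete, here is a minimal sketch of the decision,
assuming the image exposes PIL-style format and mode attributes (PIL leaves
format as None for images that were not read from a file, e.g. the result of a
resize). StubImage and embedding_strategy are hypothetical names invented for
this example so that it runs without PIL; they are not part of ReportLab or PIL.

```python
class StubImage:
    """Stand-in for a PIL image; models only the attributes used below."""
    def __init__(self, format=None, mode='RGB'):
        self.format = format  # None when the image was not read from a file
        self.mode = mode

def embedding_strategy(image):
    """Choose between native JPEG embedding and a pixel conversion.

    Hypothetical helper for illustration only.
    """
    if getattr(image, 'format', None) == 'JPEG':
        return 'native-jpeg'   # untouched JPEG: pass the file data through
    if image.mode == 'CMYK':
        return 'convert-cmyk'  # modified image that is already CMYK
    return 'convert-rgb'       # anything else goes through RGB

print(embedding_strategy(StubImage(format='JPEG')))
print(embedding_strategy(StubImage()))
```

An image that has been modified no longer reports 'JPEG', so in this sketch it
always takes one of the conversion paths.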
The real pain is that we currently have similar approaches in two places, i.e.
pdfdoc.py/pdfutils and also pdfimages.