[reportlab-users] Experimental early serializing pdfdoc.py for

Fri Apr 15 07:50:20 EDT 2005

Thomas Blatter wrote:
....
I don't really mind who cleans up the patch. It's almost workable now, but needs 
some refinement. Would the way to do this be to make a branch and give you write 
access?

There are a number of issues related to compatibility. First off, we need to know 
how to deal with multiple-pass documents and the various file types. I think it's 
simplistic to assume we always have true file-like objects; writing to a socket, 
for example, doesn't allow tell, truncate and so on.
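
To illustrate the point about tell: a rough sketch, not part of the patch, of 
how an early serializer could keep its own byte count so that the target only 
needs a write method. CountingWriter and WriteOnly are hypothetical names made 
up for this example, and truncate is deliberately not covered.

```python
class CountingWriter:
    """Hypothetical wrapper: tracks the byte offset itself, so the
    underlying stream only has to provide write()."""

    def __init__(self, stream):
        self._stream = stream
        self._pos = 0

    def write(self, data):
        self._stream.write(data)
        self._pos += len(data)

    def tell(self):
        # Offset relative to the first byte written through this wrapper.
        return self._pos


class WriteOnly:
    """Stand-in for a socket-like target: write() only, no tell/seek/truncate."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(data)


sink = WriteOnly()
out = CountingWriter(sink)
out.write(b"%PDF-1.3\n")
first_object_offset = out.tell()   # offset to record for the xref table
out.write(b"1 0 obj\n<< >>\nendobj\n")
```

This only helps with offsets; anything that needs to go back and rewrite 
earlier bytes still requires a real seekable file.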

Anyhow, so far as I can tell, with minor adjustments one can pass all our tests. 
Linearised PDF is the target for efficiency. There's no point moving the trailer; 
it has to go at the end. Indeed there may be more than one if the document is 
incrementally updated. The crossref is always indirect, but the Adobe docs 
indicate it should probably go just prior to the associated xref.
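
For reference, a skeleton of what "objects early, xref and trailer at the end" 
looks like for a classic (non-linearised) file. This is a sketch under my own 
assumptions, not the pdfdoc.py code; write_early is a hypothetical helper, and 
the output is only a structural skeleton, not a renderable document.

```python
import io


def write_early(objects, out):
    """Write each object body as soon as it is available, remembering its
    byte offset; emit the xref table and trailer only once all are written.

    objects: list of bytes, each the body of one indirect object.
    out: anything with write() and tell().
    """
    out.write(b"%PDF-1.3\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(out.tell())
        out.write(b"%d 0 obj\n" % num + body + b"\nendobj\n")

    xref_pos = out.tell()
    size = len(objects) + 1               # entry 0 is the free-list head
    out.write(b"xref\n0 %d\n" % size)
    out.write(b"0000000000 65535 f \n")   # each entry is exactly 20 bytes
    for off in offsets:
        out.write(b"%010d 00000 n \n" % off)

    # The trailer and startxref pointer come last.
    out.write(b"trailer\n<< /Size %d /Root 1 0 R >>\n" % size)
    out.write(b"startxref\n%d\n%%%%EOF\n" % xref_pos)
    return offsets, xref_pos


buf = io.BytesIO()
offsets, xref_pos = write_early([b"<< /Type /Catalog >>", b"<< >>"], buf)
data = buf.getvalue()
```

Only the offsets need to be kept in memory until the end; the object bodies 
themselves can be dropped as soon as they are written.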

I would like to ask whether the early serialized version with a StringIO file is 
equivalent in speed/memory usage to the existing late serializing version. Can 
we simulate the late serializer with the early one plus a special file (i.e. a 
list) and some extra dictionaries etc.? That would be a best-of-both-worlds 
approach.
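
By a "special file" I mean something along these lines. This is only a sketch 
of the shape; ListFile is a hypothetical name, the extra dictionaries are left 
out, and I make no claim here about how it compares in speed or memory.

```python
class ListFile:
    """Hypothetical list-backed 'file': write() just appends the chunk, and
    nothing is joined into one string until getvalue() is called."""

    def __init__(self):
        self.chunks = []
        self._pos = 0

    def write(self, data):
        self.chunks.append(data)
        self._pos += len(data)

    def tell(self):
        return self._pos

    def getvalue(self):
        # The one place where the whole document is assembled.
        return b"".join(self.chunks)


f = ListFile()
f.write(b"%PDF-1.3\n")
f.write(b"1 0 obj\n<< >>\nendobj\n")
# Chunks stay separate, so one could still be replaced before the final join,
# which is roughly what the late serializer allows.
f.chunks[1] = b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
result = f.getvalue()
```

Note that replacing a chunk with one of a different length invalidates any 
offsets already recorded, so a real version would need to recompute them.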

As for an API, I guess we'd need to allow this to be decided at run time, the 
implication being that we'd need some kind of pdfdoc object. A module will do as 
a start.

As for PostScript output, we already have a PostScript graphics renderer and 
have had some success with a canvas adapter that allows standard PDF canvas 
commands to be converted into PS (or EPS). Since all our software writes to the 
canvas, it seems easier to substitute further up the information flow rather than 
create a different PDF backend.
-- 
Robin Becker