[reportlab-users] RE: Creating Large PDF Files

Robin Becker reportlab-users@reportlab.com
Thu, 6 Mar 2003 16:00:48 +0000


In article <336FB07A892C5B4FBDD0A738E4A7991F01380155@marexch01.csgsystems.com>,
Engel, Gregory <Gregory_Engel@csgsystems.com> writes
......
One quick hack that I suggested some time ago is to make the code
object into a semi-cached disk file, i.e. we need a list-like thing
that mostly lives on disk once it grows above a certain size.

That way you will end up with most of the code on disk, but with most of
the structural info in memory. 

A quick way to see if that is feasible is to change the code
declaration into something that drops all but the last 1000 elements
(say) and see how much memory your job then requires. If that amount
is workable, it would then be possible to add the other required
object operations properly.

So the quick hack looks like:

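from collections import UserList  # Python 2: from UserList import UserList

class MYCODE(UserList):
    def __init__(self, L=None):
        UserList.__init__(self, L)
        self._count = len(self.data)  # total number of elements ever added

    def append(self, x):
        self._count += 1
        self.data.append(x)
        # drop the oldest elements so only the last 1000 stay in memory
        # (the original loop never changed its condition and could not end)
        while len(self.data) > 1000:
            del self.data[0]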


The above will go wrong if backward searches reach further back than
the elements still held, and it will probably fail when the final join
is done.
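
For what it's worth, a fuller version of the idea might look roughly
like the sketch below. It is only an illustration under my own
assumptions: SpillList, keep and write_joined are made-up names, not
ReportLab APIs, and pickle plus a temporary file are used purely for
simplicity. Instead of dropping old elements as the quick hack above
does, it moves them to disk and streams them back in order, so the
final join can be written out without holding everything in memory.

```python
import pickle
import tempfile
from collections import deque


class SpillList:
    """Append-only, list-like buffer (hypothetical sketch).

    Only the most recent `keep` items stay in memory; older items are
    pickled to a temporary file in the order they were appended.
    """

    def __init__(self, keep=1000):
        self.keep = keep
        self.tail = deque()                    # recent items, in memory
        self.disk = tempfile.TemporaryFile()   # binary; removed on close
        self.on_disk = 0                       # number of items spilled

    def append(self, x):
        self.tail.append(x)
        while len(self.tail) > self.keep:
            self.disk.seek(0, 2)               # always write at the end
            pickle.dump(self.tail.popleft(), self.disk)
            self.on_disk += 1

    def __len__(self):
        return self.on_disk + len(self.tail)

    def __iter__(self):
        # Replay the spilled items first, then the in-memory tail.
        # Do not append while iterating: both share one file position.
        self.disk.seek(0)
        for _ in range(self.on_disk):
            yield pickle.load(self.disk)
        for item in list(self.tail):
            yield item

    def write_joined(self, out, sep=" "):
        """Stream sep.join(self) to `out` without building the string."""
        first = True
        for item in self:
            if not first:
                out.write(sep)
            out.write(item)
            first = False
```

Backward searches would still only be cheap within the in-memory tail;
anything older means re-reading the file, so the value of keep would
have to be chosen with that in mind.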


>-----Original Message-----
>
>(Sorry, hit the wrong key a moment ago and sent a blank reply).
>
>Currently we do always process in memory.  PDF files are full of cross
>references and it is impossible to write the file without knowing
>everything about it; it's not like HTML or text where you can write the
>beginning, the middle and then the end :-(.  So, we decided early
>on to create the whole thing in memory.  Another approach would have
>been to write each page to disk in a special format and assemble them
>at the end, but we have not implemented that.  We also decided
>that it should be possible to render, say, a 1000-page book
>on a well-specced PC, and that was good enough.
>
>The first thing to ask is, are you making efficient use of forms
>for any content which is common across many pages?
>I've seen repetitive apps (ten thousand customer statements)
>which could be done in 5Mb one way or 500Mb another way.
>
>The next thing is, is it cheaper to buy a small fistful of
>memory chips than to write code to
>
>The third is, if your app is that big, would it not be easier
>for everyone to split it into 'chapters'?  It's generally bad to
>hit print on a document more than 500 pages long, as you are almost
>certain to have to change paper trays in mid-printing.
>
>These are all 'excuses' and not solutions but worth asking your
>manager about.  If it's really critical, get back to me and I may
>be able to outline a solution, but it will need some work...
>
>- Andy

-- 
Robin Becker