[reportlab-users] Reportlab performance

Leszek Syroka leszek.marek.syroka at cern.ch
Wed May 12 10:36:22 EDT 2010


On 5/12/2010 3:45 PM, Andy Robinson wrote:

> On 12 May 2010 14:20, Leszek Syroka<leszek.marek.syroka at cern.ch> wrote:

>

>> Hello,

>>

>> Right now our main concern is the performance of 'platypus'. Creating a

>> document similar to the attached one, but containing 1032 pages, about

>> 1,200,000 characters and no images, takes about 12 minutes. Is that a

>> normal time for creating such a document?

>>

> No, this is not normal. We construct the entire document in memory,

> but yours is simple text.

>

To be clear, I was not referring to the file I attached, but to a
document similar to it, only much bigger. I tested the Wikipedia plugin
that uses your library, and creating a file containing almost the same
number of characters (with a more sophisticated design and images, of
course, but generated on a more powerful machine than my development PC)
took about 7.5 minutes. Profiling shows that the great majority of the
time is spent inside the ReportLab library, in the
doctemplate->multibuild method.
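
A measurement like this can be reproduced with Python's standard-library
profiler. The sketch below is only an illustration: build_document is a
hypothetical stand-in for the real build call (in the case above, the
doctemplate multibuild step) and is not code from this thread.

```python
import cProfile
import io
import pstats


def build_document():
    """Hypothetical stand-in for the real, slow document build call."""
    return sum(len(str(i)) for i in range(50_000))


def profile_call(func, top=10):
    """Run func under cProfile; return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the outermost slow calls come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, buffer.getvalue()


result, report = profile_call(build_document)
print(report)
```

Sorting by cumulative time is what makes a single dominating method
stand out at the top of the report.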

> I emailed separately about the possibility of some commercial support

> but whichever way you want to do it, it would be useful if you could

> post some code here to show how you build up the Platypus document.

>

I can't give you an answer about commercial support right now, because
it has to be discussed with my section leader.

> Are you using a big table which spans many pages? This could cause

> problems and, looking at your document, it should not be necessary.

>

The only flowables I'm using are Paragraph, Spacer, PageBreak and a
single TableOfContents.

>

>

>> Moreover, I found out that, to insert the table of contents, the document

>> is created in three iterations: the first creates the document with a blank

>> table of contents, the second inserts and adjusts the dynamically created

>> one, and the third fits a table of contents that spans more than one

>> page. Is it possible to do this in a single pass?

>>

> The TableOfContents widget in Platypus does it 2x, or 3x as you say.

> If you can inspect your content 'up front' and work out how many

> sections there are and their titles, there will probably be a way to

> 'preload' the table of contents with this to eliminate at least one

> pass.

>

>
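
The pass counts discussed above can be illustrated with a toy model. This
is not ReportLab code: layout_pass, build, the section sizes and the
entries-per-page figure are all assumptions made up for the sketch. It
only shows why a front-of-document table of contents needs repeated
passes, and how preloading the titles removes one of them.

```python
import math


def layout_pass(sections, known_entries, entries_per_page):
    """Simulate one layout pass; return the (title, page) pairs observed."""
    # The TOC is sized from what the previous pass learned (at least 1 page).
    toc_pages = max(1, math.ceil(len(known_entries) / entries_per_page))
    page = toc_pages + 1
    observed = []
    for title, length in sections:
        observed.append((title, page))
        page += length
    return observed


def build(sections, entries_per_page, preload=None, max_passes=10):
    """Repeat passes until the TOC stops changing; return (passes, toc)."""
    known = list(preload or [])
    for n in range(1, max_passes + 1):
        observed = layout_pass(sections, known, entries_per_page)
        if observed == known:
            return n, observed
        known = observed
    raise RuntimeError("layout did not converge")


# Hypothetical document: 100 sections of 10 pages, 40 TOC entries per page.
sections = [("Section %d" % i, 10) for i in range(1, 101)]

passes_plain, toc_plain = build(sections, entries_per_page=40)

# Preload the titles (page numbers still unknown) so the TOC already has
# its final length on the first pass.
preload = [(title, None) for title, _ in sections]
passes_preloaded, toc_preloaded = build(sections, 40, preload=preload)

print(passes_plain, passes_preloaded)
```

In this model the plain build needs three passes and the preloaded one
needs two, because the last pass in each case only confirms that nothing
moved.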

> There is also another technique we use in our commercial package using

> delayed 'Form XObjects'. We inspect the content up-front and make a

> table containing all the entries, which goes into the story. In the

> right column we insert a 'Form XObject reference' (canvas.doForm(...))

> saying 'draw the form "chapterXX" here', even though it isn't defined

> yet. Then, on the way through the document, the forms get defined.

> So single passes are possible. However, we have not packaged this

> up to make it easy to use in the open source package.

>

>
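
The delayed-form idea described above amounts to a forward reference
that is resolved when the file is saved. The sketch below is a toy model
of that idea only: ToyCanvas and its methods are invented for
illustration and are not the ReportLab canvas API, and it assumes the
table of contents fits on page 1.

```python
class ToyCanvas:
    """Toy stand-in for a PDF canvas (hypothetical, not ReportLab)."""

    def __init__(self):
        self.ops = []    # drawing operations, in document order
        self.forms = {}  # form name -> content, filled in later

    def draw_text(self, text):
        self.ops.append(("text", text))

    def do_form(self, name):
        # Record a reference only; the form may not be defined yet.
        self.ops.append(("form", name))

    def define_form(self, name, content):
        self.forms[name] = content

    def save(self):
        # Resolve every reference; fail if a form was never defined.
        missing = [v for kind, v in self.ops
                   if kind == "form" and v not in self.forms]
        if missing:
            raise KeyError("undefined forms: %s" % ", ".join(missing))
        return [self.forms[v] if kind == "form" else v
                for kind, v in self.ops]


# Titles and lengths are known up front (hypothetical data).
chapters = [("Introduction", 3), ("Methods", 5), ("Results", 4)]

canvas = ToyCanvas()

# 1. Write the table of contents with references to undefined forms.
for i, (title, _) in enumerate(chapters, 1):
    canvas.draw_text(title)
    canvas.do_form("chapter%02d" % i)

# 2. Walk the document once, defining each form when its page is known.
page = 2  # assumption: the TOC occupies page 1
for i, (_, length) in enumerate(chapters, 1):
    canvas.define_form("chapter%02d" % i, str(page))
    page += length

output = canvas.save()
print(output)
```

Because the page numbers are filled in after the fact, the body of the
document is laid out only once.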

>> I would also like to know how much quicker it is to use the pdfgen library

>> instead of platypus, which for our application is a bit too slow.

>>

>

> If you need to 'move down the page' and draw paragraphs, you will need

> Platypus, or something like it. However we have to eliminate any

> backtracking, memory wastage and multiple passes first.

>

>

> Best Regards,

>

>

>

Thanks for the solutions,
Leszek

