[reportlab-users] Re: Large tables - exponentially increasing run times

Robin Becker reportlab-users@reportlab.com
Mon, 17 May 2004 12:09:43 +0100


Craig Ringer wrote:
......
> 
> The main thing that confuses me is why the Table instance being split
> creates an entire new Table instance rather than trimming data (and
> styles, etc) off its self and returning [R0, self]. I presume it comes
> down to the readability and maintainability issues mentioned above.

The answer to this is fairly complex. First off the lead table and its first 
split offspring need to coexist. In an ideal world we could modify the original 
table in place, but this is python and if we're doing multi-pass platypus then 
we would be altering the original data and thus confusing the second pass.

Of course there is only one original table so there is no point in keeping the 
original when we come to split the second and later parts of a large table. That 
way at least we only double the number of rows held. I'm fairly sure that 
optimization hasn't been done so in practice we're holding considerably more.

eg if N=original number of rows, p the rows per page and n the number of pages
we get

Page Tables     Display        Rows
1:   T0 [T1,T2]   T1           2N
2:   T2 [T3,T4]   T3           2(N-p)
3:   T4 [T5,T6]   T5           2(N-2p)
.....
n:   T[2*(n-1)]   T[2*(n-1)]   2(N-(n-1)p)

In each row k except the last we have the number of rows is twice the number in 
the first column table and for these the number decreases by the rows per page.

I make that 2 (N +p ((n-1)*n)/2)) = 2N + p*(n-1)n which if I make the 
approximation pn ~ N gives us N(2+n-1). Which could be much larger than what is 
required. Plus we have 2n-1 tables.

If we make the optimization mentioned above then we only double the rows in the 
first split.

Page Tables     Display        Rows
1:   T0 [T1,T2]   T1           2N
2:   T2 [T2,T3]   T3           0 counted in row 1
3:   T3 [T3,T4]   T5
.....
n:   T[n]         T[n]         0

so here we have 2N rows & n+1 tables. Clearly this optimization is worth making.
-- 
Robin Becker