[reportlab-users] Spreadsheet Table

Henning von Bargen H.vonBargen at t-p.com
Wed Feb 24 04:06:24 EST 2010


Tomasz Swiderski wrote:


> This Longtable stuff seems very strange to me. I understand how it works
> but I have no idea why it is used? In old implementation Table instance
> calculate rowHeights, spanRanges, nosplitRanges etc. each time Table
> split because Table's data change. Longtables stuff prevents unnecessary
> rowHeights calculations - it stops calculations when it detects that no
> more rows will fit into current page (so there is no sense to calculate
> more row heights). It makes sense because calculated rowHeights are not
> reused. But why they are not reused in first place?



> It is easier to
> calculate all rowHeights and pass them to split Tables in _splitRows
> method - this way Table have to calculate row heights only once and
> Longtable optimization stuff is unnecessary.



> I can guess intention of this repeated row height calculation method. If
> table have variable width elements like paragraphs it is possible to
> shrink table widths if next frame is not so wide like previous. If
> column widths shrink, heights must be recalculated.


Robin Becker wrote:

> As for the generic discussion I believe the original Table was coded for ease of
> the coder and not with any thought about speed etc etc. The long table patches
> were applied to allow for a representative sample of the rows to be used for
> calculating widths. However, it is still fairly inefficient. As I understand it
> we ought to compute the dimensions of each cell up front ie with the minimum
> possible work. The table ought then to mutate into a class which never computes
> anything, but is merely a view onto the computed rows/columns.
> Splitting of a view ought to be easy.


@Tomasz:

Your guess about the intention is correct.
Theoretically, there may be cases where the available width of the next frame
differs. Since row height depends on column width, and column width may depend
on available width, the calculated row heights may be invalid.

However, I haven't seen a real-world example of this, so it should be safe to
assume (at least for a long table!) that all frames have the same width.

Note: for paragraphs, the situation is certainly different; the wordaxe paragraph
class uses an internal cache and checks to see if it's valid by comparing the
available width with the previous one.

This approach could be adapted for the table class in order to handle even this
theoretical case correctly and more or less efficiently.
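
A minimal sketch of such a width-validated cache, in plain Python rather than
ReportLab or wordaxe code. The class CachedCell, its fixed LINE_HEIGHT and
CHAR_WIDTH, and the use of textwrap as a stand-in for real paragraph layout are
all assumptions made for illustration only:

```python
import textwrap


class CachedCell:
    """Hypothetical table cell that caches its height per available width."""

    LINE_HEIGHT = 12  # assumed fixed leading, in points
    CHAR_WIDTH = 6    # assumed average glyph width, in points

    def __init__(self, text):
        self.text = text
        self._cached_width = None
        self._cached_height = None
        self.layout_calls = 0  # counts how often the costly step runs

    def _layout(self, avail_width):
        # Stand-in for the expensive line-breaking step.
        self.layout_calls += 1
        chars_per_line = max(1, int(avail_width // self.CHAR_WIDTH))
        lines = textwrap.wrap(self.text, chars_per_line) or [""]
        return len(lines) * self.LINE_HEIGHT

    def wrap(self, avail_width):
        # Reuse the cached height only if the available width is unchanged.
        if self._cached_width != avail_width:
            self._cached_height = self._layout(avail_width)
            self._cached_width = avail_width
        return self._cached_height


cell = CachedCell("some text " * 20)
h1 = cell.wrap(300)  # computes the layout
h2 = cell.wrap(300)  # same width: served from the cache
h3 = cell.wrap(150)  # narrower frame: layout is recomputed
```

With equal frame widths the costly layout runs once per cell. Only a frame of
a different width triggers a recomputation.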

@Tomasz and Robin:

I doubt it is a good idea to precompute the table as a whole in the general case.

My original use case was for tables containing paragraphs (or other "complex"
flowables).
Computing the paragraph layout is a costly operation:
* It needs quite a lot of computation, especially if the paragraph uses automatic
hyphenation or kerning, which is possible with wordaxe [1].
* It creates a LOT of objects as the result of the line-breaking algorithm.

IIRC the original LongTable considered most of the cells only once,
except for one row per page.
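
To illustrate what "only once, except for one row per page" means, here is a
rough sketch of the stop-when-the-page-is-full idea. It is not the actual
LongTable code: the function rows_that_fit, the fixed row height of 20 and the
fake_row_height helper are hypothetical.

```python
def rows_that_fit(rows, avail_height, row_height):
    """Return (count, heights) for the leading rows that fit in avail_height.

    row_height is called only until the page is full, so rows that will
    land on later pages are not measured yet.
    """
    heights = []
    used = 0
    for row in rows:
        h = row_height(row)
        if used + h > avail_height:
            break  # this row starts the next page; stop measuring here
        heights.append(h)
        used += h
    return len(heights), heights


measured = []  # records which rows were actually measured


def fake_row_height(row):
    # Hypothetical stand-in for an expensive per-row layout.
    measured.append(row)
    return 20


rows = list(range(100))
count, heights = rows_that_fit(rows, avail_height=110,
                               row_height=fake_row_height)
```

Here five rows fit and six are measured. The sixth row is the one that gets
measured again when the next page is laid out.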

If I understand your approach correctly, you want to compute the heights for the
whole table once at the beginning.
In the case of paragraph table cells, this means that you compute all paragraph
heights (which _may_ increase memory consumption significantly, but I'm not sure)
and that each paragraph is considered twice:
* first time for the height computation,
* second time for the actual "rendering".

So, please also compare the performance for tables that have many rows and
where each cell contains a *non-trivial* paragraph.

Note that using paragraphs for the cell values is a common use-case,
even if your table contains only query results from a database.
That's because a paragraph gives you automatic line-breaking and other features
you often need.

Henning

[1] http://deco-cow.sourceforge.net/


