[reportlab-users] Patch to support tables with oversize cells

Tue Jan 11 14:55:03 EST 2022

I have implemented a new table feature that allows tables containing cells
too tall to fit on one page to still be rendered to PDF. This would make
things a lot easier for us, as we sometimes have tables with long texts in
them.

See description below.

//Lennart Regebro, Shoobx

== splitInRow patch ==

Currently, if a table is too tall to fit on one page and has the attribute
splitByRow set to 1 (which is the default), the table will be split into two
tables and the rows that do fit on the current page will be rendered on the
current page. The remaining rows will be printed on the second page. If the
remaining rows do not fit on that page, the table will be split again in the
same manner until the end of the table.

However, should a single cell be too tall for one page, the table cannot be
split, and the rendering to PDF will fail. This will also happen if
splitting is disabled by setting the splitByRow attribute to 0 and the table
as a whole is too large for a single page.
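
The current behavior can be modelled with a small sketch. This is not
ReportLab code; paginate_rows() is a hypothetical helper that only looks at
row heights and a fixed page height, and it ignores repeatRows, padding and
everything else a real table has to deal with.

```python
def paginate_rows(row_heights, page_height):
    """Group row indices into pages, splitting only between rows.

    Returns a list of pages (each a list of row indices), or None when
    some row is taller than a page and so cannot be placed at all.
    """
    pages, current, used = [], [], 0
    for index, height in enumerate(row_heights):
        if height > page_height:
            return None  # models the failure described above
        if used + height > page_height:
            # The row does not fit on this page: start a new one.
            pages.append(current)
            current, used = [], 0
        current.append(index)
        used += height
    if current:
        pages.append(current)
    return pages


print(paginate_rows([30, 30, 30, 30], 100))  # three rows fit, one carries over
print(paginate_rows([30, 150, 30], 100))     # the 150-unit row never fits
```

The second call shows the case this patch is about: no amount of splitting
between rows helps when one row alone is taller than the page.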

This patch implements functionality that will split a row that is taller
than a page into two rows, enabling the rendering of tables with very tall
rows.

It does this by adding a new attribute, splitInRow, which also defaults to
1. If, and only if, a table fails to render in the allocated space, this
will split a row into two rows, also splitting all the content inside that
row at the appropriate point.

=== How to use it ===

The _splitRows() method of tables now takes a new argument, doInRowSplit,
which defaults to 0. The split() method of the table will now do the
following:

* If both .splitByRow and .splitInRow are 0, it will return [], i.e. fail.

* If .splitByRow is 1, it will call _splitRows() with doInRowSplit=False, to
  attempt a split by whole rows. But if .splitByRow is 0, it will call it
  with doInRowSplit=True to attempt to split inside a row.

* If that fails, it will fall back to doing the opposite.

* If that also fails, the method will fail (i.e. return []).

This results in the following behavior:

* Setting both attributes to 0 will never split the table

* Setting .splitByRow to 1 and .splitInRow to 0 will, when a split is
  necessary, only split between rows (this is the current behavior).

* Setting .splitByRow to 0 and .splitInRow to 1 will always split inside
  rows, unless the split point is so close to the start or end of a row that
  the cells can't be split, in which case it splits between those rows.

* Setting both .splitByRow and .splitInRow to 1 will first attempt to split
  the table between rows, and only split a row if it is necessary to fit the
  table on a page. This is the new default behavior.
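
The four combinations above can be summed up in a short sketch. It is a model
of the described behavior, not the code in the patch: split_attempts() and
split_table() are hypothetical names, and try_split is a stand-in for
Table._splitRows() that returns a list of tables, or [] on failure.

```python
def split_attempts(split_by_row, split_in_row):
    """Return the kinds of split tried, in order, for a pair of settings."""
    if not split_by_row and not split_in_row:
        return []                       # never split
    if split_by_row and not split_in_row:
        return ["between"]              # the behavior before the patch
    if split_in_row and not split_by_row:
        return ["inside", "between"]    # in-row first, whole rows as fallback
    return ["between", "inside"]        # the new default


def split_table(split_by_row, split_in_row, try_split):
    """Run the attempts in order until one of them succeeds."""
    for kind in split_attempts(split_by_row, split_in_row):
        result = try_split(kind)
        if result:
            return result
    return []  # every attempt failed, so the split fails


def only_inside(kind):
    """A table whose only possible split is inside a row."""
    return ["top", "bottom"] if kind == "inside" else []


print(split_table(1, 1, only_inside))  # falls back to the in-row split
print(split_table(1, 0, only_inside))  # in-row splitting is off, so it fails
```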

=== Implementation details ===

Table._splitRows() will, as before, find the place to split the table as the
variable n. But when doInRowSplit is set to 1, instead of n being the
first row after the split, n will be the row that is split.
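
To illustrate the two meanings of n, here is a minimal sketch. It assumes
rows are described only by their heights; find_split_row() and
describe_split() are hypothetical helpers, not methods from the patch, and
repeatRows is ignored.

```python
def find_split_row(row_heights, avail_height):
    """Index of the first row that does not fit completely, or None."""
    used = 0
    for n, height in enumerate(row_heights):
        if used + height > avail_height:
            return n
        used += height
    return None


def describe_split(row_heights, avail_height, do_in_row_split):
    """Return what ends up in the first and in the second table."""
    n = find_split_row(row_heights, avail_height)
    if n is None:
        return None  # everything fits, no split needed
    rows = list(range(len(row_heights)))
    if do_in_row_split:
        # Row n itself is split: its upper part stays, its lower part moves on.
        return rows[:n] + [(n, "upper")], [(n, "lower")] + rows[n + 1:]
    # Row n is the first row after the split.
    return rows[:n], rows[n:]


print(describe_split([40, 40, 40, 10], 100, False))
print(describe_split([40, 40, 40, 10], 100, True))
```

In both calls n is 2. Without the in-row split, row 2 moves to the second
table whole; with it, row 2 appears in both tables as an upper and a lower
part.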

This necessitated a small change in the logic dealing with repeatRows, as
we otherwise might split in a repeat row.

If a row should be split, each cell is checked for its minimum and maximum
split points, which typically correspond to one line of text, as we can't
split inside a line of text. The split point is then selected to be as high
as possible within those limits.

After that, each cell is split into two at the height given by a new method,
Table._splitCell().

This split looks at the valign set in the cell style and calculates where
inside the cell value the split will be, and splits at a row or flowable
value before that.
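
The idea can be sketched for the simplest case, a cell holding lines of plain
text. This is an assumption-laden model rather than the patch's
Table._splitCell(): split_cell_lines() is a hypothetical function, every line
has the same height, padding is ignored, and heights are measured from the
top of the cell.

```python
import math


def split_cell_lines(lines, line_height, cell_height, valign, split_height):
    """Split text lines at the last whole line above split_height."""
    content_height = len(lines) * line_height
    # Where the content starts inside the cell depends on the alignment.
    if valign == "TOP":
        offset = 0
    elif valign == "MIDDLE":
        offset = (cell_height - content_height) / 2
    elif valign == "BOTTOM":
        offset = cell_height - content_height
    else:
        raise ValueError("unknown valign: %r" % (valign,))
    # Count the whole lines that end above the split; never cut through a line.
    whole = math.floor((split_height - offset) / line_height)
    count = max(0, min(len(lines), whole))
    return lines[:count], lines[count:]


text = ["a", "b", "c", "d"]
for valign in ("TOP", "MIDDLE", "BOTTOM"):
    print(valign, split_cell_lines(text, 10, 100, valign, 50))
```

With a cell 100 units tall split at 50, top-aligned text stays entirely in
the upper cell, bottom-aligned text moves entirely to the lower cell, and
only the middle-aligned text is actually divided.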

When the valign of the cell is "MIDDLE", the margins of the new split
cells are adjusted to try to keep the contents in the "middle" of a merged
cell. If the cell contents were actually split, that's easy: the first cell
is simply given a valign of bottom, and the second a valign of top. But if
the value was not split, the margins of that cell are adjusted to try to get
the value to remain close to its original position.

After this, a new table T is created from the new data, and the adjusted
styles, line commands, etc. This is then used in the rest of the method,
which will split the new table into two tables, in the normal fashion.

==== Other changes ====

To get CellStyle.copy() to work, self.name is passed in to the copy, and
attributes that start with an underscore are skipped.
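
As an illustration of that kind of copy, here is a minimal sketch.
SketchStyle is a hypothetical stand-in and not ReportLab's CellStyle; the
attribute names are made up for the example.

```python
class SketchStyle:
    """Hypothetical style class whose constructor requires a name."""

    def __init__(self, name):
        self.name = name
        self.fontsize = 10
        self.valign = "BOTTOM"
        self._cache = {}  # private state that a copy should not inherit

    def copy(self):
        # The constructor needs a name, so the original's name is passed in.
        result = type(self)(self.name)
        for key, value in vars(self).items():
            if key.startswith("_"):
                continue  # skip attributes that start with an underscore
            setattr(result, key, value)
        return result


style = SketchStyle("cell")
style.valign = "MIDDLE"
style._cache["width"] = 42

duplicate = style.copy()
print(duplicate.name, duplicate.valign, duplicate._cache)
```

The copy keeps the name and the public attributes, while its private cache
starts out empty.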

In Table._splitRows() there was a section that would adjust the line
commands list to account for the new split. This section needed a few extra
lines for handling the splitting of a row, and it also needed to be called
in two different places, so it was moved to its own method,
Table._splitLineCmds().
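
To give a rough idea of why the line commands need adjusting when a row is
split, here is a sketch of the renumbering involved once row n becomes two
rows. It is not the patch's Table._splitLineCmds():
shift_line_cmds_for_row_split() is a hypothetical helper, commands are
assumed to be tuples of (name, (start_col, start_row), (end_col, end_row),
...) with non-negative indices, and it does not decide whether a line should
be drawn between the two halves.

```python
def shift_line_cmds_for_row_split(line_cmds, n):
    """Renumber line commands after row n becomes rows n and n + 1."""
    shifted = []
    for cmd in line_cmds:
        name, (sc, sr), (ec, er) = cmd[:3]
        if sr > n:
            sr += 1  # the command starts below the split row
        if er >= n:
            er += 1  # the command ends at or below the split row
        # Keep any trailing values (weight, color, ...) unchanged.
        shifted.append((name, (sc, sr), (ec, er)) + tuple(cmd[3:]))
    return shifted


cmds = [
    ("BOX", (0, 0), (1, 3), 0.5),  # spans the split row
    ("BOX", (0, 2), (1, 3), 1),    # lies entirely below it
    ("BOX", (0, 0), (1, 0), 1),    # lies entirely above it
]
for cmd in shift_line_cmds_for_row_split(cmds, 1):
    print(cmd)
```

A command that covers the split row grows by one row so that it covers both
halves, a command below it moves down by one, and a command above it is left
alone.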

Table._listCellGeom() now supports cells that have plain text content.
This could be used to simplify code in some places where it's used.

Tests were added in the test_table_layout.py module.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: splitInRow.diff
Type: application/x-patch
Size: 27107 bytes
Desc: not available
URL: <https://pairlist2.pair.net/pipermail/reportlab-users/attachments/20220111/ae847de4/attachment-0001.bin>