[reportlab-users] Is ReportLab suitable for this task?

reportlab-users@reportlab.com reportlab-users@reportlab.com
Fri, 16 Jan 2004 07:28:51 -0900


I'd like some advice as to whether ReportLab would be suitable for a
statement generating requirement at my company.

My company sends periodic (monthly, quarterly, yearly) statements to its
members.  We do a batch of statements weekly (approximately) and each batch
consists of about 30,000 statements.  Each statement is 1 to 4 double-sided
pages consisting primarily of text, shaded boxes, and tables.
There is also one small repeated logo graphic and a bar code to assist the
sorter/stuffer.  We need to produce a paper report to send via mail and a
file per statement that can be stored in our COLD/imaging system for
research purposes.  The production of the file for the paper report needs to
take no longer than two hours (about 3 statements per second).

It's a requirement of the printing process that the statements come batched
together in files of about 5,000 pages each, but the COLD/imaging system
requires a file per statement.
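
To make the batching requirement concrete, here is a rough sketch in plain
Python of how statements might be grouped into printer files without ever
splitting one statement across two files.  The (member_id, sheets) record
shape, the function name, and the 5,000-sheet limit per file are assumptions
made for illustration; nothing here depends on ReportLab.

```python
from typing import Iterable, Iterator, List, Tuple

SHEETS_PER_FILE = 5000  # assumed limit: one box of paper per printer file

# Hypothetical record shape: (member_id, number of sheets in the statement)
Statement = Tuple[str, int]


def batch_by_sheets(statements: Iterable[Statement],
                    max_sheets: int = SHEETS_PER_FILE) -> Iterator[List[Statement]]:
    """Yield consecutive groups of statements totalling at most max_sheets.

    A statement is never split across two groups.
    """
    batch: List[Statement] = []
    used = 0
    for stmt in statements:
        _, sheets = stmt
        if sheets > max_sheets:
            raise ValueError("a single statement exceeds the batch size")
        # Close the current batch if this statement would overflow it.
        if batch and used + sheets > max_sheets:
            yield batch
            batch, used = [], 0
        batch.append(stmt)
        used += sheets
    if batch:
        yield batch
```

Each group would become one printer file, while the same loop could also
write one small file per statement for the COLD/imaging system.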

For various reasons, I'd like to replace our current process (which is
described below for anyone who is interested).  The core business
application supplies two files for statement purposes -- one is a formatted
ASCII report and the other is a 'data only' format.  Our current process
(see below) uses the ASCII format but I'd like to change to using the 'data
only' file.

I'd also like to produce just one format, PDF, rather than a PCL file and an
ASCII file.

I've done a little work with parsing the data only format in Python and I
think that I can parse about 30 statements per second on a 450 MHz
Pentium.  In addition, each statement will require several calls to SQL
Server.  I haven't modeled that part yet, but let's say that doubles the
time (to about 15 statements per second).  The production machines would be faster,
but for the purposes of this discussion, let's use that number.
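
As a back-of-the-envelope check of these figures, the sketch below works out
how much time per statement is left for PDF generation.  It assumes parsing,
SQL lookups, and rendering run strictly one after another in a single
process; the function and constant names are invented for illustration.
Taken literally, 30,000 statements in two hours is a little over 4 per
second (0.24 s each), which is somewhat tighter than the 3-per-second
figure, and leaves roughly 0.17 s per statement for rendering.

```python
BATCH_SIZE = 30_000          # statements per batch (from the post)
TIME_LIMIT_S = 2 * 60 * 60   # two-hour production window, in seconds
UPSTREAM_RATE = 15.0         # assumed parse + SQL rate, statements per second


def render_budget(batch_size: int, time_limit_s: float,
                  upstream_rate: float) -> float:
    """Seconds left per statement for PDF generation.

    Assumes parsing, SQL lookups, and rendering happen sequentially.
    """
    per_statement = time_limit_s / batch_size   # total time per statement
    upstream_cost = 1.0 / upstream_rate         # time spent before rendering
    return per_statement - upstream_cost


required_rate = BATCH_SIZE / TIME_LIMIT_S
budget = render_budget(BATCH_SIZE, TIME_LIMIT_S, UPSTREAM_RATE)
print(f"required rate: {required_rate:.2f} statements/s")
print(f"render budget: {budget:.3f} s per statement")
```

If the 1/3-second target is the real constraint instead, the same
subtraction gives about 0.27 s per statement.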

So, a few questions.

1)  Will the process of formatting the data to PDF in ReportLab cause the
time to generate a statement to go beyond 1/3 of a second?  Probably a tough
question without more specifics, but what specifics would you need?

2)  We need two kinds of output: (a) a file for the printer, and (b)
individual PDFs for import into the COLD/imaging system.  For the printer
file, we need a file with enough statements to use 5,000 sheets of paper (so
the operators can just load a box of paper per file).  Could the printer
file be just one 5,000-page PDF, or does that become too large for ReportLab
to handle in memory?  How would you handle this?

3)  We use an IBM Infoprinter, which claims to support PDF.  Does anyone
have experience with this printer and PDF?  Will using a PDF format rather
than PCL slow things down inordinately?
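
For question 1, one way to get a concrete number is to time a prototype on a
sample of real statements.  The sketch below is a generic timing harness
using only the standard library; `render` is a hypothetical callable that
would build one statement PDF (for example with ReportLab), and it is passed
in rather than shown because the layout code does not exist yet.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def statements_per_second(render: Callable[[T], object],
                          samples: Iterable[T]) -> float:
    """Call render() once per sample and return the measured throughput."""
    count = 0
    start = time.perf_counter()
    for sample in samples:
        render(sample)   # hypothetical: build one statement's PDF
        count += 1
    elapsed = time.perf_counter() - start
    if count == 0:
        raise ValueError("no samples to time")
    # Guard against a zero reading on very coarse clocks.
    return count / elapsed if elapsed > 0 else float("inf")
```

Running this over a few hundred representative statements, including the
4-page ones, on the development machine would show whether rendering stays
within the per-statement time target.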

Thanks for your attention and advice.

Don



Current Process

Currently this is done with a peculiarly arcane approach using a reporting
package called Optio.  Our (3rd party supplied) core business application
supplies a statement file which is good enough for most of its customers.
However, we recently converted to this application and thought that our
members would not be happy with the rather crude look of the ASCII text
statement compared to what we used to give them.  In addition we wanted to
supply some info that was not in the statement as supplied.  So we bought
Optio and trouble.

We start with the statement file containing 30,000 statements that the core
business app supplies.  We also extract some extra data from the core app
and load it into SQL Server.  Optio parses the statement file to find the
various pieces/parts of a member statement, determines which member it's
for, gets the extra data it needs from SQL Server, reformats the data
prettily for a PCL printer file, and reformats it differently for the ASCII
file output.  This all works, but it required quite
a bit of professional services time to get started, and we devote a large
percentage of one programmer's time to maintaining it.  Because different
parts of the Optio program produce the PCL and ASCII formats, and because the
parsing process is required to work on an input-page-by-input-page basis,
every change requires a great deal of testing.  A consequence of this setup
is that if a member asks for a copy of their statement, we can't give
them something that looks like the pretty piece of paper that they got in
the mail; we have to give them the ugly ASCII version.