[reportlab-users] Crash when generating large PDF (100+ pages)
robin at reportlab.com
Thu Jun 9 04:53:31 EDT 2011
On 08/06/2011 12:17, Jens Ådne Rydland wrote:
> I recently got a strange crash deep down in reportlab.lib.xmllib,
> traceback is:
> File "/www/sites/[domain-name-removed]/views/pdfexport.py", line 658, in make_fa_pdf
> qdata.append([Paragraph(u"<b>%s</b>" % qa.question, styleN),\
> File "/usr/lib64/python2.4/site-packages/reportlab/platypus/paragraph.py", line 798, in __init__
> self._setup(text, style, bulletText, frags, cleanBlockQuotedText)
> File "/usr/lib64/python2.4/site-packages/reportlab/platypus/paragraph.py", line 813, in _setup
> style, frags, bulletTextFrags = _parser.parse(text,style)
> File "/usr/lib64/python2.4/site-packages/reportlab/platypus/paraparser.py", line 881, in parse
> self.close() # force parsing to complete
> File "/usr/lib64/python2.4/site-packages/reportlab/lib/xmllib.py", line 521, in close
> AttributeError: 'NoneType' object has no attribute 'close'
It's entirely possible that this is caused by threading, if that is in operation
in the Django server. ReportLab is not, and likely never will be, thread safe.
There is all sorts of module-level state which is shared between any casual
thread that wishes to use it. In this case I suspect that the code in
paragraph.py which says
#our one and only parser
# XXXXX if the parser has any internal state using only one is probably a BAD idea!
is to blame. While you run your long job, another thread can come along and waste
your efforts by grabbing this parser, re-initializing it and then closing it
before you get to the end of the original job. Your long job is then left dangling
in an ill-defined state, and the close of an object that has already been set
to None is the place where things come crashing down.
> Also, seeing as this crash happened with Reportlab 2.3 is it perhaps
> possible that upgrading to Reportlab 2.5 could fix the problem?
> Really sorry I can't provide more detailed information, I wish I had a
> proper traceback.
The above is a proper traceback; it's just not annotated with variables like
Django's. The annotation can cause problems of its own, such as packing a 2 MB
PDF into the error traceback HTML :)
I should say that we use ReportLab inside Django all the time, but it is always
in single-threaded mode (we use forked FastCGI).
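
If a forked, single-threaded deployment is not an option, one possible workaround is to let only one thread at a time into the PDF-building code. The following is a minimal sketch under that assumption, not something ReportLab provides: serialized, _pdf_lock and make_pdf are hypothetical names, and make_pdf is a placeholder for whatever view code builds the document. It only helps if every ReportLab call in the process goes through the same lock.

```python
import threading
from functools import wraps

# Single process-wide lock. RLock is used so that one serialized function
# may call another from the same thread without deadlocking.
_pdf_lock = threading.RLock()


def serialized(func):
    """Decorator: allow only one thread at a time to run the wrapped function."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        with _pdf_lock:
            return func(*args, **kwargs)

    return wrapper


@serialized
def make_pdf(title):
    # Hypothetical placeholder: the real function would build the Paragraphs
    # and the document here, entirely while holding the lock.
    return "PDF for %s" % title


if __name__ == "__main__":
    threads = [threading.Thread(target=make_pdf, args=("report %d" % i,))
               for i in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The cost is that long exports queue up behind each other, which is much the same trade-off as running single threaded in the first place.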