[reportlab-users] Unicode handling bugs - 2.0
Greg Phillips
greg.phillips at rmc.ca
Mon Jun 5 09:41:38 EDT 2006
First of all, thanks much for going Unicode in 2.0, and especially
for fixing the KeepTogether bug. Both of those simplify my life
enormously.
I've discovered some bugs relating to the Unicode change.
In paragraph.py, there are two places (lines 279 and 298) where tests
like:
    if type(bulletText) is StringType:
are made to determine whether the bullets are text or lists of
fragments. This breaks if the bullet text is unicode: its type is
UnicodeType rather than StringType, so the test is false. I suggest
changing these lines to:
    if isinstance(bulletText, basestring):
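
A minimal sketch of the suggested check, not the actual paragraph.py
code. The helper name bullet_is_text and the string_types shim are
assumptions made for illustration; the shim only exists so the snippet
also runs on interpreters that have no basestring.

```python
# Hypothetical helper, not part of ReportLab.
try:
    string_types = (basestring,)  # Python 2: matches both str and unicode
except NameError:
    string_types = (str,)         # Python 3: str is already Unicode


def bullet_is_text(bulletText):
    """Return True for any string bullet, False for a list of fragments."""
    return isinstance(bulletText, string_types)


text_bullet = u"\u2022"            # a Unicode bullet character
fragment_bullet = ["frag1", "frag2"]  # stand-in for a list of fragments

is_text = bullet_is_text(text_bullet)
is_fragments = not bullet_is_text(fragment_bullet)
```

Unlike an exact type comparison, isinstance also accepts subclasses of
the string types.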
There's a similar error at line 1186 of pdfdoc.py. A quick grep shows
other instances of "is StringType" in the library, but I haven't
investigated whether these are bugs or not.
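
For anyone who wants to repeat the search, here is a rough sketch of
the same scan in Python. The function name and the sample text are made
up for illustration, and a match only marks a line worth reviewing; it
does not show that the line is a bug.

```python
import re

# Matches the exact-type comparison idiom, e.g. "type(x) is StringType".
PATTERN = re.compile(r"\bis\s+StringType\b")


def find_exact_type_checks(source):
    """Return (line_number, stripped_line) pairs that match PATTERN."""
    return [
        (number, line.strip())
        for number, line in enumerate(source.splitlines(), 1)
        if PATTERN.search(line)
    ]


# Made-up sample input, not taken from the library.
sample = """\
if type(bulletText) is StringType:
    pass
if isinstance(bulletText, basestring):
    pass
"""

hits = find_exact_type_checks(sample)
```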
Also, in paraparser.py, line 710, there's a conversion to cp1252
encoding to make sgmlop happy; this was causing errors when my input
included characters that weren't recognized in that encoding.
Changing the encoding to utf-8 seemed to solve the problem, but I
don't know enough about what's really going on there to know if
that's the Right Thing To Do.
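
The encoding failure itself can be shown in isolation. This sketch does
not reproduce paraparser.py or call sgmlop; encode_for_parser is a
hypothetical name, and the sample text is invented. It only shows that
cp1252 cannot represent every Unicode character while utf-8 can, and
says nothing about how the parser handles utf-8 bytes afterwards.

```python
def encode_for_parser(text, encoding):
    """Return the encoded bytes, or None if text does not fit the encoding."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# U+00E9 (e acute) exists in cp1252; the two CJK characters do not.
text = u"caf\u00e9 \u4e2d\u6587"

cp1252_result = encode_for_parser(text, "cp1252")  # fails, so None
utf8_result = encode_for_parser(text, "utf-8")     # succeeds

# Decoding the utf-8 bytes gives back the original text.
round_trip_ok = utf8_result.decode("utf-8") == text
```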
Greg