[reportlab-users] ReportLab / Python 3000 (Robin Becker)

henning.vonbargen at arcor.de henning.vonbargen at arcor.de
Wed Mar 12 16:12:23 EDT 2008


Robin Becker wrote:

> >> I'm curious to know if anyone here has plans to move rapidly onto
> >> Python 3.0? It's now in alpha 2 and slated for release this summer,
> >> although not many other libraries are following suit.
> .......
>
> There is a real problem here related to the way we do things.
> In 3 all strings are unicode so that eases one whole class
> of problems ie we get rid of all the conversions involving utf8 in the
> various front end bits.
> On the other hand all output has to be in bytes
> so literally every piece of output needs to be converted.
> The question is where?
> Should the conversions be done close to the creation or only on output?


I think that RL is not the only library that faces this problem.
So personally I think it'll take a few years before all the libraries I'm using
will work flawlessly with Python 3.
Nevertheless, it is good to think about these issues now.

I found myself struggling with RL2 when developing the hyphenation
library wordaxe (formerly: deco-cow). That's because in RL2 it is not clear
when to use unicode and when to use byte strings, so more or less every
single function has to accept both unicode and byte string arguments.
And then, if a conversion is necessary, it's often not clear which codec
should be used.
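
To illustrate the codec ambiguity in Python 3 terms (a small sketch with a made-up input, not RL code): the same byte string decodes to different text depending on the codec assumed, and nothing in the bytes themselves says which one is right.

```python
# Hypothetical input: the UTF-8 encoding of "café" arriving as a byte string.
raw = b"caf\xc3\xa9"

as_utf8 = raw.decode("utf-8")      # four characters, the intended text
as_latin1 = raw.decode("latin-1")  # five characters: no error, just wrong text

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```

A function that accepts both kinds of argument has to guess between these two readings every time it is handed bytes.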

A clear borderline between unicode and byte strings could help a lot.

Thinking about it, perhaps there is a natural borderline when PDF output
is generated (that is, on the lowest level), so I propose to:
* Use unicode strings (and perhaps even add assertions or annotations
to check this!) throughout the platypus package as well as for the
arguments of the Canvas methods.
* Do the conversion to bytes inside the Canvas methods.
* Make sure that inside a class, all arguments are either unicode
or byte strings and document it at the class level.
* (Even better: All classes inside a package use the same convention)
* Split some classes that contain internal methods as well as commonly
used methods (for example, PDFDocument) into a "public" class
(setAuthor, setTitle, ...) and an internal class.
* Use the opportunity to make the code more object-oriented.
(For example, the many "_"-functions in paragraph.py make it impossible
to simply extend the Paragraph class to support hyphenation.)
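
A minimal sketch of the first two points, written in Python 3 terms. The class SketchCanvas, its method draw_string and the utf-8 default are all hypothetical and only stand in for the idea; this is not ReportLab's actual Canvas API, and real PDF output would pick the encoding per font rather than use one fixed codec.

```python
import io


class SketchCanvas:
    """Hypothetical stand-in for a canvas; not ReportLab's real Canvas."""

    def __init__(self, stream, encoding="utf-8"):
        self._stream = stream        # binary output stream
        self._encoding = encoding    # assumed output encoding for this sketch

    def draw_string(self, text):
        # Public methods accept text only; byte strings are rejected early.
        if not isinstance(text, str):
            raise TypeError("text must be str, got %s" % type(text).__name__)
        self._write(text)

    def _write(self, text):
        # The single place where text becomes bytes.
        self._stream.write(text.encode(self._encoding))
        self._stream.write(b"\n")


buf = io.BytesIO()
canvas = SketchCanvas(buf)
canvas.draw_string("Silbentrennung für Wörter")
output = buf.getvalue()
```

With the check in draw_string, callers such as platypus or wordaxe never need to know which codec ends up in the file; an assert would serve the same purpose as the TypeError used here.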

Henning


