[reportlab-users] Chasing pyHnj error

Robin Becker robin at reportlab.com
Sun Nov 25 09:54:34 EST 2007


Dinu Gherman wrote:

> Robin Becker:
>
>> As for new paragraph classes I have been messing with Kuchling's
>> tex_wrap and have managed to get the run time down to something
>> reasonable. In python I have the time down from over 3.06" to 0.5".
>> This just corresponds to the simplest of line breakings. His version
>> doesn't do hyphenation or even follow Knuth's algorithm all that
>> carefully.
>>
>> I believe it should be possible to improve much further. I translated
>> simply to pyrex and have the time down again to 0.23". With more
>> aggressive C style coding that should improve further (perhaps another
>> factor of 10 since my pyrex version is still using lots of python
>> variables in the main loop).
>
> Well, I first strive for having a paragraph class which:
>
> - is human-readable
> - has a pylint coefficient of >= 7.5/10 (instead of the current
>   value of 0.75/10 for RL's paragraph.py ;-)
> - has hyphenation
> - has inline images, and
> - has an extension mechanism for custom tags
>
> Once I have that I'll come back to you for ideas on performance.
> ;-)


Well, the current paragraph is slowly getting inline images and has had a
mechanism for non-character things for a long time; that's used for
anchor targets etc. Custom tags presumably means an adjustable
parser, i.e. all non-standard tags get mapped to a user function; that
shouldn't be too hard if there's some agreement about what a tag is
allowed to produce. Hyphenation is just difficult; we need to define
words before we can find hyphenation points. The current paragraph
doesn't actually see words very well. Presumably

"hyp<font color=red>hena</font>te"

is a word, but for efficiency we never actually see "hyphenate" all
in one place, so extracting the word and looking for hyphenation points
is left as an exercise.
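
A minimal sketch of that exercise, under stated assumptions: the
paragraph is modelled as a plain list of (text, style) fragments, which
is a hypothetical stand-in and not the actual paragraph.py data
structure. The names words_across_fragments and locate are invented for
illustration. The idea is to join the fragment texts, find
whitespace-delimited words in the joined string, and remember which
slice of which fragment each part of a word came from, so that a
hyphenation offset found in the whole word can be mapped back to a
fragment.

```python
import re
from typing import List, Tuple

# Hypothetical stand-in for a styled run of text: (text, style name).
Fragment = Tuple[str, str]
# One part of a word: (fragment index, start offset, end offset).
Piece = Tuple[int, int, int]

WORD = re.compile(r"\S+")


def words_across_fragments(frags: List[Fragment]) -> List[Tuple[str, List[Piece]]]:
    """Find whitespace-delimited words in the joined fragment text and
    record which slice of which fragment each part of a word came from."""
    # Offset in the joined text at which each fragment starts.
    starts = []
    pos = 0
    for text, _style in frags:
        starts.append(pos)
        pos += len(text)
    flat = "".join(text for text, _style in frags)

    result = []
    for m in WORD.finditer(flat):
        pieces = []
        for i, (text, _style) in enumerate(frags):
            # Overlap between the word and fragment i, in joined-text offsets.
            lo = max(m.start(), starts[i])
            hi = min(m.end(), starts[i] + len(text))
            if lo < hi:
                pieces.append((i, lo - starts[i], hi - starts[i]))
        result.append((m.group(), pieces))
    return result


def locate(pieces: List[Piece], offset: int) -> Tuple[int, int]:
    """Map a character offset inside a word (for example a candidate
    hyphenation point) back to (fragment index, offset in that fragment)."""
    for i, start, end in pieces:
        n = end - start
        if offset < n:
            return i, start + offset
        offset -= n
    raise IndexError("offset lies outside the word")


# The example from the text, embedded in a short sentence.
frags = [("we hyp", "normal"), ("hena", "red"), ("te words", "normal")]
words = words_across_fragments(frags)
```

Here words[1] is the pair ("hyphenate", pieces) with one piece per
fragment the word touches, and locate(pieces, k) says where character k
of the word lives. The scan over all fragments for every word is
quadratic in the worst case; it is written for clarity, which is exactly
the efficiency trade-off mentioned above.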
--
Robin Becker

