[reportlab-users] Chasing pyHnj error
Robin Becker
robin at reportlab.com
Sun Nov 25 09:54:34 EST 2007
Dinu Gherman wrote:
> Robin Becker:
>
>> As for new paragraph classes I have been messing with Kuchling's
>> tex_wrap and have managed to get the run time down to something
>> reasonable. In python I have the time down from over 3.06" to 0.5".
>> This just corresponds to the simplest of line breakings. His version
>> doesn't do hyphenation or even follow Knuth's algorithm all that
>> carefully.
>>
>> I believe it should be possible to improve much further. I translated
>> simply to pyrex and have the time down again to 0.23". With more
>> aggressive C-style coding that should improve further (perhaps another
>> factor of 10 since my pyrex version is still using lots of python
>> variables in the main loop).
>
> Well, I first strive for having a paragraph class which:
>
> - is human-readable
> - has a pylint coefficient of >= 7.5/10 (instead of the current
> value of 0.75/10 for RL's paragraph.py ;-)
> - has hyphenation
> - has inline images, and
> - has an extension mechanism for custom tags
>
> Once I have that I'll come back to you for ideas on performance.
> ;-)
Well, the current paragraph is slowly getting inline images, and it has
had a mechanism for non-character things for a long time; that's used
for anchor targets etc. Custom tags presumably means an adjustable
parser, i.e. all non-standard tags get mapped to a user function; that
shouldn't be too hard if there's some agreement about what a tag is
allowed to produce. Hyphenation is just difficult; we need to define
words before we can find hyphenation points. The current paragraph
doesn't actually see words very well. Presumably
"hyp<font color=red>hena</font>te"
is a word, but for efficiency we never actually see "hyphenate" all in
one place, so extracting the word and looking for hyphenation points
is left as an exercise.
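
A rough sketch of one way that exercise might go, in case it helps the
discussion. It assumes the parsed paragraph is available as a plain list
of (text, style) pairs; that representation and the name
words_across_frags are made up for illustration and are not what
paragraph.py actually uses. The idea is to join the fragment texts, find
the words in the joined string, and remember which slice of which
fragment each word came from, so a hyphenation point found in the whole
word can be mapped back to a fragment.

```python
import re

# Hypothetical fragment representation: (text, style) pairs.
# This is an assumption for the sketch, not ReportLab's internal structure.
frags = [("hyp", "black"), ("hena", "red"), ("te now", "black")]

_WORD = re.compile(r"\S+")  # a "word" here is just a run of non-whitespace


def words_across_frags(frags):
    """Yield (word, spans) for each word in the concatenated fragments.

    spans is a list of (frag_index, start, end) slices, in order, whose
    texts join up to give the word.
    """
    # Record the offset at which each fragment starts in the joined text.
    starts, pos = [], 0
    for text, _style in frags:
        starts.append(pos)
        pos += len(text)
    joined = "".join(text for text, _style in frags)

    for m in _WORD.finditer(joined):
        spans = []
        for i, (text, _style) in enumerate(frags):
            # Overlap of the word with fragment i, in joined coordinates.
            lo = max(m.start(), starts[i])
            hi = min(m.end(), starts[i] + len(text))
            if lo < hi:
                spans.append((i, lo - starts[i], hi - starts[i]))
        yield m.group(), spans


for word, spans in words_across_frags(frags):
    print(word, spans)
```

This scans every fragment for every word, so it is not what you would
want in the main loop; it only shows that the word can be recovered
without giving up the per-fragment styling.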
--
Robin Becker