[reportlab-users] Re: Using UTF-8 strings with ReportLab.

Robin Becker robin at reportlab.com
Wed Dec 21 03:49:38 EST 2005


Andy Robinson wrote:
>> That's good to hear; as I've posted previously, I have UTF-8 working 
>> perfectly with ReportLab using TTF fonts - the only thing I'd ask if 
>> at all possible is that I could give a series of fonts for a string 
>> and then if a particular glyph was missing from the first, it would go 
>> down the line trying each font in turn. On PledgeBank, I have to use 
>> different fonts manually in order to get characters to display.
> 
> 
> I'm glad you mentioned that.
> 
> We will try to get a new wiki up to collect all the use cases.  My
> goals for the very first 'version 2.0' are modest enough:  utf8
> and unicode strings should go through uncorrupted, and text should
> come out right whether used in paragraphs, drawString, table cells
> and graphics.  (Right now you have to escape things differently
> in different places, and we have magic escapes for greek letters
> and symbols when one could just use the Unicode characters). That's
> why the new version number - it will break old code.


I beg to differ on this point. In the utf8 branch all characters are 
supposed to be uniformly usable everywhere, provided they are passed as 
a utf8-encoded string or as a unicode object. Paragraphs have one 
additional facility, i.e. some XML notations can be used there. That 
should not be considered a problem, since nobody is forced to use them.
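
To illustrate what "uniformly usable" means for the two input types, here 
is a minimal sketch in present-day Python 3, where unicode objects are 
`str` and utf8 strings are `bytes`. The helper name `as_text` is 
hypothetical and is not part of ReportLab; it only shows both forms being 
normalised to the same text before anything is drawn.

```python
def as_text(value, encoding="utf-8"):
    """Return `value` as str; bytes are decoded as UTF-8.

    Hypothetical helper for illustration, not a ReportLab function.
    """
    if isinstance(value, bytes):
        # Invalid byte sequences raise UnicodeDecodeError rather than
        # being silently corrupted.
        return value.decode(encoding)
    if isinstance(value, str):
        return value
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


# The same Greek text supplied in both forms ends up identical.
from_bytes = as_text("α + β".encode("utf-8"))
from_text = as_text("α + β")
same = from_bytes == from_text
```

With this kind of normalisation at the entry point, drawing code only ever 
has to deal with one text type.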

The main remaining difficulty with the utf8 branch is how to deal with 
out-of-band characters, i.e. characters for which the chosen font has no 
glyph. Not every utf8/unicode font supports every glyph, so there is a 
kind of fallback mechanism in place which eventually ends up at the 
missing glyph. Even that is not uniform amongst fonts: we do it manually 
for T1 fonts and seemingly leave it to the canvas for TTF.
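
The fallback idea discussed here (try each font in turn, end at the 
missing glyph) can be sketched roughly as below. This is not the code in 
the utf8 branch: the font names and coverage sets are made up, and 
`pick_font` and `split_into_runs` are hypothetical helpers. With real 
fonts the coverage would have to come from each font's character map.

```python
from itertools import groupby


def pick_font(ch, fonts):
    """Return the first font name whose coverage includes `ch`, else None."""
    for name, coverage in fonts:
        if ord(ch) in coverage:
            return name
    return None  # no font in the chain has this glyph


def split_into_runs(text, fonts):
    """Group consecutive characters that resolve to the same font."""
    runs = []
    for name, chars in groupby(text, key=lambda ch: pick_font(ch, fonts)):
        runs.append((name, "".join(chars)))
    return runs


# Hypothetical fallback chain, in priority order (assumed coverage).
FONTS = [
    ("LatinFont", set(range(0x20, 0x7F))),      # printable ASCII
    ("GreekFont", set(range(0x0370, 0x0400))),  # Greek and Coptic block
]

runs = split_into_runs("pi = π!", FONTS)
# runs == [("LatinFont", "pi = "), ("GreekFont", "π"), ("LatinFont", "!")]
```

Each run could then be drawn with its own font, and a run whose font is 
None is where a uniform missing-glyph policy would apply.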

> 
> Part of the challenge is to make more easily subclassable 'text 
> handlers' for drawing strings and paragraphs, so one can do what
> you want above, and also produce hyphenation/breaking algorithms
> not in the core.
> 
> There's also a reasonable chance the first cut will run like a
> snail because of this, but once the strategies are clear we
> can move font lookups into _rl_accel.
> 
> 
> 
>> Yay. :-)
> 
> 
> Yes, slightly hungover this morning ;-)
> 
> 
> - Andy


-- 
Robin Becker


