[reportlab-users] TTF Problem

Tim Roberts timr at probo.com
Mon Feb 27 14:04:01 EST 2006


On Mon, 27 Feb 2006 11:58:17 -0000, "Ian Millington"
<reportlab-user at agon.com> wrote:

>In symbol fonts, the characters seemed to have the correct code, but with
>0xf000 added to each value. 
>
>From what I can tell this is part of the TrueType Specification - it is the
>'user' area, and is recommended for use in symbol or dingbat fonts
>(wingding.ttf has this format too).
>
>Most Windows applications (all that use standard GDI font-handling routines)
>automatically subtract 0xf000 from the code point when dealing with one of
>these fonts, so mapping the symbols onto the regular ASCII code values.
>Because this is automatic, most vendors' tables of symbols are in turn given
>with the 0xf000 already removed (including tools like Character Map, which
>give the subtracted code point). This is why my font worked fine in Word,
>Illustrator, etc, but not in Reportlab.
>
>Could I suggest this information is placed somewhere where folk may find it?
>Is there a repository of wisdom somewhere?
>
>I don't think Reportlab should join Microsoft in supporting the subtraction,
>because it seems like changing around code-points automagically might be
>asking for subtle bugs.
>


This is all working as designed; there is nothing particularly subtle
about it.  TrueType font files use Unicode internally.  Since most of
the applications in the world are not yet fully Unicode-ized, there has
to be some mapping from the 8-bit character sets everyone uses to the
Unicode code points in the TTF files.  For character sets like ANSI or
iso-8859-1 or whatever, the operating system has tables that map the
character-set codes to Unicode code points, so they may be found in the
TrueType files.  (Python's encode/decode stuff also uses such tables.)
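
A minimal sketch of that kind of table lookup in Python, assuming the
standard cp1252 (Windows "ANSI", Western European) and iso-8859-1 codecs.
The helper name to_code_points is made up for illustration:

```python
def to_code_points(data, charset):
    """Map 8-bit character-set codes to Unicode code points.

    Hypothetical helper: it leans on Python's built-in codec tables,
    which play the same role as the operating system's mapping tables.
    """
    return [ord(ch) for ch in data.decode(charset)]


# The same byte value can land on different code points depending on
# which character set the application assumes.
print([hex(cp) for cp in to_code_points(b"\x80", "cp1252")])      # ['0x20ac']
print([hex(cp) for cp in to_code_points(b"\xe9", "iso-8859-1")])  # ['0xe9']
```

The resulting code points are what would be looked up in the TrueType
file to find the glyphs.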

Symbol fonts have the same basic problem to solve.  The Symbol font
character set is an 8-bit character set that runs from 0x20 to 0xFF. 
The mapping from that 8-bit character set to Unicode happens to be much
simpler than most character set mappings: you just add 0xF000 to the
character code to get the Unicode code point.
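
The mapping just described can be sketched in a few lines of Python. The
names SYMBOL_OFFSET, symbol_to_unicode and symbol_text are hypothetical,
and the range check simply assumes the 0x20 to 0xFF range given above:

```python
SYMBOL_OFFSET = 0xF000  # offset applied to symbol-font character codes


def symbol_to_unicode(code):
    """Return the Unicode character for one 8-bit symbol-font code."""
    if not 0x20 <= code <= 0xFF:
        raise ValueError("symbol code out of range: %#x" % code)
    return chr(SYMBOL_OFFSET + code)


def symbol_text(data):
    """Convert a bytes string of symbol-font codes to a Unicode string."""
    return "".join(symbol_to_unicode(b) for b in data)


# 0x41 is the code a symbol table would typically list for this glyph;
# the code point stored in the font is 0xF041.
print(hex(ord(symbol_to_unicode(0x41))))  # 0xf041
```

An application that does not apply the offset automatically would need a
conversion along these lines before looking up glyphs in the font.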

Now, it IS possible to create TrueType files that do not use Unicode
code points.  The TrueType tables allow one to create "custom"
encodings.  However, Windows doesn't support them.  It is very picky
about which encodings it will allow.

-- 
Tim Roberts, timr at probo.com
Providenza & Boekelheide, Inc.


