[reportlab-users] Incorrect character composition

Mon Apr 20 05:20:15 EDT 2015

........
> The problem is that ReportLab doesn't embed the font directly.  Instead
> it constructs multiple subsets (each with < 256 codepoints), and those
> subsets constructed by ReportLab do not have GPOS information (check the
> TTFontFile.makeSubset method to see what TTF tables are copied and how
> they're transformed; my apologies about the terrible code you'll find
> therein).
>
> The GPOS table cannot be copied directly: subsetting changes glyph
> numbering, so the GPOS table would have to be taken apart and
> reconstructed with the renumbered glyphs.
>
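One way to check whether a font file, or a subset written out to disk, carries a GPOS table at all is to read the sfnt table directory directly. The following is only a minimal sketch using the standard library; the function names are made up for illustration and are not part of ReportLab, and it assumes a single-font TrueType/OpenType file (not a .ttc collection).

```python
import struct


def sfnt_table_tags(data):
    """Return the table tags listed in an sfnt table directory.

    Hypothetical helper, not a ReportLab API. Assumes a single-font
    TrueType/OpenType file; .ttc collections are not handled.
    """
    # Offset table: 4-byte sfnt version, uint16 numTables, then three
    # more uint16 fields (searchRange, entrySelector, rangeShift).
    if len(data) < 12:
        raise ValueError("data too short for an sfnt header")
    (num_tables,) = struct.unpack(">H", data[4:6])
    tags = []
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        start = 12 + 16 * i
        record = data[start:start + 16]
        if len(record) < 16:
            raise ValueError("truncated table directory")
        tags.append(record[:4].decode("latin-1"))
    return tags


def has_gpos(path):
    """True if the font file at `path` lists a GPOS table."""
    with open(path, "rb") as f:
        return "GPOS" in sfnt_table_tags(f.read())
```

Calling has_gpos() on the original font and on a generated subset would show whether the table survived; it says nothing about whether the table contents are still valid after glyph renumbering.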

Well, I guess the way to go is:

1) Try an experiment to see whether PDF renderers will accept the GPOS
information in a specific font and make good use of it. I guess we can use
Illustrator or an equivalent to make a sample document. Examining the
DejaVuSans font shows it certainly has GPOS information.

2) If the answer to 1 is yes, then we'll need to parse the GPOS information and
construct subsets that keep the required pairs together. From my understanding
of the way PDF uses text, I see little hope of constructing a single font that
does this for all glyphs in a simple way (section 3.2.3 of the PDF 1.7 spec says
"A string object consists of a series of bytes—unsigned integer values in the 
range 0 to 255"), so we're apparently limited to encodings of 256 codes or
fewer. Presumably we'll have to be really smart about constructing our
encodings if many glyph+diacritic pairs are used.
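
To make the idea in 2) a bit more concrete, here is a rough sketch of assigning text to subsets so that a base character and its diacritics always land in the same subset. It is not ReportLab's subsetting code; the function names are hypothetical. It works on Unicode characters and uses unicodedata.combining() as a stand-in for real GPOS mark-attachment data, which is an assumption. A character may be placed in more than one subset.

```python
import unicodedata

MAX_CODES = 256  # one-byte string codes, per the limit discussed above


def clusters(text):
    """Split text into base + combining-mark clusters (strings)."""
    out = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch  # attach the mark to the preceding base
        else:
            out.append(ch)
    return out


def assign_subsets(text, max_codes=MAX_CODES):
    """First-fit assignment of clusters to subsets of <= max_codes chars.

    Returns (subsets, placement): subsets is a list of sets of
    characters, placement is a list of (cluster, subset_index).
    """
    subsets = []
    placement = []
    for cluster in clusters(text):
        needed = set(cluster)
        if len(needed) > max_codes:
            raise ValueError("cluster does not fit in any subset")
        for index, subset in enumerate(subsets):
            # The whole cluster must fit, so pairs are never split.
            if len(subset | needed) <= max_codes:
                subset |= needed
                break
        else:
            subsets.append(set(needed))
            index = len(subsets) - 1
        placement.append((cluster, index))
    return subsets, placement
```

First-fit is the simplest choice here; with many glyph+diacritic pairs it will duplicate bases and marks across subsets, so something smarter would probably be needed in practice.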
-- 
Robin Becker