[reportlab-users] Incorrect character composition

Mon Apr 20 08:32:18 EDT 2015

On 20/04/2015 11:54, Glenn Linderman wrote:
> On 4/20/2015 2:20 AM, Robin Becker wrote:
>> ........
..........
>> 1) try an experiment to see if PDF renderers will accept the GPOS information
>> in a specific font and make good use of it. I guess we can use illustrator or
>> equivalent to make a sample document. Examining the dejaVuSans font shows it
>> certainly has GPOS information.
>
> Maybe. The attempt will also be instructive regarding how Illustrator might
> handle such combined characters... if it does (I don't have Illustrator to test
> with, but since it is from Adobe, it well might)... and what the generated PDF
> looks like... if it contains positioning instructions, or depends on the PDF
> display tools to have a good renderer.
>
Yes, that's what I wanted to find out, i.e. a) does the GPOS info help, b) will
renderers take notice of it, and c) how does a 'proper' implementation of
subsets work.
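
As a quick check for whether a given font carries a GPOS table at all, something
like the following should do. It is only a sketch: it reads the sfnt table
directory with the standard library, it does not handle font collections, and
the function name sfnt_table_tags is made up for this example. It says nothing
about whether a renderer will actually use the table.

```python
import struct


def sfnt_table_tags(data):
    """Return the table tags listed in an sfnt (TrueType/OpenType) table directory."""
    # Header: sfntVersion (4 bytes), then numTables, searchRange,
    # entrySelector and rangeShift (one uint16 each), 12 bytes in total.
    _version, num_tables = struct.unpack_from(">4sH", data, 0)
    tags = []
    for i in range(num_tables):
        # Each 16-byte table record holds: tag, checksum, offset, length.
        tag, _checksum, _offset, _length = struct.unpack_from(
            ">4sLLL", data, 12 + 16 * i
        )
        tags.append(tag.decode("latin-1"))
    return tags
```

Reading the bytes of a font file (for example a local copy of DejaVuSans.ttf,
path assumed) and testing "GPOS" in sfnt_table_tags(data) answers the first
part of the question.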


>>
>> 2) If the answer to 1 is yes then we'll need to parse the GPOS information and
>.........
>> encodings if many glyph+diacritic pairs are used.
>
> If #2 applies, such an analysis of encodings is probably best done after seeing
> all the combinations used in the file.  Would it make sense to have an iteration
> inside build() just to collect all the characters used in a document for such an
> analysis? I've really no clue at what iteration the current font subset
> generation takes place, whether it is first, last or somewhere in the middle...
> nor do I have a clue if more characters get added in various phases due to
> repagination, etc.
....
I'm not certain any extra pass will be required unless we want to be 'optimal'
in some sense. Currently we make subsets on demand, i.e. in response to the
glyphs that are actually used. If we see usage of diacritics we will need to
ensure that splitString(text,doc) --> (subset0, bytes0),... does the right
thing: if it sees a pair of glyphs that must be in the same font, it will have
to guarantee that, even if it means a particular glyph gets duplicate mappings.
Normally we create a new subset only when the previous one gets filled, but I
suspect we may need to allow subsets to be created in more places and for more
reasons. Luckily all these 'dynamic' fonts are lazily constructed afterwards,
so we could maintain separate diacritic usage subsets if needed.
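
To make the idea concrete, here is a toy model of on-demand subsetting that
keeps a base glyph and its combining marks in the same subset. It is not the
ReportLab code; SubsetAllocator, clusters and split_string are names invented
for this sketch, it works on characters rather than glyph ids, and it assumes a
cluster never needs more slots than one subset can hold.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out


class SubsetAllocator:
    """Toy on-demand subset allocator (hypothetical, for illustration only)."""

    def __init__(self, size=256):
        self.size = size      # number of one-byte codes per subset
        self.subsets = []     # each subset maps character -> code

    def _subset_for(self, cluster):
        needed = set(cluster)
        # Prefer a subset that already holds every character of the cluster.
        for n, subset in enumerate(self.subsets):
            if all(ch in subset for ch in needed):
                return n
        # Otherwise take the first subset with room for the missing ones.
        for n, subset in enumerate(self.subsets):
            missing = sum(1 for ch in needed if ch not in subset)
            if len(subset) + missing <= self.size:
                return n
        # No subset fits: open a new one, even if the previous is not full.
        self.subsets.append({})
        return len(self.subsets) - 1

    def split_string(self, text):
        """Return [(subset_index, bytes), ...] with clusters kept together."""
        runs = []
        for cluster in clusters(text):
            n = self._subset_for(cluster)
            subset = self.subsets[n]
            for ch in cluster:
                # A character may end up with mappings in several subsets.
                subset.setdefault(ch, len(subset))
            codes = bytes(subset[ch] for ch in cluster)
            if runs and runs[-1][0] == n:
                runs[-1] = (n, runs[-1][1] + codes)
            else:
                runs.append((n, codes))
        return runs
```

With a deliberately tiny subset size, SubsetAllocator(size=3) applied to
"abe\u0301a\u0301" opens a second subset for the e+acute pair before the first
one is full, and later maps the acute a second time into the first subset so
that a+acute can stay together.
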
-- 
Robin Becker