[reportlab-users] Incorrect character composition
Robin Becker
robin at reportlab.com
Mon Apr 20 08:32:18 EDT 2015
On 20/04/2015 11:54, Glenn Linderman wrote:
> On 4/20/2015 2:20 AM, Robin Becker wrote:
>> ........
..........
>> 1) try an experiment to see if PDF renderers will accept the GPOS information
>> in a specific font and make good use of it. I guess we can use illustrator or
>> equivalent to make a sample document. Examining the dejaVuSans font shows it
>> certainly has GPOS information.
>
> Maybe. The attempt will also be instructive regarding how Illustrator might
> handle such combined characters... if it does (I don't have Illustrator to test
> with, but since it is from Adobe, it well might)... and what the generated PDF
> looks like... if it contains positioning instructions, or depends on the PDF
> display tools to have a good renderer.
>
Yes, that's what I wanted to find out, i.e. a) does the GPOS info help, b) will
renderers take notice, and c) how does a 'proper' implementation of subsets work?
>>
>> 2) If the answer to 1 is yes then we'll need to parse the GPOS information and
>.........
>> encodings if many glyph+diacritic pairs are used.
>
> If #2 applies, such an analysis of encodings is probably best done after seeing
> all the combinations used in the file. Would it make sense to have an iteration
> inside build() just to collect all the characters used in a document for such an
> analysis? I've really no clue at what iteration the current font subset
> generation takes place, whether it is first, last or somewhere in the middle...
> nor do I have a clue if more characters get added in various phases due to
> repagination, etc.
....
I'm not certain any extra pass will be required unless we want to be 'optimal'
in some sense. Currently we make subsets on demand, i.e. in response to the
glyphs that are actually used. If we see usage of diacritics we will need to
ensure that splitString(text,doc) --> (subset0, bytes0),... does the right
thing: if it sees a pair of glyphs that must be in the same font, it will have
to keep them together, even if that means a particular glyph gets duplicate
mappings (i.e. it appears in more than one subset). Normally we create a new
subset only when the previous one gets filled, but I suspect we may need to
allow subsets to be created in more places and for more reasons. Luckily all
these 'dynamic' fonts are lazily constructed afterwards, so we could maintain
separate diacritic usage subsets if needed.
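
As a rough sketch of the idea (a toy, not the actual ReportLab code): the
names ToySubsetter, clusters and split_string below are made up for
illustration, a "glyph" is simply a Unicode character, and a subset is a list
whose index is the byte code. It assumes that a base character plus its
following combining marks is the unit that must stay in one subset.

```python
import unicodedata


def clusters(text):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch          # attach the mark to the previous base
        else:
            out.append(ch)
    return out


class ToySubsetter:
    """Hypothetical on-demand subset allocator (illustration only)."""

    def __init__(self, size=256):
        self.size = size           # assumed capacity: one byte code per glyph
        self.subsets = [[]]        # subsets[n][code] -> character

    def _place(self, cluster):
        needed = list(dict.fromkeys(cluster))   # distinct chars, order kept
        if len(needed) > self.size:
            raise ValueError("cluster does not fit in one subset")
        # First subset that can hold the whole cluster wins.
        for n, subset in enumerate(self.subsets):
            missing = [c for c in needed if c not in subset]
            if len(missing) <= self.size - len(subset):
                break
        else:
            # No existing subset has room: open a new one, even though some
            # of these characters may already be mapped elsewhere.
            self.subsets.append([])
            n, subset, missing = len(self.subsets) - 1, self.subsets[-1], needed
        subset.extend(missing)
        return n, bytes(subset.index(c) for c in cluster)

    def split_string(self, text):
        """Return [(subset_number, codes), ...] with clusters kept whole."""
        runs = []
        for cluster in clusters(text):
            n, codes = self._place(cluster)
            if runs and runs[-1][0] == n:
                runs[-1] = (n, runs[-1][1] + codes)   # extend the current run
            else:
                runs.append((n, codes))
        return runs


# Tiny subsets (3 slots) make the duplicate mapping easy to see.
sub = ToySubsetter(size=3)
runs = sub.split_string("eabe\u0301")
print(runs)          # 'e' + combining acute cannot fit in subset 0
print(sub.subsets)   # so 'e' ends up mapped in both subsets
```

With a capacity of 3 the first subset fills with e, a, b, so the final
e + combining acute pair forces a second subset in which e is mapped again.
That is the duplicate mapping mentioned above, and it happens without any
extra pass over the document.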
--
Robin Becker