[reportlab-users] Incorrect character composition
Glenn Linderman
v+python at g.nevcal.com
Wed Apr 15 04:01:40 EDT 2015
On 4/14/2015 10:25 PM, Marius Gedminas wrote:
> On Tue, Apr 14, 2015 at 12:05:04PM -0700, Glenn Linderman wrote:
>> 6-7 weeks with no response, for a while I thought the list was dead, but now
>> a flurry of messages....
>>
>> I guess I didn't actually ask a question, but is this, like kerning, thought
>> to be too slow to implement, or is it just that the market for reportlab
>> simply doesn't include languages that don't have precomposed glyphs, or
>> something else?
> When Vika and I originally implemented Unicode + TrueType support in
> ReportLab, we didn't implement support for combining characters. I
> don't remember whether the TTF/PDF specifications available at that time
> included such support.
>
> I guess nobody stepped up to add the missing support since then.
>
> Technical details (that might be wrong if the code changed since 2003,
> which it probably did--I wasn't keeping track): ReportLab takes apart
> the TTF and builds multiple fonts, each containing a subset of the
> original glyphs (up to 256). These subsets discard any and all TTF
> tables not explicitly copied, which I guess include the tables used for
> rendering combining characters in a nice way.
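To picture the subsetting described above, here is a minimal sketch, not ReportLab's actual code: each distinct character is given a slot in a numbered subset of at most 256 entries. The function name assign_subsets and the order-of-first-use numbering are assumptions made for illustration only.

```python
def assign_subsets(text, subset_size=256):
    """Map each distinct character to (subset_index, code_within_subset).

    Hypothetical illustration: slots are handed out in order of first use,
    and a new subset starts once the previous one holds subset_size glyphs.
    """
    mapping = {}
    for ch in text:
        if ch not in mapping:
            n = len(mapping)
            mapping[ch] = (n // subset_size, n % subset_size)
    return mapping
```

In a scheme like this a base letter and its combining mark are just two unrelated slots, so any positioning data that tied them together in the original font is gone unless it is carried over separately.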
Thanks for the response and explanation of the history... By the time I
started using reportlab, TTF support already existed. Maybe no one else
knew that combining character support didn't exist.
In looking at the details of PDF files, it seems that when kerning _is_
supported, it is not done by copying TTF kerning tables into PostScript
fonts, but rather by explicitly coding the distance to advance the caret
between glyphs when laying down text.
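As a rough sketch of what that explicit coding can look like: the PDF TJ operator takes an array of strings and numbers, where each number is in thousandths of a text-space unit and is subtracted from the current position, so a positive number tightens the gap. The helper names and the kerning values below are made up for illustration, and the sketch assumes simple single-byte text and kerning values already scaled to 1000 units per em.

```python
def pdf_escape(s):
    """Escape the characters that are special inside a PDF literal string."""
    return s.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def tj_array(text, kern_pairs):
    """Build a TJ operator with explicit kerning adjustments.

    kern_pairs is a hypothetical dict mapping (left, right) character pairs
    to a kerning value in 1/1000 em, negative meaning "move closer".
    TJ uses the opposite sign, so the value is negated on output.
    """
    parts = []
    run = ""
    for i, ch in enumerate(text):
        run += ch
        if i + 1 < len(text):
            adj = kern_pairs.get((ch, text[i + 1]), 0)
            if adj:
                # Close the current string and emit the adjustment.
                parts.append("(%s)" % pdf_escape(run))
                parts.append(str(-adj))
                run = ""
    if run:
        parts.append("(%s)" % pdf_escape(run))
    return "[" + " ".join(parts) + "] TJ"


# Hypothetical kerning values, for illustration only.
print(tj_array("AWAY", {("A", "W"): -80, ("W", "A"): -80}))
```

The font embedded in the file needs no kerning table for this to work; the producing tool has already done the lookup.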
While I was waiting for a response, I was speculating about how
combining characters might be coded in PDF files, but I haven't found
any that contain combining characters that I have been able to take
apart and examine (and I'm no expert at such examination).
I speculated that if kerning tables are not copied and used by the
display engines, then the X and Y positioning data for combining
characters probably would not be either. The conclusion of that
speculation was that it would be the responsibility of the PDF file
creation tool, which has all the font tables available, to use the
kerning feature to adjust the X position, and possibly to position the
mark absolutely in Y. I didn't find (but may have overlooked) any
technique in the PDF specification for adjusting the Y position of the
combining character short of simply making it its own character stream,
with an adjusted Y baseline.
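One candidate for the Y adjustment is the text rise operator (Ts) in the PDF text state, which shifts the baseline of subsequently shown glyphs up or down in unscaled text-space units. The sketch below only assembles the operator text; it is untested against real fonts, and the function names, the offsets dx and dy, and the advance widths are hypothetical values that a real tool would have to derive from the font's own tables.

```python
def fmt(x):
    """Format a number compactly for a content stream (no trailing zeros)."""
    return ("%.3f" % x).rstrip("0").rstrip(".")


def mark_ops(base, mark, font_size, base_advance, mark_advance, dx, dy):
    """Show base, then mark offset by (dx, dy) from the base glyph's origin.

    All widths and offsets are in 1/1000 em (hypothetical inputs).
    X is adjusted with TJ numbers, Y with the text rise operator Ts.
    """
    rise = dy * font_size / 1000.0               # Ts takes unscaled units
    back = base_advance - dx                     # pull pen back to mark origin
    restore = dx + mark_advance - base_advance   # return pen to end of base
    return "(%s) Tj %s Ts [%s (%s) %s] TJ 0 Ts" % (
        base, fmt(rise), fmt(back), mark, fmt(restore))


# Made-up metrics: base 500 wide, zero-width mark, centered and raised.
print(mark_ops("o", "~", 12, 500, 0, 250, 200))
```

After the mark is drawn, the rise is reset to 0 and the pen is back where the base glyph left it, so the following text is unaffected.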
>> On 2/21/2015 1:18 PM, Glenn Linderman wrote:
>>> I've suddenly discovered a need to use Unicode characters that do not fall
>>> into the category of "precomposed glyphs", instead being forced to use
>>> "combining characters" for certain diacritical marks.
>>>
>>> However, the combined result from reportlab looks rather stupid compared
>>> to the results seen in other programs (browsers, text editors, word
>>> processors, etc.). I even displayed the results in two different PDF
>>> viewers, Sumatra and Adobe, before concluding it must be a reportlab
>>> thing.
>>>
>>> The problem is with the characters called open o (upper and lower case),
>>> and open e (at least lower case, the upper case version looks better, but
>>> that may be more due to the open E being narrower than due to proper
>>> handling) when combined with the combining tilde, and other similar
>>> diacriticals.
>>>
>>> In my sample at http://nevcal.com/temporary/openo.pdf I've also included a
>>> precomposed ã and Õ as well, for comparison of where the tilde should be
>>> placed. Here are the same characters in email... I note that in my email
>>> client (Thunderbird) the precomposed tildes are slightly closer to the
>>> characters than the combining tilde, but in the reportlab-generated PDF,
>>> the lower case combining tildes are far too high, and those over (wider)
>>> upper case characters are not centered. The font is Times New Roman in
>>> both this email (unless your client or the mailing list strips the
>>> fonts) and the PDF.
>>>
>>> Glenn
>>>
>>> ɔãɔ̃ÕƆ̃ɛɛ̃Ɛ̃
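For anyone wanting to check which of the sample characters can avoid the problem, a small sketch using Python's standard unicodedata module: NFC normalization folds a base letter plus combining mark into a precomposed code point where Unicode defines one, and whatever combining marks remain are the ones that need real positioning support. The function name leftover_combining is made up for this example.

```python
import unicodedata


def leftover_combining(text):
    """Return the combining marks that survive NFC normalization.

    An empty result means every base+mark pair had a precomposed form.
    """
    composed = unicodedata.normalize("NFC", text)
    return [ch for ch in composed if unicodedata.combining(ch)]


# a + combining tilde composes to a single precomposed character.
print(leftover_combining("a\u0303"))
# open o (U+0254) + combining tilde has no precomposed form.
print(leftover_combining("\u0254\u0303"))
```

Normalizing first is only a workaround for letters like a and O; open o and open e with a tilde still come out as two code points.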