[reportlab-users] Incorrect character composition
Robin Becker
robin at reportlab.com
Fri Apr 17 10:22:34 EDT 2015
Who is responsible for glyph positioning? I believe it is the font plus the
renderer.
I wrote the script below to test various diacritic behaviours in reportlab.
The TL;DR is as follows: the TTF fonts seem to know about diacritics. The Adobe
builtins may or may not know about them, but with our standard encoding
Helvetica clearly doesn't.
The script draws space + glyph + diacritic for some upper and lower case roman
letters. It also draws the same after Unicode normalization (NFC).
Where seen, all the diacritics have zero width. The DejaVuSans font seems to do
slightly better than Arial in centring the common diacritics. Where available,
the composed glyphs (obtained by normalization) seem much better.
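To illustrate what normalization does here, a minimal sketch using only the
standard library: NFC folds a base letter plus combining mark into a single
code point when Unicode defines a precomposed character, and leaves the pair
alone otherwise. The helper name compose is mine, not part of reportlab.

```python
from unicodedata import normalize

def compose(base, mark):
    """Return (NFC string, True if it collapsed to a single code point)."""
    s = normalize('NFC', base + mark)
    return s, len(s) == 1

# A + COMBINING TILDE has a precomposed form, U+00C3
print(repr(compose(u'A', u'\u0303')))
# q + COMBINING ACUTE ACCENT has none, so the pair stays decomposed
print(repr(compose(u'q', u'\u0301')))
```

Only the first kind of pair can benefit from a composed glyph; the second
still depends on the font and renderer placing the mark.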
With no width for centring it would seem we need to examine the curves to get
any kind of centring right. DejaVu & Arial have some built-in negative shifts,
as can be seen by examining the tilde (U+0303):
C:\tmp>python
Python 2.7.8 (default, Jun 30 2014, 16:08:48) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from reportlab.pdfbase.pdfmetrics import registerFont
>>> from reportlab.pdfbase.ttfonts import TTFont
>>> registerFont(TTFont('DejaVuSans','DejaVuSans.ttf'))
>>> from reportlab.graphics.charts.textlabels import _text2PathDescription
>>> p=_text2PathDescription(u'\u0303',fontName='DejaVuSans',fontSize=2048)
>>> p
[('moveTo', -518, 1370), (u'lineTo', -575, 1425), (u'curveTo', -589, 1438, -602, 1448, -613, 1454),
(u'curveTo', -624, 1460, -634, 1464, -643, 1464), (u'curveTo', -668, 1464, -687, 1452, -699, 1427),
(u'curveTo', -711, 1403, -717, 1364, -719, 1309), (u'lineTo', -844, 1309),
(u'curveTo', -843, 1399, -825, 1468, -791, 1517), (u'curveTo', -757, 1566, -710, 1591, -649, 1591),
(u'curveTo', -624, 1591, -601, 1587, -579, 1577), (u'curveTo', -558, 1568, -535, 1552, -510, 1530),
(u'lineTo', -453, 1475), (u'curveTo', -439, 1462, -426, 1452, -414, 1445),
(u'curveTo', -404, 1439, -394, 1436, -385, 1436), (u'curveTo', -360, 1436, -341, 1448, -329, 1472),
(u'curveTo', -317, 1496, -311, 1536, -309, 1591), (u'lineTo', -184, 1591),
(u'curveTo', -185, 1501, -203, 1432, -237, 1382), (u'curveTo', -271, 1334, -318, 1309, -379, 1309),
(u'curveTo', -404, 1309, -427, 1313, -449, 1323), (u'curveTo', -470, 1332, -493, 1348, -518, 1370),
'closePath']
>>> registerFont(TTFont('Arial','Arial.ttf'))
>>> pa=_text2PathDescription(u'\u0303',fontName='Arial',fontSize=2048)
>>> pa
[('moveTo', -909, 1547), (u'curveTo', -909, 1615, -891, 1670, -853, 1712),
(u'curveTo', -816, 1754, -767, 1775, -706, 1775), (u'curveTo', -665, 1775, -609, 1757, -537, 1721),
(u'curveTo', -498, 1701, -467, 1691, -443, 1691), (u'curveTo', -403, 1691, -378, 1720, -370, 1778),
(u'lineTo', -240, 1778), (u'curveTo', -244, 1626, -309, 1550, -436, 1550),
(u'curveTo', -478, 1550, -533, 1568, -602, 1606), (u'curveTo', -646, 1630, -679, 1642, -700, 1642),
(u'curveTo', -752, 1642, -778, 1611, -776, 1547), (u'lineTo', -909, 1547), 'closePath']
>>>
That is, the DejaVuSans curve starts at x = -518/2048 and extends at least as
far left as -844/2048 (its rightmost point is at -184/2048), but it's clear no
single shift can match the various upper and lower case widths that could
occur. The Arial curve is even more negative (-240/2048 to -909/2048).
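The extents quoted here can be read off mechanically. Below is a rough sketch,
assuming the path description has the list-of-tuples form printed in the
session above; x_extent is a hypothetical helper, not a reportlab function,
and the sample list is only an excerpt of the DejaVuSans output.

```python
def x_extent(path):
    """Return (xmin, xmax) over all points in a path description.

    Control points of curveTo operators are included, so this is the
    extent of the control polygon, which bounds the true outline.
    """
    xs = []
    for op in path:
        if isinstance(op, tuple):      # 'closePath' appears as a bare string
            xs.extend(op[1::2])        # op = (name, x0, y0, x1, y1, ...)
    return min(xs), max(xs)

# A few operators copied from the DejaVuSans tilde output above
sample = [
    ('moveTo', -518, 1370), ('lineTo', -575, 1425),
    ('curveTo', -711, 1403, -717, 1364, -719, 1309), ('lineTo', -844, 1309),
    ('curveTo', -317, 1496, -311, 1536, -309, 1591), ('lineTo', -184, 1591),
    'closePath',
]
xmin, xmax = x_extent(sample)
print('xmin=%s xmax=%s centre=%s' % (xmin, xmax, (xmin + xmax) / 2.0))
```

For the DejaVuSans tilde this puts the horizontal centre of the mark at about
-514/2048 relative to the pen position.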
If a combined glyph is in the font we should use it, but I'm not sure we even
have an API for that. TTFont has charToGlyph (Unicode --> glyph number), but we
also have code to escape if there are no glyph components defined for it, so
the test is quite hard.
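As a sketch of what such a test might look like, the function below normalizes
the pair and checks the result against a code point --> glyph number mapping.
The function name and the char_to_glyph argument are hypothetical; how to get
a reliable mapping out of a registered TTFont is exactly the open question,
and the glyph-component problem mentioned above is not addressed here.

```python
from unicodedata import normalize

def precomposed_in_font(base, mark, char_to_glyph):
    """Return the composed character if NFC yields a single code point
    that the font maps to a glyph, else None.

    char_to_glyph: mapping of code point -> glyph number (hypothetical
    input; in reportlab it might come from the TTFont's charToGlyph data).
    """
    s = normalize('NFC', base + mark)
    if len(s) == 1 and ord(s) in char_to_glyph:
        return s
    return None

# Made-up glyph numbers, for illustration only
fake_map = {0x41: 36, 0xC3: 170}
print(repr(precomposed_in_font(u'A', u'\u0303', fake_map)))  # composed form found
print(repr(precomposed_in_font(u'E', u'\u0303', fake_map)))  # composed form missing
```

When the function returns None the caller would fall back to drawing the base
and the combining mark separately.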
Otherwise, generating a missing combined glyph dynamically is probably the way
to go, but to do that we need information about how each combining character is
supposed to be positioned. The alternative is to attempt to do the adjustment
every time we render text using pdf operators; we still need the same information.
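To make the "no single shift" point concrete, here is a small arithmetic
sketch. It assumes, as a simplification, that a mark should be centred over
the advance width of the preceding base glyph; proper placement would need
per-glyph positioning data. The function name and the advance widths are
hypothetical; only the tilde extent comes from the session above.

```python
def centring_shift(base_advance, mark_xmin, mark_xmax):
    """Extra x offset (font units) that centres a zero-width mark over
    the advance width of the preceding base glyph.

    The mark is drawn at the pen position after the base, i.e. at
    x = base_advance, and its outline spans mark_xmin..mark_xmax
    relative to that position.
    """
    mark_centre = (mark_xmin + mark_xmax) / 2.0
    return base_advance / 2.0 - (base_advance + mark_centre)

# DejaVuSans tilde extent from the session above: -844 .. -184
for advance in (600, 1028, 1400):   # hypothetical base advance widths
    print('advance=%s shift=%s' % (advance, centring_shift(advance, -844, -184)))
```

Under this assumption the built-in shift is only right for a base glyph about
1028 units wide; narrower glyphs need a positive correction and wider ones a
negative correction.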
#################################################################
from reportlab.pdfbase.ttfonts import TTFont
from reportlab.pdfbase.pdfmetrics import registerFont
from reportlab.pdfgen.canvas import Canvas
from reportlab.lib.pagesizes import A4 as pagesize
from reportlab.lib.utils import uniChr
from unicodedata import normalize as unormalize

# Arial.ttf and DejaVuSans.ttf must be findable on the TTF search path
registerFont(TTFont("Arial", "Arial.ttf"))
registerFont(TTFont("DejaVuSans", "DejaVuSans.ttf"))
c = Canvas('tdiacritics.pdf', pagesize=pagesize)
y0 = pagesize[1]-12
for fontName in ('Arial','DejaVuSans','Helvetica'):
    # first pass: base letter followed by the combining diacritic
    c.setFont(fontName, 10)
    y = y0
    y -= 12
    c.drawString(18,y,fontName)
    for diacritic in range(0x300,0x370):
        if y-24 < 0:
            # page is full, start a new one and repeat the heading
            c.showPage()
            c.setFont(fontName, 10)
            y = y0
            y -= 12
            c.drawString(18,y,fontName)
        y -= 12
        x = 18
        diacritic = uniChr(diacritic)
        c.drawString(x,y,hex(ord(diacritic)))
        x += 40
        # the diacritic on its own, plus its reported width
        u = u' '+diacritic+(u' w=%s'%c.stringWidth(diacritic))
        c.drawString(x,y,u)
        x += max(c.stringWidth(u),40)
        for g in u'AEIOUYaeiouy':
            u = ' '+g+diacritic
            c.drawString(x,y,u)
            x += 20
    c.showPage()
    # second pass: the same strings after NFC normalization
    c.setFont(fontName, 10)
    y = y0
    y -= 12
    c.drawString(18,y,fontName+' normalized')
    for diacritic in range(0x300,0x370):
        if y-24 < 0:
            c.showPage()
            c.setFont(fontName, 10)
            y = y0
            y -= 12
            c.drawString(18,y,fontName+' normalized')
        y -= 12
        x = 18
        diacritic = uniChr(diacritic)
        c.drawString(x,y,hex(ord(diacritic)))
        x += 40
        u = u' '+diacritic+(u' w=%s'%c.stringWidth(diacritic))
        c.drawString(x,y,u)
        x += max(c.stringWidth(u),40)
        for g in u'AEIOUYaeiouy':
            u = unormalize('NFC',' '+g+diacritic)
            c.drawString(x,y,u)
            x += 20
    c.showPage()
c.save()
#################################################################
--
Robin Becker