[reportlab-users] RTL Patch Committed

Robin Becker robin at reportlab.com
Thu Nov 19 08:46:30 EST 2009

Previous message: [reportlab-users] RTL Patch Committed
Next message: [reportlab-users] RTL Patch Committed
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

Hosam Aly wrote:
..........

> I have committed a patch to the "rtl-support" branch, adding partial RTL
> support in "src/reportlab/pdfgen/textobject.py" and
> "src/reportlab/platypus/paragraph", based on pyfribidi.

........

>
> Out of 15 test cases, 14 succeeded and 1 failed, which is the sixth one
> (line 6 in the generated PDF). The Arabic words should have appeared at
> the beginning of the line, but instead they appear at the end. This
> issue needs more investigation.
>

thanks and great work.

>
> There is still the issue of connecting letters, but that's more related
> to Complex Text Layout (CTL) than it is to supporting BiDirectional
> rendering. I'll try to tackle that issue next. I am thinking of using
> PyICU instead of PyFriBiDi, but I still have to read more about it.
>
> Meanwhile, I read in the PDF standard (version 1.7 from Adobe) that the
> PDF text object supports receiving UTF-16BE text, provided that it
> starts with the Unicode Byte Order Mark (BOM, U+FEFF). I wonder what
> would be the results if we wrote text in UTF-16 instead of writing the
> code points in the font? I didn't know how to test this, so I hope
> someone can help me.

..........
We have used UTF-16 in some places in pdfdoc.py. I believe that was related to
using the standard CJK fonts in places where Acrobat Reader would normally use
PDFDocEncoding, i.e. various comments and document description sections.
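
For anyone who wants to experiment with the BOM-prefixed form, here is a minimal
sketch in plain Python. It is not the code in pdfdoc.py; the function names are
made up for illustration, and treating pure ASCII as a stand-in for
PDFDocEncoding is a simplifying assumption.

```python
import codecs


def pdf_text_string(text):
    """Return bytes for a PDF text string (hypothetical helper).

    Assumption: pure ASCII is passed through unchanged as a stand-in for
    PDFDocEncoding; anything else becomes BOM + UTF-16BE.
    """
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        # U+FEFF encoded big-endian is the two bytes FE FF.
        return codecs.BOM_UTF16_BE + text.encode("utf-16-be")


def pdf_hex_literal(data):
    """Write bytes as a PDF hexadecimal string, e.g. <FEFF0645>."""
    return "<" + data.hex().upper() + ">"


if __name__ == "__main__":
    # Arabic letter MEEM (U+0645) needs the UTF-16BE form.
    print(pdf_hex_literal(pdf_text_string("\u0645")))
    # Plain ASCII stays as single bytes.
    print(pdf_text_string("Title"))
```

The resulting hex literal could be dropped into a document information entry to
see how a viewer displays it. Whether the same bytes do anything useful inside a
content stream depends on the font's encoding, which this sketch does not cover.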

In principle, there's nothing inherently better about using a 16-bit Unicode
representation for text. I do see that for CMaps there are a lot of predefined
mappings which correspond to various UTF-16 subsets.

When the font is a built-in font, I can see that using a standard encoding makes
sense. That is certainly the case for the standard Acrobat Reader CJK fonts,
where the fonts are large and don't have to be embedded. However, we are often
making up subset fonts for embedding purposes, and there I don't think it makes
sense to use 16-bit entries.
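
To make the size argument concrete, here is a rough sketch comparing the two
approaches. It is not the actual subsetting code; assign_subset_codes and
encode_with_subset are hypothetical names, and the one-byte-per-glyph scheme with
a 256-entry limit is an assumption made for illustration.

```python
import codecs


def assign_subset_codes(text):
    """Give each distinct character the next free one-byte code (hypothetical)."""
    codes = {}
    for ch in text:
        if ch not in codes:
            codes[ch] = len(codes)
    if len(codes) > 256:
        # Assumption: a single subset holds at most 256 glyphs.
        raise ValueError("too many distinct characters for one subset")
    return codes


def encode_with_subset(text, codes):
    """Encode text as one byte per character using the subset's codes."""
    return bytes(codes[ch] for ch in text)


if __name__ == "__main__":
    word = "\u0645\u0631\u062d\u0628\u0627"  # five Arabic letters
    text = word + " " + word
    codes = assign_subset_codes(text)
    subset_bytes = encode_with_subset(text, codes)
    utf16_bytes = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    print(len(codes), len(subset_bytes), len(utf16_bytes))
```

With a subset font the codes only need to be unique within the subset, so one
byte per character is enough as long as the subset stays small.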
--
Robin Becker
