[reportlab-users] RTL Patch Committed

Yoann Roman yroman-reportlab at altalang.com
Mon Nov 23 15:04:25 EST 2009


Andy Robinson wrote...

> 9. collect tests in Arabic, Hebrew, Farsi and any other widely used
> languages which can use this, along with font instructions


Since I keep bringing this up, I figured I'd actually *show* what I'm
talking about. Attached are three files:

pashto.py: Pashto test script that prints the code points returned by
Pyfribidi, prints the original code points, and creates a PDF with both

pashto.pdf: the PDF created by the above script

pashto-correct.pdf: PDF created by Acrobat 9 on Windows from a Word
document with the same Pashto text

To see the issue, compare the top line of the two PDFs. Note how the
3rd and 8th words (from the right) aren't getting shaped properly by
Pyfribidi (I had our Pashto speakers confirm the Word version is
correct).
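
The comparison pashto.py makes can be approximated with a short sketch.
The attachment was scrubbed from the archive, so this is not the
original script: the helper names are made up here, the only Pyfribidi
call assumed is pyfribidi.log2vis, and the sample word is just an
arbitrary string containing U+0681, not the text from the PDFs.

```python
def code_points(text):
    """Format a string as space-separated U+XXXX code points."""
    return " ".join("U+%04X" % ord(ch) for ch in text)


def compare(text, shape):
    """Return (original, shaped) code point dumps for `text`.

    `shape` is any callable that maps a logical-order string to a
    visual-order string, e.g. pyfribidi.log2vis.
    """
    return code_points(text), code_points(shape(text))


if __name__ == "__main__":
    sample = "\u0681\u0627\u0646"  # illustrative string containing U+0681
    try:
        import pyfribidi  # optional; assumed to be installed separately
        shape = pyfribidi.log2vis
    except ImportError:
        shape = lambda s: s  # fallback: no reordering or shaping at all
    before, after = compare(sample, shape)
    print("original:", before)
    print("shaped:  ", after)
```

Any code point that shows up unchanged in the second dump, where its
neighbours have turned into FBxx/FExx presentation forms, is one that
was left unshaped.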

Pyfribidi bases its shaping on the decomposition mappings in the UCD:
ftp://ftp.unicode.org/Public/5.1.0/ucd/UnicodeData.txt

The unshaped character in the 3rd word is code point U+0681. No
presentation form in the UCD maps to that code point, so Pyfribidi does
no shaping on it.
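
One way to check this without reading UnicodeData.txt by hand is to scan
the Arabic presentation form blocks with Python's unicodedata module and
collect which base letters they decompose to. This is only a sketch: it
uses whatever Unicode version your Python build ships (not necessarily
5.1.0), and it is not how Pyfribidi builds its own tables.

```python
import unicodedata

# Arabic Presentation Forms-A and Arabic Presentation Forms-B blocks.
PRESENTATION_RANGES = [(0xFB50, 0xFDFF), (0xFE70, 0xFEFF)]
FORM_TAGS = ("<isolated>", "<final>", "<initial>", "<medial>")


def presentation_forms():
    """Map base code point -> {form tag: presentation code point}."""
    forms = {}
    for start, end in PRESENTATION_RANGES:
        for cp in range(start, end + 1):
            parts = unicodedata.decomposition(chr(cp)).split()
            # Keep only single-letter forms such as "<final> 0628";
            # ligatures decompose to several code points and are skipped.
            if len(parts) == 2 and parts[0] in FORM_TAGS:
                base = int(parts[1], 16)
                forms.setdefault(base, {})[parts[0]] = cp
    return forms


FORMS = presentation_forms()

for cp in (0x0628, 0x0681):
    found = FORMS.get(cp, {})
    print("U+%04X %s: %s" % (cp, unicodedata.name(chr(cp)),
                             sorted(found) or "no presentation forms"))
```

U+0628 is included only as a contrast case: it has all four forms, while
U+0681 should come back with none.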

However, the code point does support dual joining:
ftp://ftp.unicode.org/Public/5.2.0/ucd/ArabicShaping.txt
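
The joining type can be read out of that file with a few lines of
Python. The sketch below assumes the file's usual four-field layout
(code point; name; joining type; joining group). The sample lines are
retyped in that layout for illustration rather than copied from the
file, so the name field in particular may not match a given version.

```python
def parse_arabic_shaping(lines):
    """Return {code point: (joining type, joining group)}."""
    table = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) != 4:
            continue  # not a data line in the assumed layout
        code, _name, joining_type, joining_group = fields
        table[int(code, 16)] = (joining_type, joining_group)
    return table


# Illustrative sample only; feed the real file for actual use.
SAMPLE = """\
# code; name; joining type; joining group
0627; ALEF; R; ALEF
0628; BEH; D; BEH
0681; HAMZA ON HAH; D; HAH
"""

TABLE = parse_arabic_shaping(SAMPLE.splitlines())
print("U+0681 joining type:", TABLE[0x0681][0])
```

With a local copy of the file, parse_arabic_shaping(open("ArabicShaping.txt"))
should work the same way; "D" is the dual-joining type.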

This isn't a problem with Pyfribidi; modern shaping should be done by
glyph selection using OpenType information, which Pyfribidi clearly
can't do without knowing the target font:
http://www.microsoft.com/typography/otfntdev/arabicot/features.htm

Again, I haven't seen anything that does this in Python, but I think
this issue should be mentioned in any RL release that includes Pyfribidi,
so that people don't assume proper shaping for all languages.

--
Yoann Roman
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: pashto.py
Url: <http://two.pairlist.net/pipermail/reportlab-users/attachments/20091123/ed045cb9/attachment-0001.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pashto-correct.pdf
Type: application/pdf
Size: 18572 bytes
Desc: not available
Url : <http://two.pairlist.net/pipermail/reportlab-users/attachments/20091123/ed045cb9/attachment-0002.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pashto.pdf
Type: application/pdf
Size: 41952 bytes
Desc: not available
Url : <http://two.pairlist.net/pipermail/reportlab-users/attachments/20091123/ed045cb9/attachment-0003.pdf>

