[reportlab-users] Incorrect character composition

Wed Apr 15 04:48:39 EDT 2015

On 4/15/2015 1:19 AM, Andy Robinson wrote:
> I guess if anyone is to blame for this, it's me as ReportLab's founder.

I'm not looking to blame anyone, just trying to understand if there is 
or will be a solution to a problem.

> The closest we got to even half-understanding the problem was about 6
> years ago when an Arabic-speaking employee with a little knowledge of
> Farsi took a look.  Unfortunately we are not rendering raster graphics
> on screen.  We are trying to work out the right font descriptors and
> sequences of bytes to put in the PDF file so that the right stuff
> magically happens on screen.  When I did that with Japanese in about
> 2002-2003, with the advantages that (a) I can read and write the
> language and (b) there is no special layout at all, it still took a
> month of reverse-engineering other people's PDFs.  Not knowing any
> of these languages, it's probably a big job, and we have not had any
> volunteers from the open source community, nor any customers willing
> to pay for the R&D.

While I ran into the issue with Latin-based languages that use 
diacritical marks unusual among Latin-based languages, I suppose you are 
right that support for Arabic and Hebrew (and Thai, which I think is one 
of the heaviest users of combining characters) is required for a general 
solution. Obviously, supporting all those languages makes the problem 
much larger than supporting LTR Latin text with properly positioned 
diacritical marks. Vietnamese is another language of interest in that 
smaller subset, though, as it uses two diacriticals on many vowels, 
whereas the languages I'm dealing with so far only have one, but on 
some uncommon "extended Latin" characters. Vietnamese may hit me 
someday, but it hasn't yet.
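
One rough way to gauge how much of a given text really depends on 
combining-character positioning is to normalize it to NFC and see which 
marks are left over. The sketch below uses only Python's standard 
unicodedata module; the function name is hypothetical, and it says 
nothing about whether a particular font contains the precomposed glyph 
or how ReportLab would render it.

```python
import unicodedata

def residual_combining_marks(text):
    """Return the combining marks still present after NFC normalization.

    Hypothetical helper: an empty result means every base+mark sequence
    had a precomposed code point; anything returned would still need
    real mark positioning by the renderer.
    """
    composed = unicodedata.normalize("NFC", text)
    # unicodedata.combining() returns a nonzero class for combining marks
    return [ch for ch in composed if unicodedata.combining(ch)]

# "e" + COMBINING ACUTE ACCENT composes to U+00E9, so nothing is left over
print(residual_combining_marks("e\u0301"))

# "q" + COMBINING ACUTE ACCENT has no precomposed form, so the mark remains
print(residual_combining_marks("q\u0301"))
```

Text for which this returns an empty list can at least be passed along as 
precomposed characters; the rest is where the positioning problem shows up.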

> I don't think it's a performance issue like kerning.  I would
> sincerely hope that one just has to put the right byte sequences into
> the PDF and that the font sorts it out for you.

After raising the issue of kerning, I found there is some limited 
third-party support for kerning outside of platypus, which might be 
sufficient for the type of international typesetting I'm doing. But it 
seems to me that even if proper kerning is a performance issue, if it 
can be turned on and off, then people who care could get good results, 
and people who don't could have fast results. Right now, people who 
care don't have a solution at all (in Python, for generating PDFs).

Still, without combining character support, and if there is no 
"quick fix" (which I wouldn't expect short of solving the general 
problem), I'll likely have to look at other solutions, and probably 
even other languages, for generating my PDFs. I would regret that, 
because (1) I'm already using reportlab for my current languages, 
(2) it has a nice design, and (3) I like coding in Python.

> If anyone here is willing to help have a crack at it, stick their
> hands up and I can suggest a general approach.  We have done some
> work in the past month updating the pyfribidi extension to compile on
> Python 2.7, 3.3 and 3.4, which is a prerequisite for anything here.  I
> think we probably need to crack Arabic and Hebrew as a first step.
>
> - Andy
