[reportlab-users] Incorrect character composition

Fri May 29 00:34:57 EDT 2015

On 5/28/2015 7:08 PM, Glenn Linderman wrote:
> On 4/15/2015 10:46 PM, Glenn Linderman wrote:
>> There are, I think, 4 issues, the first two of which I could 
>> definitely use if implemented, and which sound relatively easy, but 
>> likely have performance impact. They would enable _higher quality 
>> typesetting_ of Latin-based text into PDF files. The others could be 
>> hard, but would be required to support a wider range of languages 
>> with non-Latin fonts.  I did read something recently about Micro$oft 
>> producing a font layout system (but they used a different word in the 
>> article that I cannot come up with right now) for all the various 
>> needs of different language systems... The closest thing I can find 
>> with Google right now is their DirectWrite, but whether it 
>> incorporates the technology I read about, I couldn't say, but maybe 
>> it does or will. I don't recall if this was something they were 
>> making generally available to make the world's typography improve, or 
>> if it was a proprietary come-on to promote/improve Windows. It 
>> sounded pretty general, language-wise.
>>
>>  1. kerning
>>  2. composite glyph positioning
>>  3. Languages with huge numbers of ligatures, where characters appear
>>     differently, even to the point of requiring different glyphs, at
>>     the beginning or end of words (Arabic) or adjacent to other
>>     letters (Thai).
>>  4. RTL languages.
>>
>>
>> 1. kerning
>>
>> My research into kerning is below, since it was somewhat productive. 
>> Most of it was on this list. I have not had time to research 
>> composite glyph positioning, which
>
> Seems I forgot to finish this sentence in the original email, as the 
> next thing was a different paragraph. And I won't now either, because 
> I don't know what I was going to say.
>
> In what I said in the first paragraph about M$ producing a font layout 
> system, here is the link where you can read what I read about that, 
> which I finally found again.
>
> http://blogs.windows.com/bloggingwindows/2015/02/23/windows-shapes-the-worlds-languages/
>
> This sort of technology and complexity would be required for items 3 & 
> 4 in my complexity list. Probably it also handles 1 & 2, but they 
> would be the simple cases.
>
> There are some interesting statistics in the article about numbers of 
> languages that require shaping engine support, as well as a pointer to 
> the full specification for the Universal Shaping Engine, which is 
> somewhat eye-glazing, and yet only an overview of the categorizations 
> done, not containing any implementation hints that I could see.
>
> I've no idea how many languages are supported by Cairo, say, which is 
> the graphics rendering system used by Firefox (and likely open 
> source), nor how well. Cairo does handle all the cases I've mentioned.

Hmm. Some more Googling found references to OpenType/Pango/HarfBuzz; 
start here <http://en.wikipedia.org/wiki/Pango>. They claim to be what 
Firefox uses (rather than Cairo? In addition to Cairo?). This Pango 
presentation <http://fishsoup.net/bib/PangoIuc25-slides.pdf> claims that 
Pango transforms UTF-8 text into positioned glyphs (exactly what is 
needed for PDF files, it would seem); maybe Cairo is then the "how to 
render positioned glyphs to graphics images" part (exactly what PDF 
viewers do, it would seem). All this stuff seems to be open source. I'm 
not sure whether the HarfBuzz shaping engine supports all the languages 
that M$'s Universal Shaping Engine does... This old blog post 
<http://mces.blogspot.in/2009/11/pango-vs-harfbuzz.html> contrasts what 
Pango and HarfBuzz are and aren't, and how they can work together. No 
doubt things have changed somewhat since then.
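
To make "text in, positioned glyphs out" concrete for the simplest case 
(item 1, kerning), here is a rough sketch in plain Python. It is not 
ReportLab, Pango, or HarfBuzz code: the function name position_glyphs, 
the ADVANCES and KERNING tables, and every number in them are made up 
for illustration. A real implementation would read advance widths and 
kerning pairs from the font file, and items 2 through 4 need far more 
than a pair lookup.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical metrics in 1/1000 em units; real values come from the font.
ADVANCES: Dict[str, int] = {"A": 667, "V": 667, "T": 611, "o": 500}
# Hypothetical kerning pairs: adjustment applied between two adjacent glyphs.
KERNING: Dict[Tuple[str, str], int] = {
    ("A", "V"): -80,
    ("V", "A"): -80,
    ("T", "o"): -70,
}
DEFAULT_ADVANCE = 500  # assumed fallback width for characters not in ADVANCES


def position_glyphs(text: str, font_size: float) -> List[Tuple[str, float]]:
    """Return (character, x offset in points) pairs with pair kerning applied."""
    scale = font_size / 1000.0
    x = 0.0
    prev: Optional[str] = None
    positioned: List[Tuple[str, float]] = []
    for ch in text:
        if prev is not None:
            # Shift this glyph by the kerning adjustment for (prev, ch), if any.
            x += KERNING.get((prev, ch), 0) * scale
        positioned.append((ch, x))
        # Advance the pen by the glyph's own width.
        x += ADVANCES.get(ch, DEFAULT_ADVANCE) * scale
        prev = ch
    return positioned


if __name__ == "__main__":
    for ch, x in position_glyphs("AVATo", 12):
        print(f"{ch!r} at x = {x:.2f} pt")
```

Each (character, x) pair is the sort of thing a PDF writer would need 
in order to place glyphs individually rather than emitting the whole 
string with default spacing. This treats one character as one glyph, 
which is exactly the assumption that breaks down for ligatures, 
composite glyphs, and RTL text.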