[reportlab-users] Text with multiple languages.
Glenn Linderman
v+python at g.nevcal.com
Mon Oct 27 15:44:14 EDT 2014
On 10/26/2014 12:48 PM, Tim Butram wrote:
>
> Unfortunately, I found that generating pdfs with multiple languages
> was too difficult, and moved to using HTML and utf-8.
>
> A web/browser based solution worked really well for me.
>
> On Oct 26, 2014 12:15 AM, "Steve Young" <wereapwhatwesow at gmail.com> wrote:
>
> Hi Tim, I am working on a project with needs similar to those you
> mentioned. If you found answers, would you mind sharing?
>
> Thanks.
>
> On Thursday, July 31, 2014 2:37:08 PM UTC-5, Tim Butram wrote:
>
> I'm generating a document that has a wide variety of languages,
> from Arabic to English to Chinese. Unfortunately, I'm unable to
> find a single font that will allow me to print such a variety
> of characters. Additionally, I don't know in advance what
> language a given string is in, which makes it difficult to
> switch fonts based on the contents of the text.
>
> What are some solutions to this problem?
>
> Thanks,
> Tim
>
I had earlier questions on this list regarding multi-font PDF
generation. It seems that reportlab doesn't attempt to do font
substitution, but the browsers do.
I basically gave up at that time on the attempt to generate combined
Chinese & English text, because it would have had to be coded in
Python 2, and I was a beginning user of Python 3.
The general technique that seems to be necessary is to do your own font
selection based on incoming codepoints, and explicitly tell reportlab to
switch fonts when needed.
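
As a rough illustration of the "explicitly tell reportlab to switch fonts"
step, here is a minimal sketch that assumes the text has already been split
into (font name, text) runs. It only builds the inline <font name="...">
markup that reportlab's Paragraph accepts. The helper name runs_to_markup
and the font names "Latin" and "CJK" are hypothetical; real fonts would
have to be registered with reportlab under those names first (for example
with pdfmetrics.registerFont and TTFont).

```python
from xml.sax.saxutils import escape


def runs_to_markup(runs):
    """Turn [(font_name, text), ...] into Paragraph-style inline markup."""
    parts = []
    for font_name, text in runs:
        # Escape &, < and > so the text cannot be mistaken for markup.
        parts.append('<font name="%s">%s</font>' % (font_name, escape(text)))
    return "".join(parts)


# Hypothetical font names; they must match names registered with reportlab.
markup = runs_to_markup([("Latin", "Hello & "), ("CJK", "\u4f60\u597d")])
```

The resulting string could then be passed to a Paragraph together with a
style of your choice.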
One could investigate the ways that browsers do font substitution, since
several are open source.
Alternatively, one could do a Google search on the topic of font
substitution, and perhaps find academic writings or practical solutions
to the matter.
My best speculation without resorting to the above is as follows:
given a collection of scripts that are expected to arrive in UTF-8,
find a suitable font for use with each script. Determine the "coverage"
of each font with respect to codepoints. Choose a current font, and
select it for use. For each codepoint processed, if it is not covered by
the current font, choose as current a font that does include that
codepoint, and select it for use.
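
The speculation above could be sketched roughly as follows. This is not
tested against reportlab; it only splits a string into runs that each use
one font. The function name split_into_font_runs and the font names are
hypothetical, and the "coverage" sets are made-up codepoint ranges rather
than data read from real font files (one could presumably extract real
coverage from each font's character map).

```python
def split_into_font_runs(text, coverage, default_font):
    """Split text into (font_name, run) pairs, switching font only when needed.

    coverage maps a font name to the set of codepoints it can render.
    """
    runs = []
    current = default_font
    buffer = []
    for ch in text:
        cp = ord(ch)
        if cp not in coverage[current]:
            # Find the first font (in dict order) that covers this codepoint.
            candidate = next(
                (name for name, cps in coverage.items() if cp in cps), None
            )
            if candidate is not None:
                if buffer:
                    runs.append((current, "".join(buffer)))
                    buffer = []
                current = candidate
            # If no font covers it, stay with the current font.
        buffer.append(ch)
    if buffer:
        runs.append((current, "".join(buffer)))
    return runs


# Assumed, simplified coverage tables for three hypothetical fonts.
coverage = {
    "Latin": set(range(0x20, 0x250)),
    "Arabic": set(range(0x600, 0x700)) | {0x20},
    "CJK": set(range(0x4E00, 0xA000)),
}
runs = split_into_font_runs(
    "Hi \u0633\u0644\u0627\u0645 \u4e2d\u6587", coverage, "Latin"
)
```

Note that the space after the Arabic word stays in the Arabic run because
the assumed Arabic coverage includes it, so no font switch is forced.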
The fonts may overlap in their coverage, so you need to decide whether
to refine the above algorithm. One refinement is to prefer particular
fonts for particular codepoints and adjust your "coverage" tables
appropriately, rather than using everything that a font covers. Another
is an algorithm for choosing among several fonts that cover the current
codepoint, possibly by looking back or ahead to determine the length of
the run of characters that a particular font could cover.
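
One possible form of the look-ahead refinement, again only as a sketch
under assumptions: at each position, pick the font that covers the longest
upcoming run of characters, and break ties with an explicit preference
order. All names here (run_length, choose_font, split_with_lookahead) and
the coverage ranges are hypothetical.

```python
def run_length(text, start, codepoints):
    """Count consecutive characters from start that the font covers."""
    n = 0
    while start + n < len(text) and ord(text[start + n]) in codepoints:
        n += 1
    return n


def choose_font(text, start, coverage, preference):
    """Pick the font covering the longest run; earlier in preference wins ties."""
    best_name, best_len = None, 0
    for name in preference:
        length = run_length(text, start, coverage[name])
        if length > best_len:  # strict '>' keeps the preferred font on ties
            best_name, best_len = name, length
    return best_name, best_len


def split_with_lookahead(text, coverage, preference):
    runs, i = [], 0
    while i < len(text):
        name, length = choose_font(text, i, coverage, preference)
        if name is None:
            # No font covers this character; fall back to the preferred font.
            name, length = preference[0], 1
        if runs and runs[-1][0] == name:
            runs[-1] = (name, runs[-1][1] + text[i:i + length])
        else:
            runs.append((name, text[i:i + length]))
        i += length
    return runs


# Assumed coverage: the hypothetical CJK font also contains ASCII glyphs.
coverage = {
    "Latin": set(range(0x20, 0x250)),
    "CJK": set(range(0x4E00, 0xA000)) | set(range(0x20, 0x7F)),
}
preference = ["Latin", "CJK"]
lookahead_runs = split_with_lookahead("\u4e2d\u6587 ok", coverage, preference)
tie_runs = split_with_lookahead("ok", coverage, preference)
```

With these assumed tables, the mixed string stays in the CJK font as a
single run because that font covers all of it, while plain "ok" goes to
the preferred Latin font since both fonts cover it equally.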