[reportlab-users] Hebrew Support Patch

Moshe Wagner moshe.wagner at gmail.com
Mon Jun 8 14:45:26 EDT 2009


OK, I'll try to make things a bit clearer.

First - fribidi (and therefore pyfribidi) is an "implementation of the
Unicode Bidirectional Algorithm (bidi)."
This 'Unicode Bidirectional Algorithm' does not apply to RTL text
only. It applies to all text whatsoever. The really correct way to
display *any* text, just in case it *might* contain RTL characters,
is to apply this algorithm to it.

So, ideally, my patch should not have any effect on users using only
LTR text, even if they have pyfribidi installed and the algorithm is
applied to all text.

But, there is one case where using the algorithm correctly might
cause problems.
When a paragraph is aligned to the right, even LTR text *should* be
treated as if it is part of an RTL paragraph, and therefore any sign
at the end of the line, e.g. a period, should be at the left, which is
actually the line's real end (since the paragraph starts from the
right).
This may be the correct thing, but it's probably not what the users expect.
( In any case, this will not happen if pyfribidi isn't installed, so
it isn't so bad. )
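
To illustrate the effect described above, here is a toy sketch (not
fribidi, and not the full bidi algorithm). It assumes a line that contains
only LTR letters and neutral characters, with no digits and no mirrored
characters such as brackets. The function name is made up for this example.

```python
import unicodedata


def show_ltr_line(text, rtl_paragraph):
    """Toy model of the visual order of a line of LTR letters and neutrals."""
    if not rtl_paragraph:
        # In an LTR paragraph, plain LTR text is displayed as typed.
        return text
    # Find the end of the last strong left-to-right character.
    end = len(text)
    while end > 0 and unicodedata.bidirectional(text[end - 1]) != "L":
        end -= 1
    body, trailing = text[:end], text[end:]
    # Trailing neutrals take the paragraph direction, so they are drawn
    # at the left, which is the visual end of an RTL line.
    return trailing[::-1] + body


print(show_ltr_line("Hello world.", rtl_paragraph=False))
print(show_ltr_line("Hello world.", rtl_paragraph=True))
```

With an RTL paragraph the period is drawn before the "H", which is the
behaviour LTR-only users would probably find surprising.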

I could disable this of course, and tell it to treat any paragraph as
an LTR one, but that will not be the right behavior for RTL or mixed
texts.

So it might actually make sense to use a special RTL type for RTL
paragraphs, so that when it is given, the algorithm will be used
correctly.

And since, as I said, we need a way anyway to determine the type of
the paragraph for the last line of fill justified paragraphs (deciding
whether it should be at the left or the right), this might be an even
better idea.

Is this the method I should go for?


Anyway, none of this should cause any problems when drawing a single
line, since there the direction of the text is decided by the
first characters of the text, and not by the alignment.

Moshe


To make things clearer I attached the latest test archive, and here is
the whole latest patch:

This function should be added to "paragraph.py" before the 'wrap'
function, but in the same class:
###############################
# Guesses the direction the given text should have (LTR or RTL), for
# cases where it can't be decided by its alignment
def guessBaseDirection(self, s):
    # Since pyfribidi doesn't have an option to return fribidi's guess,
    # I have to find out its guess in a very ugly way.

    # This adds a neutral sign to the given text.
    # Then the text is mirrored, letting fribidi guess its direction.
    # If it's RTL text, the added sign will now become the first
    # character of the text,
    # while if it's LTR the sign will stay at the end.
    import pyfribidi

    s += '.'
    s = pyfribidi.log2vis(s, pyfribidi.ON)

    if s[0] == ".":
        return pyfribidi.RTL
    else:
        return pyfribidi.LTR
###############################
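
For comparison, here is a sketch of a different way to make the same guess
without going through log2vis: look for the first strong character, which
is roughly what the bidi algorithm itself does when no base direction is
given. This is not part of the patch. It uses only the standard
unicodedata module, it needs decoded (unicode) text rather than UTF-8
bytes, and it ignores explicit embedding and isolate controls. The function
name and the "RTL"/"LTR" return values are made up for the example.

```python
import unicodedata


def guess_base_direction(text):
    """Return "RTL" or "LTR" from the first strong character of text."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            # Hebrew letters are class R, Arabic letters are class AL.
            return "RTL"
        if bidi_class == "L":
            return "LTR"
    # No strong character at all (empty, digits, punctuation): assume LTR.
    return "LTR"


print(guess_base_direction(u"\u05e9\u05dc\u05d5\u05dd"))  # Hebrew word
print(guess_base_direction(u"Hello"))
```

Unlike the trailing-period trick, this does not depend on what the text
starts with, as long as it contains at least one letter.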

And this should be added to the "wrap" function, right after the call
to "self.breakLines":
######################
# Hebrew text patch, Moshe Wagner, June 2009
# <moshe.wagner at gmail.com>

# This code fixes paragraphs with RTL text.

# It does it by flipping each line separately
# (depending on the type of the line).

# If fribidi can't be imported, it does nothing.
# Plain LTR texts will not be affected in any case.
try:
    import pyfribidi
except ImportError:
    import sys
    sys.stderr.write("Fribidi module not found; You will not "
                     "have RTL support for this paragraph\n")
else:

    # First, the base direction given to pyfribidi must be decided.
    # In left or right aligned paragraphs, it's decided by their alignment.

    # For now, there is only one type of fill justified paragraphs.
    # So even though it acts like a left justified one,
    # we cannot assume that's the alignment the text should have.
    # So the direction is guessed by the first character of the text.

    if self.style.alignment == TA_LEFT:
        direction = pyfribidi.LTR
    elif self.style.alignment == TA_RIGHT:
        direction = pyfribidi.RTL
    else:
        # Get the first word of the text:
        first = blPara.lines[0]
        text = ""
        if isinstance(first, (FragLine, ParaLines)):
            text = first.words[0].text
        elif isinstance(first, tuple) and first[1]:
            # The second item of the tuple is the list of words
            text = first[1][0]

        if len(text) < 2:
            # This must be English, because a Hebrew character
            # takes up 2 bytes in a UTF-8 string
            direction = pyfribidi.LTR
        else:
            # Guess the direction from the first character:
            direction = self.guessBaseDirection(text[:2])

    for line in blPara.lines:
        if isinstance(line, (FragLine, ParaLines)):
            # When the line is a FragLine or ParaLines, the
            # text attribute of each of its words is flipped.
            # Then the order of the words is flipped too,
            # so that 2 word parts on the same line
            # will be in the right order.

            for word in line.words:
                word.text = pyfribidi.log2vis(word.text, direction)

            line.words.reverse()

        elif isinstance(line, tuple):
            # When the line is just a tuple whose second value is the text:
            # since I couldn't directly change its value,
            # it's done by merging the words, flipping them,
            # and re-entering them one by one into the second item.

            s = ' '.join(line[1])
            s = pyfribidi.log2vis(s, direction)
            line[1][:] = s.split()
        else:
            print(line.__class__.__name__)
######################

And this is added to the "canvas.py" file, right at the beginning of
the "drawString" function:
######################
# Hebrew text patch, Moshe Wagner, June 2009
# <moshe.wagner at gmail.com>

# Flips the given text with pyfribidi, if it's needed (i.e. Hebrew or Arabic).
# If it could not be imported, it does nothing.
# Plain LTR texts will not be affected in any case.
try:
    import pyfribidi
    text = pyfribidi.log2vis(text, base_direction=pyfribidi.ON)

except ImportError:
    import sys
    sys.stderr.write("Fribidi module not found; You will not have RTL "
                     "support for this paragraph\n")
#####################
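
Both snippets repeat the same try/import on every call. One possible way to
tidy that up, sketched here only as an idea and not as part of the patch,
is to try the import once and hide the fallback in a small helper. The
helper name to_visual is made up; the log2vis call is the same one used in
the drawString snippet.

```python
try:
    import pyfribidi
except ImportError:
    # Optional dependency: without it, text is passed through untouched.
    pyfribidi = None


def to_visual(text):
    """Reorder text for display if pyfribidi is available."""
    if pyfribidi is None:
        return text
    # ON lets fribidi guess the base direction from the text itself.
    return pyfribidi.log2vis(text, base_direction=pyfribidi.ON)


print(to_visual("Hello"))
```

Plain LTR text should come back unchanged whether or not pyfribidi is
installed, so callers would not need their own ImportError handling.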
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Report Lab RTL Test.tar.gz
Type: application/x-gzip
Size: 100254 bytes
Desc: not available
Url : <http://two.pairlist.net/pipermail/reportlab-users/attachments/20090608/29186f30/attachment-0001.bin>

