[reportlab-users] utf-problem

"Vázquez Cascales, Sebastián" reportlab-users@reportlab.com
Fri, 26 Mar 2004 11:45:48 +0100


Hi all,

I'm new to Plone development and I have a problem with ReportLab. One of Plone's new features includes a utility that uses ReportLab to generate a PDF from an element (a kind of document). This works fine in English, but crashes when I try Spanish.

The original code is:

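from types import UnicodeType  # Python 2 only

def utf8(text):
    """ Unicode -> UTF8 """
    # Accept only Unicode objects; byte strings fail the assertion.
    assert isinstance(text, UnicodeType)
    return text.encode('utf-8')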

The error was: Invalid UTF-8 string


I looked through your mailing list and found what I thought was the solution.

(quoted literally)
----------------------------------------------------------------------------------------------------------------------------
Unicode strings are not present in Python 1.5.2.  With 2.0 or later you
can do

  c.drawString(100, y, any_unicode_string.encode("UTF-8"))

I'm not sure what happens if you write u"text in some 8-bit encoding" --
what encoding will Python choose to interpret those 8-bit characters?
To be safe you could write

  c.drawString(100, y, unicode("Latin-1 text", "ISO-8859-1").encode("UTF-8"))

It might be easier to use UTF-8 directly.

I wonder... is there a way to support Unicode string literals on Python
2+ in a way that's compatible with Python 1.5.2?  Basically you'd need
to change parse_utf8() in ttfonts.py to something like

  if type(string) is UnicodeType:
      return map(ord, string)

and above that declare

  try:
      from types import UnicodeType
  except ImportError:
      # Old Python has no Unicode strings
      UnicodeType = None

Not tested, but you get the idea.  Then you could just pass Unicode
strings to drawString if you have Python 2+.

(Platypus would need a couple of additional lines to cope with Unicode
strings.)


Marius Gedminas
----------------------------------------------------------------------------------------------------------------------------
So I changed the function to:

def utf8(text):
    """ Unicode -> UTF8 """
    return unicode(text, "ISO-8859-1").encode('utf-8')

But now the error is: decoding Unicode is not supported
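
One possible reading of this second error, offered as a sketch rather than a confirmed diagnosis: Python raises "decoding Unicode is not supported" when unicode() is called with an encoding on an object that is already Unicode, so at least some of the text seems to arrive as Unicode. Assuming the function may receive either Unicode objects or Latin-1 byte strings (an assumption, not something verified against the Plone code), a helper could branch on the type. The name utf8_any and the parameter fallback_encoding are hypothetical and not part of Plone or ReportLab.

```python
# Sketch only: utf8_any and fallback_encoding are hypothetical names,
# not part of Plone or ReportLab.
try:
    text_type = unicode      # Python 2: the Unicode string type
except NameError:
    text_type = str          # Python 3: str is already Unicode


def utf8_any(text, fallback_encoding="ISO-8859-1"):
    """Return UTF-8 bytes for either Unicode text or an 8-bit string."""
    if isinstance(text, text_type):
        # Already Unicode: only encode; decoding it again is what fails.
        return text.encode("utf-8")
    # 8-bit string: decode with the assumed encoding, then re-encode.
    return text.decode(fallback_encoding).encode("utf-8")
```

With this sketch, a Unicode value such as u"Sebasti\u00e1n" and the same name as a Latin-1 byte string should both come out as identical UTF-8 bytes. The ISO-8859-1 fallback is a guess; if the byte strings are in another encoding, it would need to change.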

I've been searching and found http://lists.fourthought.com/pipermail/4suite/2001-February/001335.html, but I'm not sure I understand all of it,

and I don't know what else I can do.

I've tried ReportLab 1.18 and 1.19, with Python 2.3.3.

I can provide more information if needed.

Thanks in advance.



Sebastián Vázquez Cascales
SADIEL, S.A.
Pabellón de Portugal
Isaac Newton s/n
Isla de la Cartuja
41092 SEVILLA
Tel. 955 04 36 00
Fax: 955 04 36 01
http://www.sadiel.es/ e-mail: svazquez@sadiel.es