[reportlab-users] Platypus & unicode
Andrew Smart
smart at smart-knowhow.de
Tue Nov 13 19:18:19 EST 2007
Hi folks,
this is probably a common question, but I haven't found a clear answer in the
various resources.
It is probably a misunderstanding on my side.
I use Platypus, and I'm reading text files from disk. Those text files
are encoded in cp1252.
I load the text files into memory line by line, and I make sure that every
single string is converted correctly to unicode, using the
text = unicode(text, "cp1252")
statement.
When I feed these strings into the Platypus framework I get unicode decode
errors on various occasions. Looking at the sources, I found that the
various .split() and .join() calls create "str" strings out of my
unicode strings. Those strings are then decoded back into unicode using the
"utf-8" encoding, and that is where the conversion breaks.
Obviously, since my encoding is based on cp1252.
Argh.
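To make the failure concrete, here is a minimal sketch of the round trip as I understand it. It is not taken from the Platypus sources. It is written in Python 3 syntax, where the byte string type is "bytes" rather than "str", and the sample text is made up for illustration.

```python
# Bytes as they would sit in a cp1252 file (hypothetical sample text).
# The escapes spell "Größe: 5 €"; the euro sign is the single byte 0x80 in cp1252.
raw = "Gr\u00f6\u00dfe: 5 \u20ac".encode("cp1252")

# Decoding with the codec the file was written in works.
text = raw.decode("cp1252")

# Decoding the same bytes as UTF-8 fails, because 0xF6 ("ö" in cp1252)
# is not a valid start byte of a UTF-8 sequence.
try:
    raw.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True

print(repr(text), failed)
```

So the error does not depend on how the unicode string was created. It appears as soon as bytes in one encoding are decoded with another.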
What I understand: because of the unicode-str-unicode conversions inside
Platypus, it is not the best idea to start with "cp1252".
Right?
Here my misunderstanding starts... I thought that the internal
unicode representation is independent of the encoding which is used to
store the strings as byte sequences or, e.g., in files. So splitting a
cp1252-encoded string "internally" inside Python routines should create
"str" strings which can be join()ed and en/decoded back into unicode without
hassle...
But never mind.
Simple question: do I have to use utf-8 "coded" strings as input for
Platypus?
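If the answer is yes, the conversion might look like the sketch below. It assumes that Platypus accepts UTF-8 encoded byte strings, which is exactly what I am asking and have not confirmed. The helper name cp1252_to_utf8_lines is made up for illustration, and the code uses Python 3 syntax.

```python
import io

def cp1252_to_utf8_lines(stream):
    """Read a binary cp1252 stream and return its lines as UTF-8 byte strings.

    Hypothetical helper, not part of ReportLab.
    """
    # Decode once with the codec the file was actually written in.
    text = stream.read().decode("cp1252")
    # Re-encode each line as UTF-8 so that a later decode("utf-8") succeeds.
    return [line.encode("utf-8") for line in text.splitlines()]

# Stand-in for a cp1252 file on disk; the escapes spell "Straße" and "Café".
sample = io.BytesIO("Stra\u00dfe\nCaf\u00e9\n".encode("cp1252"))
lines = cp1252_to_utf8_lines(sample)
print(lines)
```

A real file opened in binary mode could be passed in place of the BytesIO object.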
Any pointers are greatly appreciated :-))
Andrew