[reportlab-users] pyRXP vs ONIX DTD

Robin Becker reportlab-users@reportlab.com
Thu, 5 Dec 2002 20:01:49 +0000


In article <20021205121722.GA27836@codeworks.lt>, Marius Gedminas <marius@codeworks.lt> writes


At http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8 I see this table:

U-00000000 - U-0000007F: 0xxxxxxx
U-00000080 - U-000007FF: 110xxxxx 10xxxxxx
U-00000800 - U-0000FFFF: 1110xxxx 10xxxxxx 10xxxxxx
U-00010000 - U-001FFFFF: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
U-00200000 - U-03FFFFFF: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
U-04000000 - U-7FFFFFFF: 1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx

so I guess I could try to hack RXP to accept this table for any of the
high-end characters. I'm not sure I would want to do on-the-fly translation
for all files, though; somehow I had assumed that could not be done.
Can the Python stuff do this for us?
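
For what it's worth, the table above maps fairly directly onto a small
decoder. What follows is only a rough sketch in present-day Python 3, not
anything taken from RXP or pyRXP: the names LEAD_PATTERNS, decode_one and
decode_all are made up for illustration, and it assumes the input is a bytes
object. It does no checking for overlong forms or surrogates, which a real
parser would probably want.

```python
# One row per line of the table:
# (lead-byte mask, expected bits, sequence length, payload mask of lead byte)
LEAD_PATTERNS = [
    (0x80, 0x00, 1, 0x7F),  # 0xxxxxxx
    (0xE0, 0xC0, 2, 0x1F),  # 110xxxxx 10xxxxxx
    (0xF0, 0xE0, 3, 0x0F),  # 1110xxxx + 2 continuation bytes
    (0xF8, 0xF0, 4, 0x07),  # 11110xxx + 3 continuation bytes
    (0xFC, 0xF8, 5, 0x03),  # 111110xx + 4 continuation bytes
    (0xFE, 0xFC, 6, 0x01),  # 1111110x + 5 continuation bytes
]


def decode_one(data, pos=0):
    """Decode one code point starting at data[pos].

    Returns (code_point, next_pos). Hypothetical helper, illustration only.
    """
    lead = data[pos]
    for mask, bits, length, payload in LEAD_PATTERNS:
        if (lead & mask) == bits:
            break
    else:
        # 10xxxxxx, 0xFE and 0xFF are not valid lead bytes
        raise ValueError("invalid lead byte 0x%02X at offset %d" % (lead, pos))
    if pos + length > len(data):
        raise ValueError("truncated sequence at offset %d" % pos)
    code_point = lead & payload
    for byte in data[pos + 1:pos + length]:
        if (byte & 0xC0) != 0x80:
            raise ValueError("bad continuation byte 0x%02X" % byte)
        # each continuation byte contributes its low 6 bits
        code_point = (code_point << 6) | (byte & 0x3F)
    return code_point, pos + length


def decode_all(data):
    """Decode a whole bytes object into a list of integer code points."""
    result = []
    pos = 0
    while pos < len(data):
        code_point, pos = decode_one(data, pos)
        result.append(code_point)
    return result
```

For input that stays within U+10FFFF, Python's own codec does the same job:
data.decode("utf-8") returns the text directly. Unlike the sketch, it rejects
the 5- and 6-byte forms in the last two rows of the table.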
>(I'm not sure that's a good idea; Unicode is 20.1 bits wide, and UTF-16
>combines all the disadvantages of both UTF-8 and UTF-32.)
>
>Marius Gedminas

-- 
Robin Becker