[reportlab-users] speeding up parse_utf8?

Robin Becker reportlab-users@reportlab.com
Mon, 13 Oct 2003 13:02:49 +0100


Andy suggested speeding up ttfonts by using the built-in codecs module to improve parse_utf8,

e.g.

try:
    import codecs
    parse_utf8=lambda x, decode=codecs.lookup('utf8')[1]: map(ord,decode(x)[0])
    del codecs
except ImportError:
    def parse_utf8(string):
       ......
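For reference, a minimal sketch of what the one-liner relies on (written in modern Python 3 syntax, which is an assumption; the code above targets the Python 2 of the day): codecs.lookup('utf8') returns a codec tuple whose second element is the decode function, and that function returns a (decoded_string, bytes_consumed) pair.

```python
import codecs

# codecs.lookup('utf8')[1] is the stateless decode function;
# it returns (decoded_string, number_of_bytes_consumed).
decode = codecs.lookup('utf8')[1]

def parse_utf8(data):
    # decode(data)[0] is the decoded text; ord() yields each code point
    return [ord(c) for c in decode(data)[0]]

assert parse_utf8('abc'.encode('utf8')) == [97, 98, 99]
assert parse_utf8('\u00e9'.encode('utf8')) == [0xE9]
```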

but my tests with this code

#########################
from time import time
import codecs
from reportlab.pdfbase.ttfonts import parse_utf8
nparse_utf8=lambda x, decode=codecs.lookup('utf8')[1]: map(ord,decode(x)[0])
assert nparse_utf8('abcdefghi')==parse_utf8('abcdefghi')

for fn in (parse_utf8,nparse_utf8):
        t0 = time()
        for i in xrange(500):
                map(fn,i*'abcdefghi')
        print str(fn), time()-t0
#########################

show that Marius' existing code is faster.

C:\Python\reportlab\test>\tmp\ttt.py
<function parse_utf8 at 0x00843260> 22.7929999828
<function <lambda> at 0x00911CA0> 26.0670000315
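A rough re-run of the comparison can be sketched in modern Python 3 with timeit (an assumption; the numbers above came from Python 2, and reportlab's actual parse_utf8 is stood in for here by a hypothetical hand-rolled byte-walking decoder):

```python
import codecs
import timeit

def manual_parse_utf8(data):
    """Hypothetical stand-in for a hand-written parser: walk the
    UTF-8 bytes and accumulate code points (assumes valid input)."""
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            cp, extra = b, 0          # 1-byte sequence (ASCII)
        elif b < 0xE0:
            cp, extra = b & 0x1F, 1   # 2-byte sequence
        elif b < 0xF0:
            cp, extra = b & 0x0F, 2   # 3-byte sequence
        else:
            cp, extra = b & 0x07, 3   # 4-byte sequence
        for j in range(1, extra + 1):
            cp = (cp << 6) | (data[i + j] & 0x3F)  # fold in continuation bytes
        out.append(cp)
        i += extra + 1
    return out

# The codecs-based version from the suggestion above.
decode = codecs.lookup('utf8')[1]
def codec_parse_utf8(data):
    return [ord(c) for c in decode(data)[0]]

sample = ('abcdefghi' * 100).encode('utf8')
assert manual_parse_utf8(sample) == codec_parse_utf8(sample)

for fn in (manual_parse_utf8, codec_parse_utf8):
    t = timeit.timeit(lambda: fn(sample), number=2000)
    print(fn.__name__, t)
```

Note that the loop in the original test calls fn on each single character of i*'abcdefghi' via map, so it largely measures per-call overhead on one-byte inputs; the sketch above times whole-string decodes instead.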


I thought these decoders were supposed to be very fast. 
-- 
Robin Becker