[reportlab-users] BUGFIX: Re:   in paragraph

Dirk Holtwick dirk.holtwick at gmail.com
Fri Dec 5 04:42:22 EST 2008



> With your original version I found some slight issues related to
> multiple space chars resulting in null elements. Can you try this
> version for size? It basically just adds a + after the charset in the re
> so that u'a\xa0b\n\n\nc' splits in 2 elements not 4.
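
For anyone following along, the effect of the + can be reproduced with a small sketch. The character class here is a shortened stand-in that I am assuming for illustration, not the actual whitespace table from the patch, and it is written for Python 3:

```python
import re

# Assumed, shortened whitespace class for illustration only. It leaves out
# the non-breaking space '\xa0' on purpose, so that it is never a split point.
split_single = re.compile(r'[ \t\n\r\f\v]').split    # without "+"
split_runs = re.compile(r'[ \t\n\r\f\v]+').split     # with "+"

text = 'a\xa0b\n\n\nc'

single = split_single(text)   # every newline is its own split point
runs = split_runs(text)       # the run of newlines is one split point

print(len(single))  # 4 elements, two of them empty strings
print(len(runs))    # 2 elements
```

Without the + each of the three newlines separates a (possibly empty) element; with it the whole run counts as one separator.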


I tested it and it works fine. Another suggestion: stop testing for
"\x0a" altogether, so that the usual cases also profit from the more
elaborate whitespace table. Here is my modification:

-----------------8<---------------[cut here]
def split(text, delim=None):
    # Work on unicode internally; byte strings are assumed to be UTF-8.
    if type(text) is str:
        text = text.decode('utf8')
    if type(delim) is str:
        delim = delim.decode('utf8')
    elif delim is None:
        # No delimiter given: always split with the whitespace regex.
        return [uword.encode('utf8') for uword in _wsc_re_split(text)]
    return [uword.encode('utf8') for uword in text.split(delim)]
-----------------8<---------------[cut here]
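
To try the function outside of ReportLab, something like the following rough Python 3 sketch should do. Note that _wsc_re_split below is only an assumed stand-in with a shortened character class (the real table covers more whitespace characters), and bytes takes the place of the Python 2 str type:

```python
import re

# Assumed stand-in for _wsc_re_split: splits on runs of ordinary whitespace
# and deliberately does not treat the non-breaking space '\xa0' as whitespace.
_wsc_re_split = re.compile(r'[ \t\n\r\f\v]+').split

def split(text, delim=None):
    # Decode UTF-8 byte strings so that splitting happens on characters.
    if isinstance(text, bytes):
        text = text.decode('utf8')
    if isinstance(delim, bytes):
        delim = delim.decode('utf8')
    elif delim is None:
        # No delimiter: use the whitespace regex for every input.
        return [w.encode('utf8') for w in _wsc_re_split(text)]
    return [w.encode('utf8') for w in text.split(delim)]

# The non-breaking space survives inside the first word.
words = split('a\xa0b  c\n\nd'.encode('utf8'))
print(words)

# An explicit delimiter goes through the plain split() path.
fields = split(b'x,y', b',')
print(fields)
```

For comparison, a plain 'a\xa0b'.split() on a unicode string also breaks at the non-breaking space, which is why the regex is used for the no-delimiter case.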

Dirk

