[reportlab-users] Splitting paragraph in table cell on '-' as well as white space
    Richard Galka 
    rgalka at seccuris.com
       
    Wed Jan 19 11:40:08 EST 2011
    
    
  
Last month someone asked about breaking words or strings inside a table cell on special characters. We implemented this, and it works well in our environment.
To make long words wrap inside table cells, we subclassed the Paragraph class and modified its 'breakLines' method to break either on special characters or by looking the word up in a hyphenation dictionary file (www.gutenberg.org/ebooks/3204 in particular).
We are using ReportLab 2.4. The subclass adds two methods (breakOnSyntax and breakWithDictionary) and overrides Paragraph's __init__ and breakLines methods.
Hopefully this helps someone. Again, please remember that this is for ReportLab 2.4 and may not apply to other versions.
    def __init__(self, text, style,
                 bulletText = None,
                 frags = None,
                 caseSensitive=1,
                 encoding='utf8',
                 dictLoc=None,
                 breakOnSpecial=True):
        """
        Initialize the Paragraph structure and dictionary to use for breaking
        """
        self.breakOnSpecial = breakOnSpecial
        if dictLoc:
            self.breakOnDict=True
            self._dict = dictLoc
        else:
            self.breakOnDict=False
            self._dict = None
        Paragraph.__init__(self, text,style,bulletText,frags,caseSensitive,encoding)
    def breakOnSyntax(self, word, syntax=('-', '/', '\\')):
        """ Will split a word based on any symbol(s) identified
        Args:
            word: A string in ascii / utf to be split
            syntax: A sequence of characters on which the word may be split.
                    The sequence is priority based, and word will be split on
                    the first character identified
        Return:
            Returns a tuple composed of:
                The split words
                Character chosen to split words
        """
        for s in syntax:
            newwords = word.split(s)
            if len(newwords) > 1:
                # First matching separator wins
                return (newwords, s)
        # No separator found in the word
        return ([word], '')
    def breakWithDictionary(self, word, dictFile=None, dictBreakChar=None):
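        """ Will split a word at the break points listed in a dictionary file
        (breakLines later joins the pieces into two roughly equal halves)
        Args:
            word: A string in ascii / utf to be split
            dictFile: (optional) Path to a dictionary file identifying lexical
                      makeup; defaults to the path given to __init__
            dictBreakChar: (optional) Character marking break points in the file
        Return:
            Returns a list composed of
                The split words, or [word] if no entry was found
        """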
        if dictBreakChar is None:
            dictBreakChar = chr(165)  # Chr(165) used for Moby hyphenator II
        newwords = []
        if dictFile is None:
            dictFile = self._dict
        try:
            file = open(dictFile)
        except (IOError, TypeError):
            return [word]
        for line in file:
            line = line.strip()
            if word == line.replace(dictBreakChar, ''):
                #Logical break in the word identified
                splitword = line.split(dictBreakChar)
                for partialword in splitword:
                    newwords.append(partialword.replace(dictBreakChar,''))
                break
        if not newwords:
            # No logical break identified
            pass
        file.close()
        if newwords:
            return newwords
        else:
            return [word]
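For anyone who wants to try the two helpers without ReportLab, here is a minimal standalone sketch of the same idea. It is written for Python 3 rather than the Python 2 our ReportLab 2.4 code runs on, the function names are made up for illustration, and the dictionary is an in-memory list with invented entries. It assumes one word per line with chr(165) marking the break points.

```python
def break_on_syntax(word, syntax=("-", "/", "\\")):
    """Split word on the first separator in `syntax` that occurs in it."""
    for sep in syntax:
        parts = word.split(sep)
        if len(parts) > 1:
            return parts, sep
    return [word], ""


def break_with_dictionary(word, lines, break_char=chr(165)):
    """Find word in an iterable of hyphenation lines and return its pieces."""
    for line in lines:
        line = line.strip()
        if line.replace(break_char, "") == word:
            return line.split(break_char)
    return [word]


# Stand-in for the dictionary file; these entries are invented for the demo.
sample_lines = ["ta\xa5ble", "para\xa5graph"]

print(break_on_syntax("client-side/server"))
print(break_with_dictionary("paragraph", sample_lines))
```

Because '-' comes first in the priority list, "client-side/server" is split on '-' only, and the '/' stays inside the second piece.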
In the breakLines(self, width) method, we copied the original Paragraph method and modified it as shown below:
            ...  #Original Method code
            ...
            wordcnt=0
            for word in words:
                wordcnt=wordcnt+1
                newwords = []
                #Make a word array splitting long words
                wordWidth = pdfmetrics.stringWidth(word, fontName, fontSize, self.encoding)
                newWidth = currentWidth + spaceWidth + wordWidth
                if newWidth <= maxWidth:
                    ...# Original Method Code
                    ...
                else:
                    if (newWidth-currentWidth) > maxWidth:
                        syntaxuse = ''
                        #Break apart word if appropriate
                        if self.breakOnSpecial:
                            (newwords,syntaxuse) = self.breakOnSyntax(word)
                        elif self.breakOnDict:
                            newwords = self.breakWithDictionary(word)
                        if newwords and len(newwords)==1:
                            word = newwords[0]
                        elif newwords and len(newwords)>1:
                            #split into two words
                            # Below attempts to split into two equal sized words.
                            # TODO: Identify a better 'join' method using 'maxWidth' and font metrics.
                            while(len(newwords)>2):
                                if len(newwords[0]) < len(newwords[-1]):
                                    # Append beginning
                                    tmp = newwords.pop(0)
                                    newwords[0] = tmp+syntaxuse+newwords[0]
                                else:
                                    tmp = newwords.pop()
                                    newwords[-1] = newwords[-1]+syntaxuse+tmp
                            #Place newword on wordlist
                            words.insert(wordcnt, newwords[1])
                            word = newwords[0]+syntaxuse
                        wordWidth = pdfmetrics.stringWidth(word, fontName, fontSize, self.encoding)
                        newWidth = currentWidth + spaceWidth + wordWidth
                ...
                ... #Original Method code
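The while loop that reduces the pieces to two halves may be easier to follow in isolation. The sketch below pulls it out into a standalone function. The name merge_to_two is hypothetical and not part of our subclass. It balances by character count only, not by rendered width, which is the limitation the TODO above refers to.

```python
def merge_to_two(parts, sep=""):
    """Join pieces from whichever end is shorter until only two remain."""
    parts = list(parts)  # work on a copy so the caller's list is untouched
    while len(parts) > 2:
        if len(parts[0]) < len(parts[-1]):
            # Front piece is shorter: glue it onto its right-hand neighbour
            head = parts.pop(0)
            parts[0] = head + sep + parts[0]
        else:
            # Back piece is shorter or equal: glue it onto its left neighbour
            tail = parts.pop()
            parts[-1] = parts[-1] + sep + tail
    return parts


print(merge_to_two(["2011", "01", "19", "report"], "-"))
```

In breakLines the first half is then emitted with the separator appended, and the second half is inserted back into the word list so it is laid out on the next pass.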
Richard Galka
Secure Software Analyst
Seccuris Inc.