[reportlab-users] Re: Platypus XML markup suggestion

Christoph Zwerschke reportlab-users@reportlab.com
Mon, 26 May 2003 13:54:02 +0200


> OK I think I added all that in to CVS. Can you check to see if I
> misunderstood your intent?

Looks very good.

I just noticed that I made a mistake with thetasym and sigmaf: they have to
be assigned to the uppercase letters (as an exception to the rule), i.e.

----------

change
977:'j', # thetasym
to
977:'J', # thetasym

change
'thetasym': 'j',
'thetav': 'j',
to
'thetasym': 'J',
'thetav': 'J',

change
962:'v', # sigmaf
to
962:'V', # sigmaf

change
'sigmaf': 'v',
to
'sigmaf': 'V',

also change
981:'f', # phiv
to
981:'f', # phis
(to avoid confusion)
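
As a quick sanity check, here is a minimal sketch of how the affected
entries would look after these changes. The names symenc and greeks are the
ones used in the code discussed in this thread; the two fragments below are
only excerpts containing the entries touched above, not the full tables.

```python
# Excerpt of the code-point table (symenc) after the fix.
symenc = {
    962: 'V',  # sigmaf
    977: 'J',  # thetasym
    981: 'f',  # phis
}

# Excerpt of the named-entity table (greeks) after the fix.
greeks = {
    'sigmaf': 'V',
    'thetasym': 'J',
    'thetav': 'J',
}

# A named entity and its numeric reference should select the same glyph.
assert greeks['thetasym'] == symenc[977]
assert greeks['sigmaf'] == symenc[962]
```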

-----------

I also noticed that the function handle_entityref checks for the existence
of the entity using has_key rather than by catching an exception. To make the
code more consistent, handle_charref should be coded in the same way as
handle_entityref:

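def handle_charref(self, name):
    # Parse the reference: hexadecimal if prefixed with 'x', else decimal.
    try:
        if name[0] == 'x':
            n = string.atoi(name[1:], 16)
        else:
            n = string.atoi(name)
    except string.atoi_error:
        self.unknown_charref(name)
        return
    if 0 <= n <= 255:
        # Latin-1 range: pass the character through unchanged.
        self.handle_data(chr(n))
    elif symenc.has_key(n):
        # Known symbol: emit the mapped character in the greek (Symbol) font.
        self._push(greek=1)
        self.handle_data(symenc[n])
        self._pop(greek=1)
    else:
        self.unknown_charref(name)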
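
To try the lookup logic outside the parser, the same decision can be
sketched as a standalone function. This is only an illustration written for
current Python 3 (int() instead of string.atoi, "in" instead of has_key); the
function name resolve_charref and the two-entry symenc excerpt are
hypothetical and not part of the ReportLab code.

```python
def resolve_charref(name, symenc):
    """Classify a numeric character reference such as '228' or 'x3b1'.

    Returns ('latin1', char), ('symbol', char) or ('unknown', name).
    """
    try:
        # Hexadecimal if prefixed with 'x', otherwise decimal.
        n = int(name[1:], 16) if name[:1] == 'x' else int(name)
    except ValueError:
        return ('unknown', name)
    if 0 <= n <= 255:
        return ('latin1', chr(n))
    if n in symenc:
        # Checked by membership test, mirroring handle_entityref.
        return ('symbol', symenc[n])
    return ('unknown', name)


# Hypothetical excerpt of the symbol table, for demonstration only.
symenc = {945: 'a', 977: 'J'}

print(resolve_charref('228', symenc))   # Latin-1 character
print(resolve_charref('x3b1', symenc))  # hex reference to code point 945
print(resolve_charref('9999', symenc))  # not in the table
```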

-------------

I think if the XHTML named entities for special characters and symbols are
included, we should include the standard XHTML named entities for Latin-1
characters as well, so that the complete XHTML named entity set is
supported. Then you can write not only &alpha; for a Greek alpha but also
&auml; for a German a-umlaut.

For the Latin-1 characters, handle_entityref has to be extended as follows,
together with a new lat1 table:

def handle_entityref(self,name):
    if lat1.has_key(name):
        self.handle_data(lat1[name])
    elif greeks.has_key(name):
        self._push(greek=1)
        self.handle_data(greeks[name])
        self._pop(greek=1)
    else:
        xmllib.XMLParser.handle_entityref(self,name)

# Latin-1 characters
lat1 = {
    'nbsp':'\240',
    'iexcl':'\241',
    'cent':'\242',
    'pound':'\243',
    'curren':'\244',
    'yen':'\245',
    'brvbar':'\246',
    'sect':'\247',
    'uml':'\250',
    'copy':'\251',
    'ordf':'\252',
    'laquo':'\253',
    'not':'\254',
    'shy':'\255',
    'reg':'\256',
    'macr':'\257',
    'deg':'\260',
    'plusmn':'\261',
    'sup2':'\262',
    'sup3':'\263',
    'acute':'\264',
    'micro':'\265',
    'para':'\266',
    'middot':'\267',
    'cedil':'\270',
    'sup1':'\271',
    'ordm':'\272',
    'raquo':'\273',
    'frac14':'\274',
    'frac12':'\275',
    'frac34':'\276',
    'iquest':'\277',
    'Agrave':'\300',
    'Aacute':'\301',
    'Acirc':'\302',
    'Atilde':'\303',
    'Auml':'\304',
    'Aring':'\305',
    'AElig':'\306',
    'Ccedil':'\307',
    'Egrave':'\310',
    'Eacute':'\311',
    'Ecirc':'\312',
    'Euml':'\313',
    'Igrave':'\314',
    'Iacute':'\315',
    'Icirc':'\316',
    'Iuml':'\317',
    'ETH':'\320',
    'Ntilde':'\321',
    'Ograve':'\322',
    'Oacute':'\323',
    'Ocirc':'\324',
    'Otilde':'\325',
    'Ouml':'\326',
    'times':'\327',
    'Oslash':'\330',
    'Ugrave':'\331',
    'Uacute':'\332',
    'Ucirc':'\333',
    'Uuml':'\334',
    'Yacute':'\335',
    'THORN':'\336',
    'szlig':'\337',
    'agrave':'\340',
    'aacute':'\341',
    'acirc':'\342',
    'atilde':'\343',
    'auml':'\344',
    'aring':'\345',
    'aelig':'\346',
    'ccedil':'\347',
    'egrave':'\350',
    'eacute':'\351',
    'ecirc':'\352',
    'euml':'\353',
    'igrave':'\354',
    'iacute':'\355',
    'icirc':'\356',
    'iuml':'\357',
    'eth':'\360',
    'ntilde':'\361',
    'ograve':'\362',
    'oacute':'\363',
    'ocirc':'\364',
    'otilde':'\365',
    'ouml':'\366',
    'divide':'\367',
    'oslash':'\370',
    'ugrave':'\371',
    'uacute':'\372',
    'ucirc':'\373',
    'uuml':'\374',
    'yacute':'\375',
    'thorn':'\376',
    'yuml':'\377'
    }
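
As a cross-check of the table above, an equivalent mapping can be derived
from the entity list in the Python standard library. This is only a sketch
for current Python 3, where the list lives in html.entities as
name2codepoint; it assumes that selecting the code points 160 to 255 is
the right definition of the "Latin-1" entity set.

```python
from html.entities import name2codepoint

# Keep only the entities whose code point lies in the upper Latin-1 range.
lat1 = {
    name: chr(codepoint)
    for name, codepoint in name2codepoint.items()
    if 160 <= codepoint <= 255
}

# One entity per code point from 160 (nbsp) to 255 (yuml).
print(len(lat1))
print(repr(lat1['auml']))
```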

Instead of modifying handle_entityref you could probably also do:
xmllib.XMLParser.entitydefs.update(lat1)

I tested the above under Windows (WinAnsi encoding). Something may have to
be added to handle MacRoman encoding, too.
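
One possible way to approach the MacRoman case is to re-encode the table
with Python's mac_roman codec and see which characters have no equivalent.
This is only a sketch for current Python 3, not a tested ReportLab change;
the helper name to_mac_roman and the two-entry sample table are
hypothetical.

```python
def to_mac_roman(table):
    """Re-encode a name -> Latin-1 character table for MacRoman.

    Returns (converted, missing): converted maps each entity name to its
    MacRoman byte, missing lists the names with no MacRoman equivalent.
    """
    converted, missing = {}, []
    for name, ch in table.items():
        try:
            converted[name] = ch.encode('mac_roman')
        except UnicodeEncodeError:
            # MacRoman lacks some Latin-1 characters, e.g. the fractions.
            missing.append(name)
    return converted, missing


# Hypothetical two-entry excerpt of the lat1 table.
sample = {'auml': '\xe4', 'frac12': '\xbd'}
converted, missing = to_mac_roman(sample)
print(converted)
print(missing)
```

Entities that end up in the missing list would need some other treatment,
for example a fallback to a different font.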