[reportlab-users] Re: Platypus XML markup suggestion
Christoph Zwerschke
reportlab-users@reportlab.com
Mon, 26 May 2003 13:54:02 +0200
> OK I think I added all that in to CVS. Can you check to see if I
> misunderstood your intent?
Looks very good.
The only thing I noticed is a mistake of mine with thetasym and sigmaf: they
have to be assigned to the uppercase letters (as an exception to the rule), i.e.
----------
change
    977:'j', # thetasym
to
    977:'J', # thetasym
change
    'thetasym': 'j',
    'thetav': 'j',
to
    'thetasym': 'J',
    'thetav': 'J',
change
    962:'v', # sigmaf
to
    962:'V', # sigmaf
change
    'sigmaf': 'v',
to
    'sigmaf': 'V',
also change
    981:'f', # phiv
to
    981:'f', # phis
(to avoid confusion)
-----------
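Corrections like these can be double-checked mechanically by comparing the name table with the code point table. The following is only a sketch in present-day Python: symenc_excerpt, greeks_excerpt and mismatches are names made up for illustration, and the two dicts hold just the entries discussed above, not the full tables from CVS. Names that html.entities.name2codepoint does not know are skipped.

```python
from html.entities import name2codepoint

# Hypothetical excerpts of the two tables after the corrections above.
symenc_excerpt = {977: 'J', 962: 'V'}
greeks_excerpt = {'thetasym': 'J', 'thetav': 'J', 'sigmaf': 'V'}

def mismatches(by_name, by_code):
    """Return entity names whose letter differs between the two tables."""
    bad = []
    for name, letter in by_name.items():
        code = name2codepoint.get(name)
        if code is None:
            continue  # name is not in the standard library table
        if by_code.get(code) != letter:
            bad.append(name)
    return sorted(bad)

print(mismatches(greeks_excerpt, symenc_excerpt))
```

An empty list means that both tables agree for every name that could be checked.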
I also noticed that the function handle_entityref checks the existence of
the entity using has_key, not by catching an exception. To make the code
more consistent, handle_charref should be coded in the same way as
handle_entityref:
    def handle_charref(self, name):
        try:
            if name[0] == 'x':
                n = string.atoi(name[1:], 16)
            else:
                n = string.atoi(name)
        except string.atoi_error:
            self.unknown_charref(name)
            return
        if 0 <= n <= 255:
            self.handle_data(chr(n))
        elif symenc.has_key(n):
            self._push(greek=1)
            self.handle_data(symenc[n])
            self._pop(greek=1)
        else:
            self.unknown_charref(name)
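
For reference, here is the same decision logic as a standalone function in present-day Python, where string.atoi and has_key no longer exist. This is a sketch, not the parser code: resolve_charref and its return tuples are hypothetical, and symenc is passed in as a plain dict.

```python
def resolve_charref(name, symenc):
    """Classify a numeric character reference such as '228' or 'xe4'.

    Returns ('latin1', char), ('symbol', char) or ('unknown', name).
    """
    try:
        # A leading 'x' marks a hexadecimal reference.
        n = int(name[1:], 16) if name[:1] == 'x' else int(name)
    except ValueError:
        return ('unknown', name)
    if 0 <= n <= 255:
        return ('latin1', chr(n))
    if n in symenc:
        return ('symbol', symenc[n])
    return ('unknown', name)

print(resolve_charref('977', {977: 'J'}))
```

The 'symbol' case corresponds to the branch that switches to the Symbol font via _push(greek=1) in the method above.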
-------------
I think if the XHTML named entities for special characters and symbols are
included, we should include the standard XHTML named entities for Latin-1
characters as well, so that the complete XHTML named entity set will be
supported. Then you will be able to write not only &alpha; for a Greek alpha
but also &auml; for a German a-umlaut.
For the Latin-1 characters, you have to add the following to
handle_entityref:
    def handle_entityref(self,name):
        if lat1.has_key(name):
            self.handle_data(lat1[name])
        elif greeks.has_key(name):
            self._push(greek=1)
            self.handle_data(greeks[name])
            self._pop(greek=1)
        else:
            xmllib.XMLParser.handle_entityref(self,name)
# Latin-1 characters
lat1 = {
'nbsp':'\240',
'iexcl':'\241',
'cent':'\242',
'pound':'\243',
'curren':'\244',
'yen':'\245',
'brvbar':'\246',
'sect':'\247',
'uml':'\250',
'copy':'\251',
'ordf':'\252',
'laquo':'\253',
'not':'\254',
'shy':'\255',
'reg':'\256',
'macr':'\257',
'deg':'\260',
'plusmn':'\261',
'sup2':'\262',
'sup3':'\263',
'acute':'\264',
'micro':'\265',
'para':'\266',
'middot':'\267',
'cedil':'\270',
'sup1':'\271',
'ordm':'\272',
'raquo':'\273',
'frac14':'\274',
'frac12':'\275',
'frac34':'\276',
'iquest':'\277',
'Agrave':'\300',
'Aacute':'\301',
'Acirc':'\302',
'Atilde':'\303',
'Auml':'\304',
'Aring':'\305',
'AElig':'\306',
'Ccedil':'\307',
'Egrave':'\310',
'Eacute':'\311',
'Ecirc':'\312',
'Euml':'\313',
'Igrave':'\314',
'Iacute':'\315',
'Icirc':'\316',
'Iuml':'\317',
'ETH':'\320',
'Ntilde':'\321',
'Ograve':'\322',
'Oacute':'\323',
'Ocirc':'\324',
'Otilde':'\325',
'Ouml':'\326',
'times':'\327',
'Oslash':'\330',
'Ugrave':'\331',
'Uacute':'\332',
'Ucirc':'\333',
'Uuml':'\334',
'Yacute':'\335',
'THORN':'\336',
'szlig':'\337',
'agrave':'\340',
'aacute':'\341',
'acirc':'\342',
'atilde':'\343',
'auml':'\344',
'aring':'\345',
'aelig':'\346',
'ccedil':'\347',
'egrave':'\350',
'eacute':'\351',
'ecirc':'\352',
'euml':'\353',
'igrave':'\354',
'iacute':'\355',
'icirc':'\356',
'iuml':'\357',
'eth':'\360',
'ntilde':'\361',
'ograve':'\362',
'oacute':'\363',
'ocirc':'\364',
'otilde':'\365',
'ouml':'\366',
'divide':'\367',
'oslash':'\370',
'ugrave':'\371',
'uacute':'\372',
'ucirc':'\373',
'uuml':'\374',
'yacute':'\375',
'thorn':'\376',
'yuml':'\377'
}
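
A table like this does not have to be typed by hand. As a sketch, assuming a present-day Python whose standard library ships html.entities, the same 96 names can be derived from name2codepoint by keeping only the code points 160 to 255. build_lat1 is a hypothetical helper, not part of ReportLab.

```python
from html.entities import name2codepoint

def build_lat1():
    """Map each Latin-1 entity name (code points 160-255) to its character."""
    return {name: chr(code)
            for name, code in name2codepoint.items()
            if 160 <= code <= 255}

lat1 = build_lat1()
print(len(lat1))
```

In Python 3 the values are Unicode characters, so they would still have to be encoded (to Latin-1, for example) before they match the byte values listed above.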
Instead of modifying handle_entityref you could probably also do:
xmllib.XMLParser.entitydefs.update(lat1)
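
One thing to keep in mind with this shortcut: if entitydefs is a dict stored on the class, updating it in place changes it for every parser that shares it. The sketch below illustrates the effect with a stand-in class, since xmllib is no longer part of current Python versions. BaseParser, MyParser and lat1_excerpt are hypothetical names, and copying the table into the subclass is just one possible way to keep the change local.

```python
class BaseParser:
    # Stand-in for a parser class that keeps its entity table on the class.
    entitydefs = {'lt': '<', 'gt': '>'}

class MyParser(BaseParser):
    # Copy first, then extend, so BaseParser.entitydefs stays untouched.
    entitydefs = dict(BaseParser.entitydefs)

# Two entries only, for illustration; the real table has 96.
lat1_excerpt = {'nbsp': '\240', 'auml': '\344'}
MyParser.entitydefs.update(lat1_excerpt)

print('auml' in MyParser.entitydefs, 'auml' in BaseParser.entitydefs)
```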
I checked the above under Windows (WinAnsi encoding). You may have to add
something to cover MacRoman encoding as well.
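
As a rough idea of what MacRoman support would involve, the sketch below uses the mac_roman codec of present-day Python to translate Latin-1 characters into MacRoman byte values and to report characters that MacRoman lacks. to_macroman and sample are hypothetical, and how ReportLab would actually select the encoding is not covered here.

```python
def to_macroman(table):
    """Split a name -> character table into MacRoman bytes and unmappable names."""
    mapped, missing = {}, []
    for name, ch in table.items():
        try:
            mapped[name] = ch.encode('mac_roman')
        except UnicodeEncodeError:
            # The character has no slot in the MacRoman code page.
            missing.append(name)
    return mapped, sorted(missing)

# Three entries from the Latin-1 table above, for illustration.
sample = {'auml': '\344', 'frac12': '\275', 'copy': '\251'}
mapped, missing = to_macroman(sample)
print(mapped, missing)
```

Names that end up in the second list would need a fallback, since a plain byte remapping cannot express them.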