[reportlab-users] pyRXP and processing instructions

Robin Becker reportlab-users@reportlab.com
Tue, 4 Mar 2003 17:29:51 +0000


In article <F157A892-4DC1-11D7-B72B-000393B63DDC@shangri-la.dropbear.id.au>,
Stuart Bishop <zen@shangri-la.dropbear.id.au> writes
......
>
>I personally don't like comments being returned as '<!-- foo -->'.
>I was going to make <?processing instructions?> return the same way
>for compatibility, but want to alter the representation slightly (if
>a flag is set). I think this is best done either as:
>    - Comments & processing instructions are returned as an element
>      tuple rather than a string. You can tell it is a special node
>      as t[0] in (pyRXP.Comment, pyRXP.ProcessingInstruction). This is
>      problematic for a processing instruction, as you really want it
>      split into 'command' and 'arguments'.
>    - A 'Comment' or 'ProcessingInstruction' class is returned instead
>      of the element tuple.
>
I think using classes rather than simple builtin types is a bad idea. If
we go tree walking, regularity is everything. I prefer to represent
special nodes in the existing format rather than create special classes.
Using the builtin types also means we can use marshal, which is fast for
persistence.
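
A rough sketch of the marshal point, in Python 3. Nothing here calls
pyRXP itself: the tree is written out by hand in the 4-tuple shape, and
the Comment class is a hypothetical stand-in for the class-based
alternative, included only for comparison.

```python
import marshal

# A small tree in the 4-tuple shape (name, attrs, children, spare),
# with a comment written in the proposed special-node format.
tree = ('doc', {'id': '1'}, [
    ('<!--', None, ['comment text'], None),
    ('p', None, ['hello'], None),
], None)

# Only builtin types are involved, so marshal can round-trip the tree.
data = marshal.dumps(tree)
restored = marshal.loads(data)


class Comment:
    """Hypothetical class-based node, for comparison only."""

    def __init__(self, text):
        self.text = text


# marshal refuses instances of user-defined classes.
try:
    marshal.dumps(('doc', None, [Comment('comment text')], None))
    class_tree_ok = True
except ValueError:
    class_tree_ok = False
```

A tree containing class instances would need pickle or a hand-written
conversion step before it could be persisted.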

My suggestion is that we return comments and processing instructions as

        ('<!--',None,['comment text'],None)
and
        ('<?',None,['name','processing instruction text'],None)

An alternative for the latter would be 

        ('<?',{'name': 'name'},['processing instruction text'],None)
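
To illustrate why the regular format helps when tree walking, here is a
minimal Python 3 sketch of a walker over such tuples. It assumes the
first processing instruction layout (name and text both in the child
list); the walk function and the event names are made up for the
example and are not part of pyRXP.

```python
def walk(node, depth=0):
    """Yield (depth, kind, payload) for every node in a tuple tree."""
    if isinstance(node, str):
        yield depth, 'text', node
        return
    name, attrs, children, _spare = node
    if name == '<!--':
        # Proposed comment node: children = [comment text]
        yield depth, 'comment', children[0]
    elif name == '<?':
        # First proposed layout: children = [name, instruction text]
        yield depth, 'pi', (children[0], children[1])
    else:
        yield depth, 'element', name
        for child in children or []:
            yield from walk(child, depth + 1)


tree = ('doc', None, [
    ('<!--', None, ['comment text'], None),
    ('<?', None, ['name', 'processing instruction text'], None),
    ('p', None, ['hello'], None),
], None)

events = list(walk(tree))
```

Every non-text node unpacks the same way, so the special cases reduce
to a comparison on the first tuple slot.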

Others have suggested using special integers for the names of the
special nodes. I'm not averse to that, but find the string constants a
bit easier to look at. If people are worried about doing hashes etc., it
would be easy to add pyRXP.commentNodeName,
pyRXP.processingInstructionName, etc. as module strings.
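
A sketch of how such constants might be used, in Python. The names
below are local stand-ins defined in the example itself, not existing
pyRXP attributes, and is_special is a hypothetical helper.

```python
# Local stand-ins for the proposed module strings (assumed values).
COMMENT_NODE_NAME = '<!--'
PI_NODE_NAME = '<?'
SPECIAL_NODE_NAMES = frozenset((COMMENT_NODE_NAME, PI_NODE_NAME))


def is_special(node):
    """True if node is a comment or processing instruction tuple."""
    return isinstance(node, tuple) and node[0] in SPECIAL_NODE_NAMES
```

Since '<' cannot appear in an XML element name, these strings cannot
collide with the name of a real element.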

Unless there are clear advantages to doing something other than the
above, I will put this in over the weekend and call the result 0.95.

>In theory, we could return more intelligent classes than 'tuple' now
>without sacrificing backwards compatibility, speed or memory by
>subclassing tuple (?). Perhaps a real life DOM tree could be returned
>via parser.Parse()?
>I haven't looked too closely at this yet - it is still in blue sky stage.
>
>And while I'm here - RXP 1.3 has been released and apparently fixes
>some memory leaks and stuff. It drops in happily over the existing RXP,
>except that myWarnCB in pyRXP.c breaks. Everything still works fine
>(except for the warning callback) if I make myWarnCB just 'return'.
>

-- 
Robin Becker