[reportlab-users] Reducing use of 7-bit characters

Yoann Roman yroman-reportlab at altalang.com
Tue May 5 18:06:02 EDT 2009



> However, just using FlateFilter (basically using gzip.compress)
> without the Ascii85 would yield a nicely binary stream.


That's essentially what my first pass at making Ascii85 optional did.
Acrobat 9 didn't have any problems with it, but I know it's lenient.
Outlook encoded this as base64 with no transmission problems.
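
To illustrate the difference being discussed, here is a minimal sketch using
only Python 3's standard zlib and base64 modules. It is not ReportLab's own
filter code, and the payload is a made-up stand-in for a page stream. zlib is
used rather than gzip because PDF's FlateDecode filter expects zlib-format
data.

```python
import base64
import zlib

# Stand-in for a PDF content stream; real page streams are much longer.
payload = b"BT /F1 12 Tf 72 720 Td (Hello, world) Tj ET\n" * 50

flate_only = zlib.compress(payload)        # Flate alone: raw binary
flate_a85 = base64.a85encode(flate_only)   # Flate, then Ascii85: printable ASCII


def high_bit_count(data: bytes) -> int:
    """Count bytes with the top bit set (values 0x80 and above)."""
    return sum(1 for b in data if b >= 0x80)


# Length and number of 8-bit bytes for each variant.
print(len(flate_only), high_bit_count(flate_only))
print(len(flate_a85), high_bit_count(flate_a85))
```

The Ascii85 variant contains no 8-bit bytes at all, which is presumably why
a mail client could mistake the file for text. It is also about 25% larger,
since every 4 bytes become 5 characters.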


> LF versus CRLF again came from looking at what Distiller
> produced. But maybe a better approach is to get rid of 95% of
> the "line wrapping" altogether.


The only "line wrapping" I see is in the non-encoded PDF markup. I
don't have wrapA85 on, but I thought that was off by default anyway.


> I went to significant lengths to make sure the raw PDF files were
> "readable" in an editor, wrapping in sensible places, because we
> spent a LOT of time from 1998-2003 staring at the innards of PDF
> files. It would be very easy to "not line-wrap" much
> of the content. Maybe Outlook is noticing the formatting more than
> the coding when assuming this is text.


I can't claim to know what Outlook is doing, but I did do a trial
removing all newlines; Outlook still encoded it quoted-printable. I
would definitely make turning off these things optional so that they
don't affect anyone else and can be turned back on for debugging. I
really appreciated the fact that I could "see" problems in my PDFs.


> In summary, let's look at a 'binary' option which does both,
> and see if that fools Outlook.


Turning off Ascii85 definitely does. The CRLF change doesn't seem to,
but at least Outlook encodes the newlines, which should help.
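
As a rough illustration of what "encoding the newlines" means for
quoted-printable, here is a small sketch using Python 3's binascii module.
The markup fragment is hypothetical, and this does not claim to reproduce
what Outlook actually does. It only shows that escaped CR and LF bytes
survive a round trip unchanged.

```python
import binascii

# Hypothetical fragment of uncompressed PDF markup using CRLF line endings.
markup = b"%PDF-1.3\r\n1 0 obj\r\n"

# istext=False treats the input as binary, so CR and LF are escaped
# instead of being passed through as line breaks.
encoded = binascii.b2a_qp(markup, istext=False)
decoded = binascii.a2b_qp(encoded)

print(encoded)
print(decoded == markup)
```

When the line endings are escaped this way, a mail gateway that rewrites
bare newlines has nothing left to alter in the PDF data.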

When you say "let's look", does that mean internal to ReportLab?

Thanks for the detailed explanation,

--
Yoann Roman


