[reportlab-users] pdf's corrupted when emailed, possible solution included

Jeff Johnson reportlab-users@reportlab.com
Wed, 14 Apr 2004 11:28:38 -0400


Hi, we're using ReportLab 1.18, and since we switched to a Postfix 
mail server on Linux, the PDFs we send from Outlook arrive 
corrupted.  This is apparently because Outlook uses quoted-printable 
encoding instead of base64 to encode the PDF when it doesn't see binary 
data in the first few lines of the attachment.  According to the poster 
in the link below, and the Adobe documentation, the solution to this 
problem is to put a comment on the second line of the PDF with binary 
characters in it, so applications will know to treat the file as a binary 
file. 

I've downloaded 1.19 and didn't see anything in the change notes 
regarding this.  Is there an easy way to do it, or could it even go 
into the next ReportLab release as a standard feature?

Text of link included below:
http://groups.google.com/groups?q=pdf+binary+comment+encoding+outlook&hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=c1lkg9%241hlr%241%40FreeBSD.csie.NCTU.edu.tw&rnum=1

Regards,
Jeff

I think I worked it out.
The problem is with the PDF.

Within the PDF itself, looking at the first few lines:
(1) Not working
%PDF-1.2
1 0 obj
<<
/Type /Catalog
/Pages 4 0 R
/Outlines 2 0 R

(2) Working PDF
%PDF-1.2
%âãÏÓ
15 0 obj
<<
/Linearized 1


Notice the 4 binary characters after the header, i.e. after %PDF-1.2.
I knew this was significant, so I downloaded the PDF Reference v1.5
from Adobe. Under section 3.4.1:

<snip>
Note: If a PDF file contains binary data, as most do (see Section 3.1,
"Lexical Conventions"), it is recommended that the header line be
immediately followed by a comment line containing at least four binary
characters -- that is, characters whose codes are 128 or greater. This will
ensure proper behavior of file transfer applications that inspect data
near the beginning of a file to determine whether to treat the file's
contents as text or as binary.
</snip>
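
The recommendation quoted above can be checked mechanically. The
following is only a rough sketch in Python: the function name
has_binary_marker and the 1024-byte read limit are my own assumptions,
not anything taken from ReportLab or the Adobe reference. It tests
whether the line after the %PDF header is a comment with at least four
bytes of value 128 or greater.

```python
import re

# Any of the end-of-line conventions that can terminate the PDF header line.
_EOL = re.compile(rb"\r\n|\r|\n")


def has_binary_marker(pdf_bytes: bytes, min_high_bytes: int = 4) -> bool:
    """Return True if the second line is a comment with enough high bytes.

    Hypothetical helper: only the first 1024 bytes are inspected, which is
    an arbitrary limit chosen for this sketch.
    """
    lines = _EOL.split(pdf_bytes[:1024], maxsplit=2)
    if len(lines) < 2 or not lines[0].startswith(b"%PDF-"):
        return False
    second = lines[1]
    # Count bytes whose code is 128 or greater on the second line.
    high = sum(1 for b in second if b >= 128)
    return second.startswith(b"%") and high >= min_high_bytes


# The two file beginnings compared earlier in this message.
not_working = b"%PDF-1.2\n1 0 obj\n<<\n/Type /Catalog\n"
working = b"%PDF-1.2\n%\xe2\xe3\xcf\xd3\n15 0 obj\n<<\n/Linearized 1\n"

print(has_binary_marker(not_working))  # False
print(has_binary_marker(working))      # True
```

Running this over a generated file before mailing it would at least
show whether the marker line is present.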

So this makes sense from what I see: Outlook considers the PDF as text,
not binary, and so uses quoted-printable.
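
To illustrate the kind of decision being described, here is a small
Python sketch of a content-sniffing rule. This is NOT Outlook's actual
algorithm, which is not documented in this thread; the function name
choose_transfer_encoding, the five-line probe and the 4096-byte limit
are all assumptions made up for the example.

```python
def choose_transfer_encoding(data: bytes, probe_lines: int = 5) -> str:
    """Guess a MIME transfer encoding from the start of an attachment.

    Hypothetical rule for illustration only: if any byte with a code of
    128 or greater appears in the first few lines, treat the data as
    binary and pick base64; otherwise treat it as text.
    """
    # Look only at the beginning of the file, as a sniffing client would.
    head = data[:4096].splitlines()[:probe_lines]
    saw_high_byte = any(b >= 128 for line in head for b in line)
    return "base64" if saw_high_byte else "quoted-printable"


# A PDF with no marker line: the binary data starts too late to be seen.
plain_start = (b"%PDF-1.2\n1 0 obj\n<<\n/Type /Catalog\n/Pages 4 0 R\n"
               b"/Outlines 2 0 R\n>>\nstream\n\xff\xfe\n")
# A PDF with the recommended binary comment on the second line.
marked_start = b"%PDF-1.2\n%\xe2\xe3\xcf\xd3\n15 0 obj\n<<\n/Linearized 1\n"

print(choose_transfer_encoding(plain_start))   # quoted-printable
print(choose_transfer_encoding(marked_start))  # base64
```

Under a rule like this, the binary comment line is what tips the
choice toward base64, which matches the behaviour seen with the two
files above.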

Drat, I can't blame M$.

Regards Darryl