[reportlab-users] pdf's corrupted when emailed, possible solution included
Jeff Johnson
reportlab-users@reportlab.com
Wed, 14 Apr 2004 11:28:38 -0400
Hi, we're using ReportLab 1.18, and since we switched to a Postfix
mail server on Linux, the PDFs we send from Outlook arrive
corrupted. This is apparently because Outlook uses quoted-printable
encoding instead of base64 to encode the PDF when it doesn't see binary
data in the first few lines of the attachment. According to the poster
in the link below, and the Adobe documentation, the solution to this
problem is to put a comment on the second line of the PDF with binary
characters in it so applications will know to treat the file as a binary
file.
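A minimal sketch of how one might check a generated file for such a marker, assuming the PDF is available as bytes. The function name `has_binary_marker` is hypothetical and is not part of reportlab; it only inspects the file and does not modify it.

```python
def has_binary_marker(data: bytes, minimum: int = 4) -> bool:
    """Return True if the line after the %PDF header is a comment
    containing at least `minimum` bytes with codes of 128 or greater."""
    # Only the start of the file matters, so avoid splitting the whole PDF.
    lines = data[:1024].splitlines()
    if len(lines) < 2 or not lines[0].startswith(b"%PDF-"):
        return False
    second = lines[1]
    if not second.startswith(b"%"):
        return False
    # Iterating over bytes yields integers, so this counts high-bit bytes.
    return sum(1 for b in second if b >= 128) >= minimum
```

For example, `has_binary_marker(open("report.pdf", "rb").read())` would report whether a file (here a placeholder name) already carries the comment line.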
I've downloaded 1.19 and didn't see anything in the change notes
regarding this and was wondering if there's an easy way to do it or even
to get it into the next reportlab release as a standard feature?
Text of link included below:
http://groups.google.com/groups?q=pdf+binary+comment+encoding+outlook&hl=en&lr=&ie=UTF-8&oe=UTF-8&selm=c1lkg9%241hlr%241%40FreeBSD.csie.NCTU.edu.tw&rnum=1
Regards,
Jeff
I think I worked it out.
The problem is with the PDF itself. Looking at the first few lines:
(1) Not working
%PDF-1.2
1 0 obj
<<
/Type /Catalog
/Pages 4 0 R
/Outlines 2 0 R
(2) Working PDF
%PDF-1.2
%âãÏÓ
15 0 obj
<<
/Linearized 1
Notice the 4 binary characters after the header, i.e. after %PDF-1.2.
I knew this was significant, so I downloaded the PDF Reference v1.5 guide
from Adobe. Under section 3.4.1:
<snip>
Note: If a PDF file contains binary data, as most do (see Section 3.1,
"Lexical Conventions"), it is recommended that the header line be
immediately followed by a comment line containing at least four binary
characters - that is, characters whose codes are 128 or greater. This will
ensure proper behavior of file transfer applications that inspect data
near the beginning of a file to determine whether to treat the file's
contents as text or as binary.
</snip>
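The kind of inspection the note describes can be sketched as follows. This is an illustrative heuristic only, not Outlook's actual algorithm; the name `looks_binary` and the 64-byte window are assumptions made for the example.

```python
def looks_binary(data: bytes, window: int = 64) -> bool:
    """Guess whether a file is binary by looking for any byte with a
    code of 128 or greater near the beginning of the file."""
    return any(b >= 128 for b in data[:window])


# Header without the marker: everything near the start is plain ASCII.
plain_start = b"%PDF-1.2\n1 0 obj\n<<\n/Type /Catalog\n"
# Header followed by a comment line holding four high-bit bytes.
marked_start = b"%PDF-1.2\n%\xe2\xe3\xcf\xd3\n15 0 obj\n<<\n/Linearized 1\n"

print(looks_binary(plain_start))   # False
print(looks_binary(marked_start))  # True
```

A sniffer like this sees the first file as text and the second as binary, which is the behavior the recommended comment line is meant to trigger.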
So this makes sense from what I see: Outlook considers the PDF as text,
not binary, and so uses quoted-printable.
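To see what quoted-printable does to bytes of 128 or greater, here is a small sketch using Python's standard `quopri` and `base64` modules. The marker bytes (0xE2 0xE3 0xCF 0xD3) are the ones from the working PDF above; the variable names are just for illustration.

```python
import base64
import quopri

# The comment line from the working PDF: '%' followed by four high-bit bytes.
marker_line = b"%\xe2\xe3\xcf\xd3\n"

# Quoted-printable writes each byte >= 128 as '=' plus two hex digits.
qp = quopri.encodestring(marker_line)
# Base64 treats every byte the same way, whatever its value.
b64 = base64.b64encode(marker_line)

print(qp)   # b'%=E2=E3=CF=D3\n'
print(b64)

# Decoding the quoted-printable form gives back the original bytes.
assert quopri.decodestring(qp) == marker_line
```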
Drat I can't blame M$
Regards Darryl