[reportlab-users] How can I edit PDF metadatas

Patrick Maupin pmaupin at gmail.com
Mon Nov 30 12:33:30 EST 2009


I second the thought that pdftk is easy to use (I use it all the
time), but if your source files are not encrypted and don't have PDF
1.5 compressed object streams, my new pdfrw library might be easier to
use than pypdf. For example, the way you can add metadata to the PDF
trailer dictionary is:

writer.trailer.Info = IndirectPdfDict(
    Title = 'your title goes here',
    Author = 'your name goes here',
    Subject = 'what is it all about?',
    Creator = 'some script goes here',
)

I have posted a complete working example:

http://code.google.com/p/pdfrw/source/browse/trunk/examples/metadata.py
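
If pdftk turns out to be the better fit, its update_info operation (see
Andy's quoted message further down) reads the new metadata from a plain-text
file. Here is a rough sketch of generating that file from Python. It assumes
the InfoBegin / InfoKey / InfoValue record layout that recent pdftk releases
print with dump_data; older releases may not expect the InfoBegin line, so
check against your version. The helper name build_pdftk_info and the file
names in the comments are made up for illustration.

```python
def build_pdftk_info(metadata):
    """Return the text of a pdftk update_info file for a dict of Info keys.

    Assumes one InfoBegin/InfoKey/InfoValue record per entry, as printed by
    recent pdftk versions with dump_data (an assumption; verify locally).
    """
    lines = []
    for key, value in metadata.items():
        # Each record is line-oriented, so embedded newlines would corrupt it.
        if "\n" in key or "\n" in value:
            raise ValueError("keys and values must be single-line strings")
        lines.append("InfoBegin")
        lines.append("InfoKey: " + key)
        lines.append("InfoValue: " + value)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = build_pdftk_info({
        "Title": "your title goes here",
        "Author": "your name goes here",
    })
    # Save this text as e.g. info.txt, then run something like:
    #   pdftk input.pdf update_info info.txt output output.pdf
    print(text)
```

This only covers plain ASCII values; anything else may need escaping or a
different pdftk operation, which I have not shown here.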

Best regards,
Pat


On Mon, Nov 30, 2009 at 5:49 AM, Christian Jacobsen
<cljacobsen at gmail.com> wrote:

> You can also use pypdf. http://pybrary.net/pyPdf/
>
> This won't let you edit the metadata per se, but will let you read one
> or more pdf file(s) and spit them back out, possibly with new
> metadata. It is somewhat low level though. I find it useful for
> automatically putting PDFs together for various things. I have
> included an example below. I often also generate pages with reportlab
> (i.e. separators between the PDFs that I am concatenating) which I
> insert into the stream. I use the StringIO module to capture the page
> from reportlab and then feed it to pypdf's PdfFileReader. I have also
> been known to stamp a running page number onto the pages by merging a
> reportlab generated page with just a page number with the input page.
>
> pdftk does all of this of course and is probably easier to use.
>
>  Christian
>
> import pyPdf
> from pyPdf import PdfFileWriter, PdfFileReader
> from pyPdf.generic import NameObject, createStringObject
>
> OUTPUT = 'output.pdf'
> INPUTS = ['test1.pdf', 'test2.pdf', 'test3.pdf']
>
> # The writer that collects the pages and owns the Info dictionary
> output = PdfFileWriter()
>
> # There is no interface through pyPDF with which to set this other than getting
> # your hands dirty like so:
> infoDict = output._info.getObject()
> infoDict.update({
>     NameObject('/Title'): createStringObject(u'title'),
>     NameObject('/Author'): createStringObject(u'author'),
>     NameObject('/Subject'): createStringObject(u'subject'),
>     NameObject('/Creator'): createStringObject(u'a script')
>     })
>
> # PdfFileReader wants an open file object, not a file name
> inputs = [PdfFileReader(open(i, 'rb')) for i in INPUTS]
> for reader in inputs:
>     for page in range(reader.getNumPages()):
>         output.addPage(reader.getPage(page))
>
> outputStream = open(OUTPUT, 'wb')
> output.write(outputStream)
> outputStream.close()
>
> 2009/11/26 Andy Robinson <andy at reportlab.com>:
>> 2009/11/26 Dani Reguera <drbakhache at gmail.com>:
>>> Can I open the file with reportlab and then set its title?
>>
>> reportlab creates files, but doesn't edit them.
>>
>> pdftk is writing the final file. I just googled it, and it has options
>> to set the metadata on the command line.  See 'update_info' on this
>> page...
>>    http://www.accesspdf.com/pdftk/
>>
>> --
>> Andy Robinson
>> CEO/Chief Architect
>> ReportLab Europe Ltd.
>> _______________________________________________
>> reportlab-users mailing list
>> reportlab-users at lists2.reportlab.com
>> http://two.pairlist.net/mailman/listinfo/reportlab-users
>>