[reportlab-users] How can I edit PDF metadatas
Patrick Maupin
pmaupin at gmail.com
Mon Nov 30 13:11:14 EST 2009
I've added some new functionality, and an example that is more in line
with the original request (to alter the title metadata in a
preexisting PDF). The required code is basically:
from pdfrw import PdfReader, PdfWriter

# inpfn and outfn are the input and output PDF file names
trailer = PdfReader(inpfn)
trailer.Info.Title = 'My New Title Goes Here'
writer = PdfWriter()
writer.trailer = trailer
writer.write(outfn)
The full example is at:
http://code.google.com/p/pdfrw/source/browse/trunk/examples/alter.py
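
For comparison, the pdftk route mentioned further down the thread can be scripted as well. The following is only a rough sketch, not something taken from pdfrw or its examples. The function names and the info.txt file name are made up for illustration, and the InfoBegin/InfoKey/InfoValue layout is an assumption based on what recent pdftk releases print for dump_data (older releases may omit the InfoBegin line), so check it against your own dump_data output first. It also assumes plain ASCII values without newlines.

```python
import subprocess


def build_info_text(fields):
    """Render metadata as pdftk info-file text (assumed dump_data layout)."""
    lines = []
    for key, value in fields.items():
        lines.append("InfoBegin")
        lines.append("InfoKey: %s" % key)
        lines.append("InfoValue: %s" % value)
    return "\n".join(lines) + "\n"


def build_pdftk_command(inpfn, info_path, outfn):
    """Return the argument list for pdftk's update_info operation."""
    return ["pdftk", inpfn, "update_info", info_path, "output", outfn]


def update_metadata(inpfn, outfn, fields, info_path="info.txt"):
    """Write the info file, then ask pdftk to apply it (pdftk must be on PATH)."""
    with open(info_path, "w") as f:
        f.write(build_info_text(fields))
    subprocess.check_call(build_pdftk_command(inpfn, info_path, outfn))


# Hypothetical usage:
# update_metadata("in.pdf", "out.pdf", {"Title": "My New Title Goes Here"})
```

Keeping the text rendering and the command line in separate functions means both can be inspected without pdftk installed.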
Regards,
Pat
On Mon, Nov 30, 2009 at 11:33 AM, Patrick Maupin <pmaupin at gmail.com> wrote:
> I second the thought that pdftk is easy to use (I use it all the
> time), but if your source files are not encrypted and don't have PDF
> 1.5 compressed object streams, my new pdfrw library might be easier to
> use than pypdf. For example, the way you can add metadata to the PDF
> trailer dictionary is:
>
> writer.trailer.Info = IndirectPdfDict(
>     Title = 'your title goes here',
>     Author = 'your name goes here',
>     Subject = 'what is it all about?',
>     Creator = 'some script goes here',
> )
>
> I have posted a complete working example:
>
> http://code.google.com/p/pdfrw/source/browse/trunk/examples/metadata.py
>
> Best regards,
> Pat
>
>
> On Mon, Nov 30, 2009 at 5:49 AM, Christian Jacobsen
> <cljacobsen at gmail.com> wrote:
>> You can also use pypdf. http://pybrary.net/pyPdf/
>>
>> This won't let you edit the metadata per se, but will let you read one
>> or more pdf file(s) and spit them back out, possibly with new
>> metadata. It is somewhat low level though. I find it useful for
>> automatically putting PDFs together for various things. I have
>> included an example below. I often also generate pages with reportlab
>> (i.e. separators between the PDFs that I am concatenating), which I
>> insert into the stream. I use the StringIO module to capture the page
>> from reportlab and then feed it to pyPdf's PdfFileReader. I have also
>> been known to stamp a running page number onto the pages by merging a
>> reportlab-generated page containing just a page number with the input page.
>>
>> pdftk does all of this of course and is probably easier to use.
>>
>> Christian
>>
>>
>> from pyPdf import PdfFileWriter, PdfFileReader
>> from pyPdf.generic import NameObject, createStringObject
>>
>> OUTPUT = 'output.pdf'
>> INPUTS = ['test1.pdf', 'test2.pdf', 'test3.pdf']
>>
>> output = PdfFileWriter()
>>
>> # There is no interface through pyPdf with which to set this other than getting
>> # your hands dirty like so:
>> infoDict = output._info.getObject()
>> infoDict.update({
>>     NameObject('/Title'): createStringObject(u'title'),
>>     NameObject('/Author'): createStringObject(u'author'),
>>     NameObject('/Subject'): createStringObject(u'subject'),
>>     NameObject('/Creator'): createStringObject(u'a script')
>> })
>>
>> # PdfFileReader expects an open binary stream, not a file name.
>> # Append every page of every input file to the output.
>> inputs = [PdfFileReader(open(i, 'rb')) for i in INPUTS]
>> for input in inputs:
>>     for page in range(input.getNumPages()):
>>         output.addPage(input.getPage(page))
>>
>> outputStream = open(OUTPUT, 'wb')
>> output.write(outputStream)
>> outputStream.close()
>>
>>
>> 2009/11/26 Andy Robinson <andy at reportlab.com>:
>>> 2009/11/26 Dani Reguera <drbakhache at gmail.com>:
>>>> Can I open the file with reportlab and then set its title?
>>>
>>> reportlab creates files, but doesn't edit them.
>>>
>>> pdftk is writing the final file. I just googled it, and it has options
>>> to set the metadata on the command line. See 'update_info' on this
>>> page...
>>> http://www.accesspdf.com/pdftk/
>>>
>>> --
>>> Andy Robinson
>>> CEO/Chief Architect
>>> ReportLab Europe Ltd.
>>> _______________________________________________
>>> reportlab-users mailing list
>>> reportlab-users at lists2.reportlab.com
>>> http://two.pairlist.net/mailman/listinfo/reportlab-users
>>>