[reportlab-users] Reportlab text not searchable in Apple OSX Preview.App? but searchable in Acrobat and google-pdf viewer ?
hari jayaram
harijay at gmail.com
Wed Aug 5 09:05:53 EDT 2009
I also tried this test code:
import reportlab.pdfgen.canvas as canvas
c = canvas.Canvas("searchtext.pdf")
c.rotate(90)
textobj = c.beginText(10,-10)
textstring = "Dispense File Prefix: %s" % "hello world"
textobj.textLines(textstring)
c.drawText(textobj)
c.rotate(-90)
c.showPage()
c.save()
That produced the attached PDF (searchtext.pdf), in which a search for any of
the contained text fails in Preview.App (see Picture 21.png).
I don't know what I am doing wrong.
Thanks for your help troubleshooting this. I hope the problem does not lie in
Preview.App, because the same PDFs are searchable in the Google PDF viewer
(Gmail) and in Acrobat.
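
One way to narrow this down without relying on any viewer is to look at what
the page content stream actually contains. The following is a minimal sketch
using only the Python standard library; decode_stream and shown_strings are
hypothetical helper names, not part of ReportLab. It assumes each stream
dictionary is flat (no nested dictionaries) and that the only filters in use
are ASCII85Decode and FlateDecode; any other stream is skipped. It lists the
string operands passed to the Tj and TJ text-showing operators:

```python
import base64
import re
import zlib

# A flat "<< ... >>" dictionary directly followed by stream data.
STREAM_RE = re.compile(
    rb"<<((?:(?!<<|>>).)*)>>\s*stream\r?\n(.*?)endstream", re.DOTALL
)
# "(string) Tj" or "[ ... ] TJ" text-showing operators.
SHOW_RE = re.compile(
    rb"(\((?:\\.|[^\\()])*\)\s*Tj|\[(?:\\.|[^\\\]])*\]\s*TJ)", re.DOTALL
)
# A literal PDF string, allowing backslash escapes inside it.
STRING_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)", re.DOTALL)


def decode_stream(raw, filters):
    """Undo the named stream filters in the order they are listed."""
    for name in filters:
        if name == "ASCII85Decode":
            raw = base64.a85decode(raw.strip(), adobe=True)
        elif name == "FlateDecode":
            # decompressobj tolerates trailing bytes after the zlib data.
            raw = zlib.decompressobj().decompress(raw)
        else:
            raise ValueError("unsupported filter: " + name)
    return raw


def shown_strings(pdf_bytes):
    """Return every literal string operand of a Tj or TJ operator."""
    found = []
    for dict_body, raw in STREAM_RE.findall(pdf_bytes):
        filters = [
            f.decode("ascii")
            for f in re.findall(rb"/(\w+Decode)\b", dict_body)
        ]
        try:
            content = decode_stream(raw, filters)
        except (ValueError, zlib.error):
            continue  # not a stream this sketch can read
        for op in SHOW_RE.findall(content):
            found.extend(s.decode("latin-1") for s in STRING_RE.findall(op))
    return found
```

Calling shown_strings(open("searchtext.pdf", "rb").read()) would then show how
the sentence is stored. If it comes back as one string per line, the text is
stored whole in the file; if it comes back as many one-letter strings, it is
being written that way. Either result is only a hint, not a diagnosis.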
Hari
On Wed, Aug 5, 2009 at 9:02 AM, hari jayaram <harijay at gmail.com> wrote:
> Hi Bill and Robin, thanks for your replies.
> I tried writing the string with a text object, based on the code
> snippet you provided. However, the Preview search still does not work. I
> think I implemented what you suggested.
>
> The code snippet is shown here. The older version of my code used the
> canvas drawString method to render the string. The full code is at the
> GitHub link (see below). That version also produced unsearchable text.
>
> self.canvas_obj.rotate(90)
> textobj = self.canvas_obj.beginText(10,-10)
> # textobj.setTextRenderMode(INVISIBLE_MODE)
> textstring = "Dispense File Prefix: %s" % str(os.path.splitext(self.filename)[0])
> textstring = textstring.strip().encode('latin-1', 'replace')
> textobj.textLines(textstring)
> self.canvas_obj.drawText(textobj)
> # self.canvas_obj.drawString(10, -10, "DispenseFilePrefix: %s" % str(os.path.splitext(self.filename)[0]))
> self.canvas_obj.rotate(-90)
>
> Robin, I am searching for the text in the Preview.App search bar.
> This search bar works for every "text" PDF document I have. However, if you
> look at the attached PNG image or the PDF document, Preview thinks each word
> has several spaces in it, so although the word is visibly present,
> semantically it seems to be just a sequence of individual letters.
>
> Attachments:
> GitHub source for the report using drawString methods:
> http://github.com/harijay/protein-crystallization-gridmaker/blob/1ca03fd8aa85cd18b93ac63ff6447199d9799dcb/platepdfwriter.py
> PNG images showing the search result: Picture 20.png, dispense_not_found.png
> PDF file that fails search, rendered with the drawText code (based on Bill
> Janssen's suggestion): test2.pdf
>
> Hari
>
>
>
> On Tue, Aug 4, 2009 at 10:27 PM, Bill Janssen <janssen at parc.com> wrote:
>
>> hari jayaram <harijay at gmail.com> wrote:
>>
>> > I noticed however that the text-laden PDFs I am rendering are not searchable
>> > using Apple Mac (Leopard) OSX Preview.App
>> >
>> > When I use the built-in search within Preview.App only single characters
>> > light up (only single characters show matches, like a, b, c, d). No words
>> > light up.
>>
>> Works fine for me, generating PDFs with ReportLab 2.2 and searching with
>> Preview.
>>
>> I add my text to the PDF a word at a time, with this code:
>>
>> textobj = mycanvas.beginText(word.left, word.baseline)
>> textobj.setTextRenderMode(INVISIBLE_MODE)
>> textstring = word.text.strip().encode('latin-1', 'replace')
>> textobj.textLines(textstring)
>> mycanvas.drawText(textobj)
>>
>> Incidentally, can I switch to UTF-8 these days?
>>
>> Bill
>> _______________________________________________
>> reportlab-users mailing list
>> reportlab-users at reportlab.com
>> http://two.pairlist.net/mailman/listinfo/reportlab-users
>>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: searchtext.pdf
Type: application/pdf
Size: 1918 bytes
Desc: not available
Url : <http://two.pairlist.net/pipermail/reportlab-users/attachments/20090805/eaa85a05/attachment-0001.pdf>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picture 21.png
Type: image/png
Size: 37790 bytes
Desc: not available
Url : <http://two.pairlist.net/pipermail/reportlab-users/attachments/20090805/eaa85a05/attachment-0001.png>