Topic: finding text in PDF without viewing or extracting
Conf: (P-PDF) Beginners, Msg: 150178
Date: 5/23/2006 12:41 AM
I'm exploring some options with the PDF format, and have a few questions about finding strings in a PDF file without opening it in a viewer or converting the whole thing to text first.
Is it possible at all? Can I, for example, traverse the "tree" to locate a text stream and grab the first 100 words of the document? Can I guarantee that the first 100 words I find are the first 100 words of the document? Or, can I try to match a particular string?
Basically, how easy is it to work through all the layers of encoding and compression and extract a single string?
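To make the question concrete, here is roughly the naive approach I have in mind, as a Python sketch. It scans the raw bytes for stream objects, tries to inflate each one with zlib (assuming the common FlateDecode filter), and pulls literal strings out of the Tj and TJ text-showing operators. The function name extract_text_chunks and the helper names are my own, purely for illustration. As far as I can tell it ignores encryption, other filters, object streams, hex strings, and font encodings, and it returns strings in file order, which is not necessarily page or reading order.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of an object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A literal string operand such as (Hello), allowing backslash escapes.
LITERAL_RE = re.compile(rb"\(((?:\\.|[^\\()])*)\)", re.DOTALL)
# Text-showing operators: "(...) Tj" or "[(...) -20 (...)] TJ".
SHOW_RE = re.compile(
    rb"\((?:\\.|[^\\()])*\)\s*Tj|\[(?:\\.|[^\]\\])*\]\s*TJ", re.DOTALL
)


def unescape(literal: bytes) -> bytes:
    """Undo only the simplest escapes: \\( \\) and \\\\ (octal codes are left alone)."""
    return re.sub(rb"\\([()\\])", rb"\1", literal)


def extract_text_chunks(pdf_bytes: bytes) -> list:
    """Return the strings shown by Tj/TJ operators, in file order (hypothetical helper)."""
    chunks = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the newline that precedes "endstream".
            data = zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Assumption: anything that does not inflate is treated as uncompressed.
            data = raw
        for op in SHOW_RE.finditer(data):
            parts = LITERAL_RE.findall(op.group(0))
            text = b"".join(unescape(p) for p in parts)
            # Assumption: single-byte text; real fonts may need a ToUnicode map.
            chunks.append(text.decode("latin-1"))
    return chunks


if __name__ == "__main__":
    # Tiny hand-built fragment standing in for a real file.
    content = b"BT /F1 12 Tf (Hello) Tj [(Wor) -20 (ld)] TJ ET"
    packed = zlib.compress(content)
    fragment = (
        b"%PDF-1.4\n1 0 obj\n<< /Length "
        + str(len(packed)).encode("ascii")
        + b" /Filter /FlateDecode >>\nstream\n"
        + packed
        + b"\nendstream\nendobj\n"
    )
    print(extract_text_chunks(fragment))
```

Matching a particular string would then just be a search over the returned chunks, though a phrase can be split across several Tj/TJ operators, so a plain substring test on one chunk may miss it.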
Thank you for your patience,