Planet PDF Forum Archive

Topic: Text in CreateWordHilite does not correspond to page text
Conf: (P-PDF) Beginners, Msg: 82187
From: nvanraalte
Date: 3/4/2003 01:28 AM

I am writing a VB application that uses the Acrobat object model to place highlights over certain words. My problem is that the text selection I create for a given word index does not always contain the text I extracted from the page for that same index.

The sequence of operations I am trying to perform is:
1. Open the document and get the Page, then determine the page size using Acrobat.CAcroPDPage.GetSize.
2. Create an Acrobat.CAcroRect covering the page, which I pass to Acrobat.CAcroPDDoc.CreateTextSelect to get an Acrobat.CAcroPDTextSelect object for the entire page.
3. Build a page text buffer in VB containing all the text on the page by calling Acrobat.CAcroPDTextSelect.GetText.
4. In VB, search the page text buffer for specific words, patterns or phrases.
5. Using the word number of the text matching the search (i.e. the index passed to Acrobat.CAcroPDTextSelect.GetText that returned the matching word), add a highlight to an empty Acrobat.CAcroHiliteList.
6. The problem: if I get an Acrobat.CAcroPDTextSelect from Acrobat.CAcroPDPage.CreateWordHilite and compare the result of its Acrobat.CAcroPDTextSelect.GetText with the matching text in the page text buffer, the two sometimes differ. Typically, words that match early in the page text buffer are highlighted correctly, while words towards the end of the buffer are the ones least likely to match the text in the highlighted area.
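
Steps 3 to 5 above can be sketched roughly as follows. This is only an illustration in Python, not the VB code from the application: the list `elements` is a hypothetical stand-in for the strings returned by successive Acrobat.CAcroPDTextSelect.GetText calls, and `build_buffer` and `element_index_at` are made-up helper names. It assumes the elements are joined with a single space.

```python
from bisect import bisect_right


def build_buffer(elements, sep=" "):
    """Join text elements into one buffer and record each element's start offset."""
    starts = []
    pos = 0
    for text in elements:
        starts.append(pos)
        pos += len(text) + len(sep)
    return sep.join(elements), starts


def element_index_at(starts, char_offset):
    """Return the index of the element whose span contains char_offset."""
    return bisect_right(starts, char_offset) - 1


# Hypothetical stand-in for the strings returned by GetText(0), GetText(1), ...
elements = ["Invoice", "number", "4711", "is", "overdue"]

buffer, starts = build_buffer(elements)

# Step 4: search the buffer; step 5: map the hit back to an element index.
hit = buffer.find("4711")
index = element_index_at(starts, hit)
```

The resulting `index` is the number that would then be added to the Acrobat.CAcroHiliteList with a length of 1. That only highlights the right word if GetText indices and highlight word offsets count the same units, which is exactly what is in doubt here.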

The reason I can't use any of the search techniques already available in Acrobat is that I need to build my own custom searches, which may interact with other things such as databases and text files.

Has anyone else tried this successfully or had similar problems?
So far I can only think of the following possible explanations:
1. I have not selected all the text on the page when building the page text buffer (although even if I make the page selection rectangle huge, I get the same text back and the same problems).
2. There are hidden characters or words that are not exposed by the GetText call I use to compile the text (difficult to compensate for).
3. Existing highlights, or other document content or formatting, prevent the selected rectangles from retrieving the correct text (if so, what would a document need to satisfy for my technique to work?).
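
One way to narrow these explanations down is to find the first index at which the two word sequences disagree and look at the raw characters there. The sketch below is a Python illustration with invented sample data: `expected` stands in for the words taken from the page text buffer, `actual` for the text read back from the selection that CreateWordHilite returns for each index, and `first_divergence` is a hypothetical helper. The soft hyphen in the sample is only an example of a hidden character, not something observed in the real document.

```python
def first_divergence(expected, actual):
    """Return (index, expected_text, actual_text) for the first mismatch, or None."""
    for i, (want, got) in enumerate(zip(expected, actual)):
        if want.strip() != got.strip():
            return i, want, got
    return None


# Invented sample data: a soft hyphen (U+00AD) hides inside one buffer word,
# while the highlight side counts the two halves as separate words.
expected = ["Total", "due", "soft\u00adware", "fee", "today"]
actual = ["Total", "due", "soft", "ware", "fee"]

result = first_divergence(expected, actual)
if result is not None:
    i, want, got = result
    # repr() makes invisible characters visible in the output.
    print(i, repr(want), repr(got))
```

If every index after the first mismatch is shifted by the same amount, that would point towards a counting difference (explanation 2) rather than an incomplete page selection (explanation 1).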

Any suggestions or pointers would be greatly appreciated. Thank you.
