Planet PDF Forum Archive

Topic: Re: Get Pdf Content
Conf: (P-PDF) Developers, Msg: 57167
From: LeonardR
Date: 5/29/2002 05:24 PM

At 10:18 AM -0500 2/7/01, p-pdf-developer Listmanager wrote:
>[1 g
>, /GS1 gs
>, 1 i
>, 0 0 612 792 re
>, f
>, BT
>, /F4 1 Tf
>, 10 0 0 10 76.5 690.42 Tm
>, 0 g
>, 0 Tc
>, (208) Tj
>, /F2 1 Tf
>, 8 0 0 8 256.72 690.42 Tm
>, 0 Tw
>, (JNI TECHNOLOGY) Tj
>, /F11 1 Tf
>, 14 0 0 14 79.5 656.67 Tm
>, (About the Example) Tj
>, /F6 1 Tf
>...etc
>
>The text is only inside the ( ) and I would like to know if there is something in the
>Etymon library to get only the text. Thanks for your help.
>
I don't believe that there is anything in Etymon PJ that will
help you here, and getting text out of a PDF is a LOT harder than
just finding those ()'s.

1. ()'s are valid inside a string, provided they are escaped (or balanced) - "(\(x\))"
2. Text can also be found inside an array of strings and offsets - "[(This)-278(is)]"
3. Text isn't necessarily in top->bottom, left->right order, NOR does
it necessarily have to include spaces, line/paragraph breaks, tabs,
etc. In the above example (2), the text will draw as "This is", but
there is no space in the data stream.
4. Text isn't entirely valid without font/encoding context, since the
characters in those streams are really just indices into the font's
character table - which may be full or may be subset, and may also
include a custom encoding (not to mention simple encoding issues like
MacRoman vs. WinANSI).
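
To make points 1 to 3 a bit more concrete, here is a minimal Python sketch. It is not part of Etymon PJ and not anyone's production code; the names read_literal_string and extract_text, the sample stream, and the -200 gap threshold are all assumptions made for illustration. It unescapes literal strings, collects the operands of Tj and TJ, and guesses a space wherever a TJ array contains a large negative offset (the numbers are in thousandths of a text space unit and are subtracted from the position, so a negative value opens a gap). It deliberately ignores hex strings, comments, inline images, text positioning, and everything in point 4, so what comes out is raw character codes in stream order, nothing more.

```python
import re

# Single-character escapes allowed inside a PDF literal string.
ESCAPES = {"n": "\n", "r": "\r", "t": "\t", "b": "\b", "f": "\f",
           "(": "(", ")": ")", "\\": "\\"}
OCTAL = re.compile(r"[0-7]{1,3}")
TOKEN = re.compile(r"[^\s()\[\]]+")
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)")


def read_literal_string(data, i):
    """Parse the literal string whose '(' is at data[i].

    Returns (text, index just past the closing ')').
    """
    assert data[i] == "("
    i += 1
    depth, out = 1, []
    while i < len(data):
        c = data[i]
        if c == "\\":
            nxt = data[i + 1]
            if nxt in ESCAPES:
                out.append(ESCAPES[nxt])
                i += 2
            elif nxt in "01234567":            # \ddd octal character code
                m = OCTAL.match(data, i + 1)
                out.append(chr(int(m.group(), 8) & 0xFF))
                i = m.end()
            elif nxt in "\r\n":                # backslash + EOL: line continuation
                i += 2
            else:                              # unknown escape: drop the backslash
                out.append(nxt)
                i += 2
        elif c == "(":                         # balanced parens need no escape
            depth += 1
            out.append(c)
            i += 1
        elif c == ")":
            depth -= 1
            if depth == 0:
                return "".join(out), i + 1
            out.append(c)
            i += 1
        else:
            out.append(c)
            i += 1
    raise ValueError("unterminated literal string")


def extract_text(stream, gap=-200.0):
    """Return one string per Tj/TJ/'/" operator, in stream order."""
    runs, operands, i = [], [], 0
    while i < len(stream):
        c = stream[i]
        if c == "(":
            text, i = read_literal_string(stream, i)
            operands.append(text)
        elif c in "[])" or c.isspace():
            i += 1                             # brackets only group operands here
        else:
            m = TOKEN.match(stream, i)
            tok, i = m.group(), m.end()
            if NUMBER.fullmatch(tok):
                operands.append(float(tok))
            elif tok in ("Tj", "'", '"'):
                if operands and isinstance(operands[-1], str):
                    runs.append(operands[-1])
                operands = []
            elif tok == "TJ":
                parts = []
                for op in operands:
                    if isinstance(op, str):
                        parts.append(op)
                    elif op <= gap:            # heuristic: big gap reads as a space
                        parts.append(" ")
                runs.append("".join(parts))
                operands = []
            else:
                operands = []                  # any other operator or /Name
    return runs


# Hypothetical sample, loosely modelled on the stream quoted above.
SAMPLE = r"""
BT
/F4 1 Tf
10 0 0 10 76.5 690.42 Tm
(208) Tj
/F2 1 Tf
(JNI TECHNOLOGY) Tj
[(This)-278(is)] TJ
(\(x\)) Tj
ET
"""

if __name__ == "__main__":
    print(extract_text(SAMPLE))
    # ['208', 'JNI TECHNOLOGY', 'This is', '(x)']
```

Even on this tiny sample the result is only a list of fragments: putting them into reading order would need the Tm/Td positions, and turning the codes into real characters would need each font's encoding.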

In other words, have fun ;).


Leonard
--
----------------------------------------------------------------------------
Leonard Rosenthol
Sr. Software Engineer (215) 922-3509 (voice)
Digital Applications (215) 440-0504 (fax)

PGP Fingerprint: 8CC9 8878 921E C627 0BC1 15BB FC19 64A9 0016 1397


