Planet PDF Forum Archive

Topic: /CharSet - Text Related
Conf: (P-PDF) Developers, Msg: 142918
From: ved_jai
Date: 11/18/2005 10:45 PM

Hi all.

I have a question regarding the "/CharSet" entry in the Font Descriptor of a
Font Dictionary.

I am trying to extract text from a PDF file, and I have listed below the
indirect objects that may be needed to support my question. The PDF
file contains simple text such as "3.32 X Configuration - Monitor and
Customization". In this text run there are 2 characters which I cannot
decode: one is "fi" and the other is "-" (emdash).

The text run in the PDF document is given as

.....
24.0782 0 Td
(X)Tj
9.111 0 Td
(Con\002guration)Tj
65.792 0 Td
(\227)Tj
12.3556 0 Td
(Monitor)Tj
.....

Here we see that two octal escapes are used: one for "fi" (\002) and
the other for emdash (\227).
The /CharSet entry contains both /fi and /emdash, but how should the
mapping be done for these values? I cannot find any relation between
the octal values and the glyph names. Is there any way to do this?
How does the original PDF know what to do with the octal values
and map them correctly to fi and emdash?
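
To make the two values concrete, here is a minimal Python sketch that decodes the escapes in a PDF literal string into raw character codes. The function name decode_literal_string is hypothetical and not taken from any library. It only turns the string operand into bytes; it does not say which glyph each code selects.

```python
# Single-character escapes allowed inside a PDF literal string.
_SIMPLE_ESCAPES = {
    ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09, ord("b"): 0x08,
    ord("f"): 0x0C, ord("("): 0x28, ord(")"): 0x29, ord("\\"): 0x5C,
}


def decode_literal_string(body: bytes) -> bytes:
    """Decode the bytes found between the parentheses of a literal string.

    Hypothetical helper for illustration; assumes `body` excludes the
    outer parentheses.
    """
    out = bytearray()
    i, n = 0, len(body)
    while i < n:
        b = body[i]
        i += 1
        if b != 0x5C:  # ordinary byte, copy it through
            out.append(b)
            continue
        if i >= n:  # lone trailing backslash, nothing to escape
            break
        e = body[i]
        if 0x30 <= e <= 0x37:  # one to three octal digits
            j = i
            while j < n and j < i + 3 and 0x30 <= body[j] <= 0x37:
                j += 1
            out.append(int(body[i:j], 8) & 0xFF)
            i = j
        elif e in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[e])
            i += 1
        elif e in (0x0D, 0x0A):  # backslash + end of line: continuation
            i += 1
            if e == 0x0D and i < n and body[i] == 0x0A:
                i += 1
        else:  # unknown escape: the backslash is dropped
            out.append(e)
            i += 1
    return bytes(out)


print(list(decode_literal_string(rb"Con\002guration"))[:4])  # [67, 111, 110, 2]
print(list(decode_literal_string(rb"\227")))                 # [151]
```

So the "fi" position carries character code 2 and the emdash position carries character code 151 (octal 227).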

Thanks and Regards
Jai Praful Ved

Indirect objects needed to understand the situation:

12 0 obj
<<
/Subtype /Type1
/BaseFont /UGAQOP+Helvetica-Bold~43
/Type /Font
/Widths [ 333...... 556 ]
/FontDescriptor 14 0 R
/FirstChar 1
/LastChar 255
>>
endobj

14 0 obj
<<
/Type /FontDescriptor
/FontName /UGAQOP+Helvetica-Bold~43
/FontBBox [ -173 -307 1176 949 ]
/Flags 4
/Ascent 949
/CapHeight 949
/Descent -307
/ItalicAngle 0
/StemV 176
/MissingWidth 1000
/CharSet (/fi/quoteright/hyphen/period/zero/one/two/three/four/five/six/seven/eigh\
t/nine/question/A/B/C/D/E/F/G/H/I/K/L/M/N/O/P/Q/R/S/T/U/V/W/X/Y/Z/a/b/c/\
d/e/f/g/h/i/k/l/m/n/o/p/r/s/t/u/v/w/x/y/z/emdash)
/FontFile3 11 0 R
>>
endobj
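
For reference, the /CharSet value above can be split into its glyph names with a short Python sketch. The function name charset_glyph_names is hypothetical. It joins the backslash line continuations and also drops any stray line breaks left by wrapping. The result is only a list of names: the string itself holds no character codes, so it cannot show which code goes with /fi or /emdash.

```python
import re
from typing import List

# The /CharSet string from object 14, with its backslash continuations.
CHARSET = r"""/fi/quoteright/hyphen/period/zero/one/two/three/four/five/six/seven/eigh\
t/nine/question/A/B/C/D/E/F/G/H/I/K/L/M/N/O/P/Q/R/S/T/U/V/W/X/Y/Z/a/b/c/\
d/e/f/g/h/i/k/l/m/n/o/p/r/s/t/u/v/w/x/y/z/emdash"""


def charset_glyph_names(charset: str) -> List[str]:
    """Return the glyph names listed in a /CharSet string (hypothetical helper)."""
    # A backslash followed by an end of line continues the string.
    joined = re.sub(r"\\\r?\n", "", charset)
    # Assumption: any remaining whitespace is a wrapping artifact.
    joined = re.sub(r"\s+", "", joined)
    return [name for name in joined.split("/") if name]


names = charset_glyph_names(CHARSET)
print(names[0], names[-1])   # fi emdash
print("eight" in names)      # True
```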

