News

PDFlib TET 4 product family released

August 05, 2010


PDFlib has announced the release of the new version of its PDF content extraction engine. The latest edition improves page content analysis, supports right-to-left languages like Arabic and Hebrew, and offers advanced Unicode post-processing controls.

The updates to the engine have been implemented in the PDFlib TET (Text Extraction Toolkit) family of products: PDFlib TET 4, PDFlib TET PDF IFilter 4, and TET Plugin 4. The results of PDF text extraction have been enhanced with improved shadow removal, word boundary detection and de-hyphenation, along with superscript and subscript detection. More workarounds for non-conforming PDF documents improve the robustness of text extraction; the enhanced repair mode can successfully extract text from damaged PDFs.

TET 4 rearranges bidirectional text in Arabic or Hebrew documents into the proper logical order. Unicode post-processing controls offer folding, decomposition and normalization according to the Unicode standard, which is useful for adapting the extracted text to the requirements of the application.
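To give a rough idea of what folding, decomposition and normalization mean in practice, here is a minimal sketch using Python's standard unicodedata module. It is not TET's API, and TET's own option names are not shown in this announcement; the sample string and variable names are made up for illustration only.

```python
import unicodedata

# Hypothetical extracted text: starts with the "fi" ligature (U+FB01)
# and contains a precomposed "e with acute" (U+00E9).
raw = "\ufb01nance Caf\u00e9"

# Decomposition (NFD): U+00E9 is split into "e" plus the combining
# acute accent U+0301. The ligature is left alone by canonical forms.
decomposed = unicodedata.normalize("NFD", raw)

# Compatibility normalization (NFKC): the ligature becomes plain "fi",
# and the accented letter stays in its composed form.
normalized = unicodedata.normalize("NFKC", raw)

# Case folding: a caseless form suited to search and comparison.
folded = normalized.casefold()

print(len(raw), len(decomposed))  # 11 12
print(folded == "finance caf\u00e9")  # True
```

Which of these forms an application wants depends on its use: a search index usually benefits from compatibility normalization plus case folding, while an archival workflow may prefer to keep the text closer to what the PDF contains.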

TET is also available as a free plugin for Adobe Acrobat. The plugin supports Unicode syntax for search text and can highlight search hits on a page. PDFlib also offers the TET Cookbook, a collection of programming examples that demonstrate the use of TET for text and image extraction tasks.

For more information about the product, check out the official vendor website.
