News

PDFlib releases new text extraction package

July 06, 2005


The latest addition to Munich-based PDFlib's suite of developer products is the PDFlib Text Extraction Toolkit, also known as PDFlib TET. The software package extracts text from PDF documents and converts it to Unicode strings while preserving font and glyph information. The toolkit is currently available as a library, a component and a command-line tool.

Suggested uses include the development of software for searching text, implementing a search engine to process large PDF archives, extracting text for storage or translation, converting PDF text into other formats, content-based processing of PDF documents (e.g. highlighting keywords) and comparing text between multiple PDF documents.

TET has been designed for standalone use and does not require any third-party software to run. Additionally, the product is robust enough for multi-threaded server use, significantly increasing its capacity. Versions for Windows, Macintosh and several UNIX platforms are available, along with language bindings for use with various programming environments.

PDFlib TET is currently available for purchase and download.

