Planet PDF Forum Archive



Topic: Re: Optimizing OCR Process - GUIDANCE PLEASE!!! (Via Email)
Conf: (P-PDF) Paper to PDF, Msg: 23871
From: dhgpj
Date: 7/12/2001 04:56 PM

>In the process of creating an imaging solution for my Center, I have hit a HUGE snag...

...and a very common one, BTW.

>Approx. 20% of the documents I will be archiving are 'onion skins' which I intend to photocopy.

Why photocopy? Why add yet another generation? Scan the onionskin!

>The problem that I'm running into is that many of the skins are 7 or 8 generation, causing the
>letters to be fuzzy. This fuzz is failing to be recognized in the OCR process!!!

No surprise there, believe me! 7 paper generations can kill OCR for almost any document.

>Using Acrobat Capture 3.01, I've tried exporting to PDF as Image & Text, Searchable Image
>Exact, as well as Searchable Image Compact. The best result, with two paragraphs identified as
>text, was with the Image & Text option; but this is by no means acceptable.

What is acceptable? Unrealistic expectations are by far the largest source of "problems" with OCR applications.

>What can be done on both software & hardware ends to optimize the OCR process??

Your documents may simply be too poor in quality to get any meaningful OCR results no matter what you try. There are other OCR products that are somewhat more document-quality tolerant than Capture. (I am not in the habit of making specific recommendations - every case is different.) With the documents you describe, however, your uncorrected accuracy is going to be lousy no matter what.

Duff Johnson
Document Solutions, Inc.
www.document-solutions.com
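
One way to put a number on "acceptable" and on "uncorrected accuracy" is to hand-key a sample page and compare it with the raw OCR text. The Python below is only a rough sketch under that assumption: the function name `char_accuracy` is hypothetical, it is not part of Capture or any other OCR product, and it uses the standard-library `difflib` module as a simple stand-in for a proper OCR evaluation tool.

```python
from difflib import SequenceMatcher


def char_accuracy(ground_truth: str, ocr_text: str) -> float:
    """Fraction of ground-truth characters that the OCR text matched, in order.

    Hypothetical helper for illustration only. It sums the sizes of the
    matching blocks that difflib finds between the two strings, which is a
    rough approximation of character-level accuracy.
    """
    if not ground_truth:
        # Nothing to recognize: perfect only if the OCR also produced nothing.
        return 1.0 if not ocr_text else 0.0
    # autojunk=False keeps difflib from ignoring frequent characters
    # (such as spaces) on longer pages.
    matcher = SequenceMatcher(None, ground_truth, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ground_truth)


if __name__ == "__main__":
    typed = "abcd"   # what a person keyed in from the paper original
    ocr = "abxd"     # what the OCR engine produced for the same text
    print(f"uncorrected character accuracy: {char_accuracy(typed, ocr):.0%}")
```

Running the same comparison on a direct scan of an onionskin and on a photocopy of it would show how much each extra generation costs before committing to a workflow.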

