Topic: Re: Optimizing OCR Process - GUIDANCE PLEASE!!! (Via Email)
Conf: (P-PDF) Paper to PDF, Msg: 23871
From: dhgpj
Date: 7/12/2001 04:56 PM
>In the process of creating an imaging solution for my Center, I have hit a HUGE snag...
....and a very common one, BTW.
>Approx. 20% of the documents I will be archiving are 'onion skins' which I intend to photocopy.
Why photocopy? Why add yet another generation? Scan the onionskin!
>The problem that I'm running into is that many of the skins are 7 or 8 generation, causing the
>letters to be fuzzy. This fuzz is failing to be recognized in the OCR process!!!
No surprise there, believe me! 7 paper generations can kill OCR for almost any document.
>Using Acrobat Capture 3.01, I've tried exporting to PDF as Image & Text, Searchable Image
>Exact, as well as Searchable Image Compact. The best result, with two paragraphs identified as
>text, was with the Image & Text option; but this is by no means acceptable.
What is acceptable? Unrealistic expectations are by far the largest source of "problems" with OCR applications.
>What can be done on both software & hardware ends to optimize the OCR process??
Your documents may simply be too poor in quality to yield any meaningful OCR results no matter what you try. There are other OCR products that are somewhat more tolerant of poor document quality than Capture. (I am not in the habit of making specific recommendations - every case is different.) With the documents you describe, however, your uncorrected accuracy is going to be lousy no matter what.
Duff Johnson
Document Solutions, Inc.
www.document-solutions.com