Topic: Any open source PDF-to-text API?
Conf: (P-PDF) Developers, Msg: 100155
Date: 11/11/2003 10:54 AM
I am trying to write a small crawler that fetches a PDF document from the internet and extracts its text content for storage in a database.
I can use a command-line tool like "pdftotext" to do this, but I would rather do it in memory through an API than launch a command-line tool every time my crawler finds a PDF document.
Are there any open source APIs that can be used to extract text from a PDF document (other than hacking the source code of a conversion tool to get one)?
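
For reference, the command-line approach mentioned above might look roughly like the sketch below. It is only an illustration under assumptions: Python is used as the crawler language, "pdftotext" is assumed to be on the PATH, and the function names and the `run` parameter are hypothetical. It relies on pdftotext's convention that "-" as the output file means standard output.

```python
import os
import subprocess
import tempfile


def pdftotext_command(pdf_path):
    # "-" as the output file asks pdftotext to write the text to stdout.
    return ["pdftotext", pdf_path, "-"]


def extract_text(pdf_bytes, run=subprocess.run):
    """Extract text from PDF bytes by shelling out to pdftotext.

    `run` is a hypothetical injection point so the external call can be
    swapped out (for example in tests); it defaults to subprocess.run.
    """
    # The fetched bytes are written to a temporary file on disk first,
    # which is the per-document overhead the question wants to avoid.
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(pdf_bytes)
        result = run(pdftotext_command(path), stdout=subprocess.PIPE, check=True)
        # Decoding as UTF-8 is an assumption about the tool's output encoding.
        return result.stdout.decode("utf-8", errors="replace")
    finally:
        os.remove(path)
```

Each call here costs one temporary file and one process launch per document, which is exactly what an in-memory API would eliminate.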