Topic: RE: pdf scraping (Via Email)
Conf: (P-PDF) PDF Accessibility, Msg: 137121
From: LeonardR
Date: 8/2/2005 03:46 AM
At 11:27 PM 8/1/2005, p-pdf-accessibility Listmanager wrote:
>Well, I am working mostly with data tables (e.g. stock prices over a
>given period) and data lists, which I need to put into
>Excel spreadsheets and save to a database where they can be
>re-accessed and updated from time to time.
OK...
>If worst comes to worst, I can write a program that downloads PDFs and
>converts them to text while scraping regular HTML files, but this
>would mean running 3-6 programs simultaneously (I would also have to
>convert the text to XLS files and access and save to the database),
Possibly, though there are also PDF->XLS converters,
which would save some time.
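
For the text-to-spreadsheet-to-database step described in the quoted
message, here is a minimal sketch in Python. It assumes the PDF has
already been converted to plain text by some other tool, and that each
record sits on one line as three whitespace-separated columns (date,
symbol, price). The function names, the column layout, and the "prices"
table are all hypothetical, chosen only for illustration; real PDF
tables will usually need more careful parsing.

```python
import csv
import io
import sqlite3


def parse_table(text):
    """Split already-extracted PDF text into (date, symbol, price) rows.

    Assumes one record per line with whitespace-separated columns; lines
    that do not fit (headers, page numbers) are skipped.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        date, symbol, price = parts
        try:
            rows.append((date, symbol, float(price)))
        except ValueError:
            continue  # e.g. a header line such as "Date Symbol Close"
    return rows


def rows_to_csv(rows):
    """Return the rows as CSV text, which Excel can open directly."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "symbol", "price"])
    writer.writerows(rows)
    return buf.getvalue()


def save_rows(conn, rows):
    """Insert or update rows so a later run can refresh existing prices."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices ("
        "date TEXT, symbol TEXT, price REAL, PRIMARY KEY (date, symbol))"
    )
    conn.executemany("INSERT OR REPLACE INTO prices VALUES (?, ?, ?)", rows)
    conn.commit()


# Made-up sample standing in for text extracted from a PDF.
sample = """Date Symbol Close
2005-08-01 ABC 12.50
2005-08-01 XYZ 7.25
Page 1
"""
rows = parse_table(sample)
conn = sqlite3.connect(":memory:")
save_rows(conn, rows)
```

Keeping parsing, CSV export, and database storage in one script avoids
running several separate programs at once, and the (date, symbol)
primary key lets a later run update a price that is already stored.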
> so obviously it would be much easier if I could find a scraper
> which works on PDFs as well as HTML files.
Well, given that most web search engines support PDF
automatically, it wouldn't surprise me if other related tools did as
well. Try a Google search.
Leonard
---------------------------------------------------------------------------
Leonard Rosenthol
Chief Technical Officer
PDF Sages, Inc. 215-938-7080 (voice)
215-938-0880 (fax)