Topic: Reading PDF files as binary files
Conf: (P-PDF) Developers, Msg: 82476
Date: 3/8/2003 01:18 AM
I am a Java developer and have a customer with an interesting problem. He used a tool to create his PDF files, but some of them came out malformed: a single file contains more than one PDF document. He now wants me to read the malformed files, split them into the separate documents they should have been, and extract the information inside them (a client number, for instance) so that each document can be stored correctly in a database (in a BLOB field, just as they are today). Of course, new records will be created as the PDFs are separated.
I don't know much about PDF file structure. My first idea was to read the malformed PDFs as binary files and, once I understand how PDFs are structured, work out how to separate them and extract the information they contain.
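To make the binary-scan idea concrete, here is a minimal sketch, not a tested solution. It assumes that every embedded document begins with the standard `%PDF-` header and that the tool simply wrote the documents one after another. The class and method names (`PdfSplitter`, `headerOffsets`, `split`) are hypothetical. One known weakness: the byte sequence `%PDF-` could also occur inside stream data, which would produce a false split.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: splits a byte array at every "%PDF-" header.
final class PdfSplitter {
    private static final byte[] HEADER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns the offset of every occurrence of the "%PDF-" marker.
    static List<Integer> headerOffsets(byte[] data) {
        List<Integer> offsets = new ArrayList<>();
        for (int i = 0; i + HEADER.length <= data.length; i++) {
            if (matchesAt(data, i)) {
                offsets.add(i);
            }
        }
        return offsets;
    }

    // True if the header bytes appear in data starting at pos.
    private static boolean matchesAt(byte[] data, int pos) {
        for (int j = 0; j < HEADER.length; j++) {
            if (data[pos + j] != HEADER[j]) {
                return false;
            }
        }
        return true;
    }

    // Cuts the data into slices, each running from one header to the next
    // (or to the end of the data for the last slice). Bytes before the
    // first header, if any, are discarded.
    static List<byte[]> split(byte[] data) {
        List<Integer> starts = headerOffsets(data);
        List<byte[]> parts = new ArrayList<>();
        for (int k = 0; k < starts.size(); k++) {
            int from = starts.get(k);
            int to = (k + 1 < starts.size()) ? starts.get(k + 1) : data.length;
            parts.add(Arrays.copyOfRange(data, from, to));
        }
        return parts;
    }
}
```

The input could be loaded with `java.nio.file.Files.readAllBytes` or read from the BLOB column, and each returned slice stored as its own record. Pulling out a value such as the client number would still require actually parsing the page content, which this sketch does not attempt.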
My question is: is this the best approach? Is there any off-the-shelf tool that could help me? And finally, if this is the only possible approach, where can I find information about the PDF file structure?
Thanks in advance!