Topic: Best way to redact / remove data from a PDF via JS
Conf: (P-PDF) Developers, Msg: 51998
Date: 5/29/2002 04:49 PM
I am trying to develop a tool to remove/redact/overwrite information in a PDF.
What I have is a 2-Dimensional barcode (2DBC) on the corner of a document (actually on the corner of every page of a 100K+ page document). The barcode is made up of 24 rows of 24 0's and 1's. 0 indicates white, 1 indicates fill black. The 2DBC is used to drive intelligent document processing equipment.
What I need to do is either redact the 2DBC (including all underlying data, i.e. the 1's and 0's), or convert the 2DBC from font/text-oriented information to a bitmap and remove the underlying data.
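To make the "convert to a bitmap" option concrete, here is a minimal sketch in plain JavaScript that packs the 24 rows of 0's and 1's into a 1-bit-per-pixel bitmap. The function name barcodeToBitmap and the byte layout (3 bytes per row, most significant bit first, a set bit meaning black) are assumptions made for illustration; an image consumer may expect the opposite polarity. It does not touch the PDF itself.

```javascript
// Hypothetical helper: pack a 24x24 barcode, given as 24 strings of "0"/"1",
// into a 1-bit-per-pixel bitmap (3 bytes per row, most significant bit first).
var SIZE = 24; // rows and columns, per the barcode description

function barcodeToBitmap(rows) {
  if (rows.length !== SIZE) {
    throw new Error("expected " + SIZE + " rows, got " + rows.length);
  }
  var bytesPerRow = SIZE / 8;
  var bitmap = new Uint8Array(SIZE * bytesPerRow);
  rows.forEach(function (row, y) {
    if (!/^[01]{24}$/.test(row)) {
      throw new Error("row " + y + " is not 24 characters of 0/1");
    }
    for (var x = 0; x < SIZE; x++) {
      if (row.charAt(x) === "1") {
        // 1 means fill black: set the matching bit (assumed polarity).
        bitmap[y * bytesPerRow + (x >> 3)] |= 0x80 >> (x & 7);
      }
    }
  });
  return bitmap;
}
```

The result is 72 bytes per barcode, with no searchable digits left in it, which is the property that matters for the indexing problem.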
The 2DBC always resides within a finite rectangle on the page that does not change from page to page or even PDF to PDF.
I have tried placing a box annotation with white background and white stroke (covering the 2DBC) and then flattening the pages, but this process adds 50% to the size of the PDF. When using Audit Space Usage, most of this added space is in "Unknown".
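For reference, a minimal sketch of the annotation-and-flatten approach in Acrobat JavaScript could look like the following. The rectangle coordinates are placeholders, not the real barcode position, and coverBarcode is a hypothetical helper name; addAnnot, flattenPages and numPages are the Doc object members from the Acrobat JavaScript reference. As far as I can tell, this only paints over the barcode: the underlying 1's and 0's are likely to stay in the page content, so it would not by itself cure the indexing problem.

```javascript
// Assumed rectangle [left, bottom, right, top] in points; placeholder values only.
var BARCODE_RECT = [500, 700, 580, 780];
var WHITE = ["G", 1]; // gray-space white, the same value as Acrobat's color.white

// Hypothetical helper: put a white square over the barcode on every page,
// then flatten the annotations into the page content.
function coverBarcode(doc, rect) {
  for (var p = 0; p < doc.numPages; p++) {
    doc.addAnnot({
      page: p,
      type: "Square",
      rect: rect,
      strokeColor: WHITE,
      fillColor: WHITE
    });
  }
  doc.flattenPages(); // no arguments: all pages
  return doc.numPages; // number of pages covered
}
```

Inside Acrobat (console or a batch sequence) the call would be coverBarcode(this, BARCODE_RECT), where this is the open document. Taking the document as a parameter keeps the function easy to try against a stand-in object outside Acrobat.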
I have tried APSaveAs (thanks LeonardR) - this gets me about 3.5% back, helpful, but not quite there.
Why do I need to do this? Part of my business process is to make copies of the PostScript printstream, turn it into a PDF, Catalog an index of the PDF, burn the whole blob on a CD-ROM and send it to my clients. All the extra 1's and 0's in the 2DBC cause an "overflow" type error* in Catalog, which makes the entire index unusable. If I turn off the indexing of numbers, then my clients cannot search on things like SSN (SIN for those in the Great White North) or other numeric fields.
If I wrote a 2DBC stripper to pull it out of the PostScript code, I would have to develop it for each generating application.
Changing the 2DBC font wouldn't work, as the underlying 1's and 0's (giving me the problem) would still be there.
* Adobe development is well aware of this "bug". I have even gotten so far as to find the lead developer of Catalog for Adobe. They confirmed this bug during the Beta (bug #9531), but said that I did not get it to them in time for the 5.0 rollout. (I first reported it to them in Dec '99 - it seems they don't read their Customer Service problem tickets, cases #299-3323 & 299-4450.)
Sorry if I am being long-winded - I've been working on this issue for almost 2 years now - and can almost see a solution.
Thanks in advance for your time.
Douglas T. Hanna