Planet PDF Forum Archive


Topic: Best way to redact / remove data from a PDF via JS
Conf: (P-PDF) Developers, Msg: 51998
From: dthanna
Date: 5/29/2002 04:49 PM

I am trying to develop a tool to remove/redact/overwrite information in a PDF.

What I have is a 2-dimensional barcode (2DBC) on the corner of a document (actually, on the corner of every page of a 100+K-page document). The barcode is made up of 24 rows of 24 0's and 1's: 0 indicates white, 1 indicates black fill. The 2DBC is used to drive intelligent document processing equipment.

What I need to do is either redact the 2DBC, including all underlying data (the 1's and 0's), or convert the 2DBC from font/text-oriented information to a bitmap and remove the underlying data.
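To picture the bitmap option: a 24 x 24 grid is only 576 bits, which packs into 72 bytes. The sketch below is plain JavaScript and assumes the rows are already available as 24 strings of 24 binary digits. `packBarcode` is a hypothetical helper name. Extracting the rows from the PDF and placing the resulting image back on the page are not covered here.

```javascript
// Sketch only: pack a 24 x 24 grid of "0"/"1" characters into bytes.
// Convention follows the post: 1 = black, 0 = white.
// packBarcode is a hypothetical helper, not part of any Acrobat API.
function packBarcode(rows) {
  var size = 24;
  if (rows.length !== size) {
    throw new Error("expected " + size + " rows");
  }
  var bytes = [];
  for (var r = 0; r < size; r++) {
    if (!/^[01]{24}$/.test(rows[r])) {
      throw new Error("row " + r + " is not 24 binary digits");
    }
    // 24 bits per row = exactly 3 bytes, most significant bit first.
    for (var c = 0; c < size; c += 8) {
      bytes.push(parseInt(rows[r].slice(c, c + 8), 2));
    }
  }
  return bytes; // 24 rows * 3 bytes = 72 bytes
}
```

If the packed data were later embedded as a 1-bit image, the bit polarity would need checking against the target format, since not every format treats 1 as black.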

The 2DBC always resides within a finite rectangle on the page that does not change from page to page or even PDF to PDF.

I have tried placing a box annotation with a white background and white stroke over the 2DBC and then flattening the pages, but this process adds 50% to the size of the PDF. According to Audit Space Usage, most of this added space is in "Unknown".
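For readers following along, the annotate-and-flatten attempt described above might look roughly like this in Acrobat JavaScript. It is a sketch, not the exact script used: `coverBarcode` is a hypothetical function name and the `BARCODE_RECT` coordinates are placeholders, since the real rectangle is not given in this post.

```javascript
// Sketch only. `doc` is expected to be an Acrobat Doc object
// (for example `this` inside a batch sequence).
// BARCODE_RECT is a placeholder [xLL, yLL, xUR, yUR]; the real
// coordinates of the 2DBC are not stated in the post.
var BARCODE_RECT = [500, 700, 590, 790];

function coverBarcode(doc, rect) {
  var white = ["G", 1]; // gray-space white, the same value as color.white
  for (var p = 0; p < doc.numPages; p++) {
    doc.addAnnot({
      page: p,
      type: "Square",
      rect: rect,
      fillColor: white,
      strokeColor: white
    });
  }
  // With no arguments, flattenPages flattens every page.
  doc.flattenPages();
  return doc.numPages;
}
```

In Acrobat this would be called as `coverBarcode(this, BARCODE_RECT);`. It is worth verifying whether the covered 1's and 0's are still extractable afterwards, since a white box drawn over text hides it without necessarily removing it.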

I have tried APSaveAs (thanks LeonardR) - this gets me about 3.5% back, helpful, but not quite there.

Why do I need to do this? Part of my business process is to make copies of the PostScript print stream, turn them into a PDF, build a Catalog index of the PDF, burn the whole blob onto a CD-ROM, and send it to my clients. All the extra 1's and 0's in the 2DBC cause an "overflow"-type error* in Catalog, which makes the entire index unusable. If I turn off the indexing of numbers, then my clients cannot search on things like SSN (SIN for those in the Great White North) or other numeric fields.

If I wrote a 2DBC stripper to pull it out of the PostScript code, I would have to develop it for each generating application.

Changing the 2DBC font wouldn't work, as the underlying 1's and 0's (giving me the problem) would still be there.

* Adobe development is well aware of this "bug". I have even gotten so far as to find the lead developer of Catalog at Adobe. They confirmed this bug during the Beta (bug #9531), but said that I did not get it to them in time for the 5.0 rollout. (I first reported it to them in Dec '99; it seems they don't read their Customer Service problem tickets, cases #299-3323 & 299-4450.)

Sorry if I am being long winded - I've been working on this issue for almost 2 years now - and almost see a solution.

Thanks in advance for your time.

Douglas T. Hanna
Hewitt Associates
