Topic: zlib and reading a XREF contained within a stream object
Conf: (P-PDF) Beginners, Msg: 130319
From: dwightkr
Date: 3/31/2005 12:29 PM

I'm attempting to decompress a stream object that contains the PDF file's XREF table; in this case the XREF table is stored in an XREF stream. I have copied and pasted the XREF object, as defined in the PDF, below:

109218 0 obj<<
/Size 109993
/Root 109200 0 R

/Length 138
/Columns 5
/Predictor 12
/W[1 3 1]
/Prev 9126253
/Info 109198 0 R
/Index[109199 794]

When I decompress the stream object with code from the public domain ZLIB (version 1.2.1), the call to uncompress() succeeds with a return code of 0 (Z_OK), but the resulting decompressed data does not make sense. I say this because when I use the /W values of [1 3 1] to decipher the decompressed buffer, the resulting triplets of [type, object number, index/generation] do not make sense. I assume I need to start parsing the decompressed stream at offset 0.
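For reference, here is a minimal Python sketch of how the /W widths [1 3 1] would split a single row into three big-endian integers. It assumes the row has already had any predictor filtering removed. The helper name split_fields and the example row are hypothetical and are not taken from the file in question.

```python
def split_fields(row: bytes, widths=(1, 3, 1)):
    """Split one unfiltered cross-reference row into big-endian integers."""
    fields = []
    pos = 0
    for width in widths:
        # Each field is an unsigned big-endian integer of `width` bytes.
        fields.append(int.from_bytes(row[pos:pos + width], "big"))
        pos += width
    return tuple(fields)


# Hypothetical 5-byte row: type 1, second field 0x003C75, third field 0.
example_row = bytes([0x01, 0x00, 0x3C, 0x75, 0x00])
print(split_fields(example_row))  # (1, 15477, 0)
```

In the PDF Reference, the meaning of the second and third fields depends on the type: for type 1 they are a byte offset and a generation number, and for type 2 they are the object number of an object stream and an index within it.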

The first few bytes of the decompressed stream are:

02 01 00 00 10 00 02 00 00 3c 65 00 02 00 00 01 d9 00 02 00
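One way to inspect these bytes is to group them into rows of /Columns + 1 = 6 bytes. That grouping assumes each row carries one leading PNG filter-type byte, which is how the PDF Reference describes PNG predictors. The sketch below only regroups the bytes quoted above. The names ROW_LEN and rows_with_tag are made up for illustration.

```python
# The 20 bytes quoted in the post.
sample = bytes.fromhex(
    "02 01 00 00 10 00 02 00 00 3c 65 00 02 00 00 01 d9 00 02 00"
)

COLUMNS = 5
ROW_LEN = COLUMNS + 1  # assumption: one filter-type byte precedes each row


def rows_with_tag(data: bytes, row_len: int = ROW_LEN):
    """Return only the complete rows; a trailing partial row is ignored."""
    return [
        data[i:i + row_len]
        for i in range(0, len(data) - row_len + 1, row_len)
    ]


for row in rows_with_tag(sample):
    # First byte is the filter tag, the rest is the (still filtered) row data.
    print(row[0], row[1:].hex())
# 2 0100001000
# 2 00003c6500
# 2 000001d900
```

Under that grouping every complete row begins with 02, which is the PNG filter type for Up, so parsing 5-byte records straight from offset 0 would drift out of alignment after the first row.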

I am able to successfully view the PDF with the Adobe reader, so I am confident it is not corrupt.

The specific PDF file that I am attempting to parse is the Adobe PDF Reference V1.6 document from the Adobe web site.

I suspect that I need to somehow specify the /Columns 5 and /Predictor 12 values during the ZLIB decompression, but I don't see how to pass them to the public domain zlib decompression API uncompress(). According to the Adobe PDF Reference V1.6 specification, a predictor of 12 indicates PNG Up encoding, but I am assuming that this information is encoded into the original compressed stream.
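As far as I know, zlib's uncompress() has no predictor parameters; the predictor is normally undone as a separate pass over the inflated bytes. Below is a minimal Python sketch of that pass for the PNG Up filter only. It assumes each row is one filter-type byte followed by /Columns data bytes, and it raises an error on any other filter type. The function names undo_png_up and decode_xref_stream are hypothetical, and the same loop could be written in C after the uncompress() call.

```python
import zlib


def undo_png_up(data: bytes, columns: int) -> list:
    """Reverse the PNG Up filter row by row (sketch: filter type 2 only)."""
    row_len = columns + 1          # one filter-type byte plus the row data
    prev = bytearray(columns)      # the row "above" the first row is all zeros
    rows = []
    for start in range(0, len(data) - row_len + 1, row_len):
        tag = data[start]
        body = data[start + 1:start + row_len]
        if tag != 2:
            raise ValueError("only PNG Up (filter type 2) is handled here")
        # Up: each byte is stored as the difference from the byte above it,
        # computed independently per byte, modulo 256.
        cur = bytearray((b + p) & 0xFF for b, p in zip(body, prev))
        rows.append(bytes(cur))
        prev = cur
    return rows


def decode_xref_stream(compressed: bytes, columns: int) -> list:
    """Inflate first, then undo the predictor as a second step."""
    return undo_png_up(zlib.decompress(compressed), columns)


# The decompressed bytes quoted in the post (trailing partial row ignored).
sample = bytes.fromhex(
    "02 01 00 00 10 00 02 00 00 3c 65 00 02 00 00 01 d9 00 02 00"
)
print([row.hex() for row in undo_png_up(sample, 5)])
# ['0100001000', '01003c7500', '01003d4e00']
```

Read with /W [1 3 1], those three rows would be (1, 16, 0), (1, 15477, 0) and (1, 15694, 0), that is, type 1 entries with increasing second-field values.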

I have been able to successfully decompress other FlateDecode streams in this file, but this is the only one that specifies /Columns and /Predictor.

Do I need to specify the predictor and number of columns while decompressing a stream, and, if so, does the public version of ZLIB support this?


