Topic: Re: How to config the xpdfrc file in xpdf on windows, I can't extract the chinese content! (Via Email)
Conf: (P-PDF) Developers, Msg: 93057
Date: 7/28/2003 05:44 PM
> From: "HuataoSoft"
> I have downloaded the chinese-simplified folder and placed it in the same directory as
> pdftotext.exe. I have added the contents of add-to-xpdfrc to xpdfrc:
> But when I run pdftotext.exe, only the English text is extracted and the Chinese content is lost!
Did you select a suitable output encoding, i.e., one that can encode
Chinese characters? For example:
pdftotext -enc UTF-8 file.pdf
If you don't do this, pdftotext will (by default) generate Latin-1 (ISO
8859-1), and will drop any characters that don't have a Latin-1
encoding, including all of the Chinese text.
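To illustrate why the Chinese text disappears, here is a small Python sketch. It only mimics the effect of writing to an encoding that lacks the needed characters; it is not how pdftotext is implemented, and the sample string is made up for the example.

```python
# Illustration only: mimics the effect described above, not pdftotext's code.
# The sample string is a made-up example mixing English and Chinese text.
text = "PDF 文档 test"

# Latin-1 (ISO 8859-1) has no code points for Chinese characters.
# With errors="ignore", anything it cannot encode is silently dropped.
latin1_bytes = text.encode("latin-1", errors="ignore")

# UTF-8 can encode every Unicode character, so nothing is lost.
utf8_bytes = text.encode("utf-8")

print(latin1_bytes.decode("latin-1"))  # Chinese characters are missing
print(utf8_bytes.decode("utf-8"))      # full text survives
```

The first line printed contains only the English words, which matches the symptom in the original question.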
You can also set the default encoding in your xpdfrc file, like this: