March 24, 2010

Scan processing with pdfimages and ImageMagick

Just for my own reference, the following procedure works quite well for turning greyish pdf scans into something printable:
pdfimages foo.pdf img
for a in *.ppm; do convert "$a" -level 50%,80% "${a%ppm}png"; done
This uses pdfimages (on OS X you can grab the xpdf package from MacPorts) to extract the images from the pdf file, and then ImageMagick convert (probably MacPorts again) to set the black and white levels to 50 and 80 percent respectively, so everything darker than 50% becomes black and everything lighter than 80% becomes white. Without pdfimages, ImageMagick just calls ghostscript to render the pdf at a fixed resolution, which is of course useless for this purpose. Next time I might have a look at the -contrast-stretch and -auto-level options, which might be a bit more robust.
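
For next time, here is a rough sketch of what the -contrast-stretch variant could look like. It is untested on real scans. The 2%x20% clip values are guesses that would need tuning per document, and the helper names png_name and stretch_all are made up for this note. It also assumes pdfimages may write .pgm or .pbm files in addition to .ppm, depending on the scan.

```shell
#!/bin/sh
# Sketch only: let ImageMagick pick the levels from each page's histogram
# instead of hard-coding -level 50%,80%.

# Map an extracted image name to its PNG output name (img-000.ppm -> img-000.png).
png_name() {
    printf '%s\n' "${1%.*}.png"
}

# Convert every image extracted by pdfimages in the current directory.
stretch_all() {
    for f in *.ppm *.pgm *.pbm; do
        [ -e "$f" ] || continue   # skip glob patterns that matched nothing
        # Clip at most the darkest 2% of pixels to black and at most the
        # brightest 20% to white (assumed starting values, not tuned).
        convert "$f" -contrast-stretch 2%x20% "$(png_name "$f")"
    done
}
```

Run pdfimages foo.pdf img first, then call stretch_all in the same directory. Swapping the -contrast-stretch option for -auto-level would stretch the darkest and brightest pixel values to the full range instead, which probably helps less when the page already contains a few truly white pixels.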
