I am a member of the Society for Technical Communication's Management Special Interest Group. This group has an email forum where members often circulate questions on various topics. One that came up today asked about the best way to process scanned PDF files.
SimulTrans occasionally receives scanned PDFs from clients who have lost their initial source files or who have inherited materials through an acquisition or distributor relationship. While it is never optimal to embark upon a translation project without source files in their native format, occasionally these scanned PDFs must be used.
Responding to the person who asked about the best way to prepare scanned PDF files for translation, I wrote the following reply, which I thought others might find helpful:
At SimulTrans, we often need to OCR scanned PDFs for translation.
While we have used Adobe Acrobat Professional's OCR capabilities, we have found the ABBYY FineReader application to be even more effective for this purpose. We like the ABBYY product because it allows us to define zones in each PDF for conversion (separating text blocks from graphics and defining how columns should be read), offers recognition in many languages, and provides cleaner output in a variety of formats. The OCR results are not perfect, but they are better than those we have seen from other tools.
It looks like ABBYY offers a free trial (http://finereader.abbyy.com/), so your team could run some tests to see whether the product is a good fit for its needs.