SimulTrans

Optical Character Recognition of Scanned PDFs

I am a member of the Management Special Interest Group of the Society for Technical Communication. The group runs an email forum where members regularly post questions. One that came up today asked about the best way to process scanned PDF files.

SimulTrans occasionally receives scanned PDFs from clients who have lost their original source files or who have inherited materials through an acquisition or distributor relationship. It is never optimal to start a translation project without source files in their native format, but sometimes these scanned PDFs must be used.

In response to the person who asked about the best way to prepare scanned PDF files for translation, I wrote the following reply, which I thought more people might find helpful:

At SimulTrans, we often need to OCR scanned PDFs for translation.

While we have used Adobe Acrobat Professional's OCR capabilities, we have found the ABBYY FineReader application to be even more effective for this purpose. We like the ABBYY product because it allows us to define zones in each PDF for conversion (separating text blocks from graphics and defining how columns should be read), offers recognition in many languages, and provides cleaner output in a variety of formats. The OCR results are not perfect, but they are better than what we have seen with other tools.

It looks like ABBYY offers a free trial (http://finereader.abbyy.com/), so your team could run some tests to see whether the product is a good fit for your needs.

 



Management Blog Overview

The Management Blog is composed by SimulTrans senior managers who write about interesting questions from clients, topics of conversation during internal meetings, or other insights about localization that we find fascinating.  

Adam Jones, SimulTrans' COO, is the primary author, joined by colleagues who contribute their insights from time to time.

