SimulTrans Localization Blog: SimulTips

How raw is raw? The need for post-editing

January 25, 2016 / by Margarita Núñez

Machine translation (MT) has certainly made translation projects easier. The technology gives you quick, good-value translations of the documents or files in your projects. Yet the question of whether to post-edit documents after they have been MT'ed often comes up. Let's explore the need for post-editing the raw output from machine translation.

First things first

What really is post-editing? Post-editing is the correction of machine-translated text to ensure that it meets the level of publishable quality agreed with the client. Post-editing is not the same as editing: editing is the process of correcting human-generated text.

In localization we also talk about ‘light post-editing’, which means using human linguists to lightly correct the MT'ed output, simply to make it understandable. A ‘full post-edit’, on the other hand, means that linguists go the extra mile to make the text not only understandable to the end user but also appropriate in style and for the industry.

 

Raw Output

The translated text that comes out of a customized MT system, namely the raw output, typically combines two things: segments from an existing translation memory (TM) that have been leveraged against the source (if you are lucky enough to have a previous TM for that language combination) and the raw translated text that comes from the MT engine itself. However, if the engine is not customized, for instance a non-domain-specific MT engine (like commercial engines), then the raw output might come solely from an engine trained on large amounts of generic bilingual data.

The resulting MT'ed content is a mix of both: very literal translations from an MT system alongside correct translations from a TM. This means that local colloquialisms, common idioms and some domain-specific terminology can get lost! The raw output therefore needs post-editing by native post-editors to reach publishable quality.
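
To make the mix of TM and MT segments concrete, here is a minimal Python sketch. It is an illustration only, not how any particular MT system works: the function raw_output, the tiny translation memory and the stand-in engine fake_engine are all hypothetical, and the lookup uses exact matches where real TM tools also apply fuzzy matching.

```python
def raw_output(segments, translation_memory, machine_translate):
    """Translate each source segment, preferring an exact TM match.

    Returns (source, target, origin) tuples where origin is "TM" or "MT".
    """
    results = []
    for source in segments:
        if source in translation_memory:
            # Leveraged from the existing translation memory.
            results.append((source, translation_memory[source], "TM"))
        else:
            # No TM match, so fall back to the MT engine.
            results.append((source, machine_translate(source), "MT"))
    return results


# Hypothetical data: a one-entry English-to-Spanish TM and a stand-in engine.
tm = {"Save your changes.": "Guarde los cambios."}


def fake_engine(text):
    # Placeholder for a real MT engine call (assumption for illustration).
    return f"[MT] {text}"


output = raw_output(["Save your changes.", "You betcha!"], tm, fake_engine)
for source, target, origin in output:
    print(origin, "|", source, "->", target)

# Segments that came from the engine are the ones most in need of post-editing.
needs_review = [source for source, _, origin in output if origin == "MT"]
```

Tagging each segment with its origin is one simple way to show post-editors which parts of the raw output came straight from the engine.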

 

You betcha

Imagine you need to convey expressions such as ‘you betcha’ or ‘how's it going?’ as part of your project. These expressions will most likely not be translated correctly, or if they are, the words around them will be in the wrong place within the segment. This is because the MT engine may not interpret the meaning behind the expression, and instead gives a direct, very raw translation that misses the sentiment; the subtext, if you will, of the source segment.

This is where the need for post-editing by human linguists arises. No amount of rules and DNT (do not translate) glossaries will fix certain issues. Human post-editors can correct any mistranslations and the overall message, ensuring that the subtext is accurate and fits the local lexicon and domain you are translating into.

This step often requires post-editors who have the language skills, the subject matter expertise and the product knowledge, so that the nuances and colloquial terms you use in the source language don't get lost in translation.

 

Seeing only BLEU

BLEU (Bilingual Evaluation Understudy) measures how many words and word sequences (n-grams) in a given translation overlap with a reference translation, so matching runs of consecutive words raise the score. Although an MT engine's BLEU score is a good indicator of the engine's performance, it comes with certain limitations. For example, when the word order of the MT output differs substantially from the reference translation, say because the reference was reworded for stylistic purposes, the BLEU score is heavily penalised, even though the MT output might be of satisfactory quality. Basing the decision on whether to post-edit a project solely on the MT engine's BLEU score is therefore not always advisable.
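
The word-order penalty is easy to see in a small experiment. The Python sketch below is a deliberately simplified, sentence-level version of BLEU: it assumes a single reference, whitespace tokenization, n-grams up to length 2 and no smoothing, whereas BLEU as normally reported uses up to 4-grams over a whole corpus. The function names and example sentences are made up for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified sentence-level BLEU against a single reference.

    Combines clipped n-gram precision for n = 1..max_n in a geometric
    mean and applies a brevity penalty. There is no smoothing, so a
    zero precision at any n gives a score of 0.0.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0
        log_precisions.append(math.log(matched / total))

    # Penalise candidates that are shorter than the reference.
    if len(cand) >= len(ref):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


reference = "the report is ready for review today"
same_order = "the report is ready for review today"
reordered = "today the report for review is ready"

print(simple_bleu(same_order, reference))  # identical text scores 1.0
print(simple_bleu(reordered, reference))   # same words, reordered: about 0.71
```

Both candidates contain exactly the same words as the reference, yet the reordered one loses half of its matching word pairs and its score drops accordingly. With longer n-grams and no smoothing, the same sentence would score zero, which is one reason a single BLEU figure should be read with care.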

 

Two for the price of one

Post-editing can also be useful for training future MT projects. For instance, we can instruct post-editors to give structured feedback on common MT errors while post-editing a project, so that the MT system can be improved over time.

As obvious as it sounds, a post-editing step after MT should always be considered, especially if you want content of human-translated quality at the end of the process.

Discover more about MT by reading on here.

 

 

Topics: Documentation Translation, Localization Technology

Written by Margarita Núñez

Margarita serves as SimulTrans’ Director responsible for European sales, overseeing a team of account managers who build and maintain relationships with customers. She travels frequently throughout Europe, advising clients on best practices in the industry and helping them successfully localize their products for a global market. Margarita has spent over 20 years in the localization industry.