February 2, 2020 / by the SimulTrans Team
Ingredients for a Successful Machine Translation Project
The most widely used types of Machine Translation (MT) are statistical MT and rule-based MT. The former relies on probabilistic translation models, while the latter relies on grammatical rules and dictionaries. Both types are now available as commercial products and can be integrated into CAT tools, saving time and money in your translation projects.
What to MT or, In Other Words, What to Cook?
Before engaging in an MT project, first check whether your textual domain is suitable for MT. For the most part, texts with technical, laconic language and user-generated content yield the best results.
The second thing to investigate is the return on investment (ROI) you can expect from MT. The answer depends on the MT training resources you have at your disposal and, of course, on the prospective volume of translations. If your domain is suitable and the expected ROI is positive, here are some useful tips on how to build a good MT engine.
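As a rough illustration of the ROI question, the sketch below estimates the volume at which MT plus post-editing becomes cheaper than human translation. The function names, the cost model (a one-off setup cost plus per-word rates), and all figures are assumptions made for this example, not numbers from a real project.

```python
import math


def break_even_words(setup_cost, human_rate_per_word, post_edit_rate_per_word):
    """Return the word count at which MT plus post-editing starts to pay off.

    Assumed cost model (hypothetical): a one-off setup cost for building the
    engine, plus a per-word post-editing rate that is compared with the
    per-word rate for human translation. Returns None if post-editing is not
    cheaper per word, because the setup cost can then never be recovered.
    """
    saving_per_word = human_rate_per_word - post_edit_rate_per_word
    if saving_per_word <= 0:
        return None
    return math.ceil(setup_cost / saving_per_word)


def projected_savings(volume_words, setup_cost, human_rate_per_word,
                      post_edit_rate_per_word):
    """Return the estimated saving (negative means a loss) for a given volume."""
    human_cost = volume_words * human_rate_per_word
    mt_cost = setup_cost + volume_words * post_edit_rate_per_word
    return human_cost - mt_cost


if __name__ == "__main__":
    # Illustrative figures only.
    print(break_even_words(5000, 0.25, 0.125))
    print(projected_savings(80000, 5000, 0.25, 0.125))
```

If the prospective volume is well below the break-even figure, the engine is unlikely to pay for itself under these assumptions.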
Essential Ingredients
The basic ingredients for creating a customized MT engine are:
1. An MT system, either as a cloud-based platform (like KantanMT or MSTH) or as stand-alone software (like the command-line tool Moses)
2. Bilingual resources (translation memories (TMs), dictionaries, parallel corpora, term bases)
3. A lot of patience!
Tips for Your Perfect Recipe
Aside from these necessary ingredients, the following tips can improve your MT engine's performance:
- Use monolingual resources (monolingual corpora, Do-not-Translate (DNT) lists)
- Gather as much data as possible. As a rule of thumb, keep in mind that the more data you feed into your engine, the better it will perform.
- Not just any data will do, though. The data you feed into your engine needs to be representative of the domain in question.
- Keep your data clear. Avoid colloquialisms, idioms, and slang.
- If possible, pre-edit your data to maximize results.
- Post-edit your MT output and re-feed it into the engine.
- If your domain allows, consider using a controlled language.
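Some of the data tips above can be partly automated before training. The following is a minimal sketch, assuming the bilingual data is available as a list of (source, target) segment pairs. The function name, the length-ratio threshold, and the `flagged_terms` list of slang words are hypothetical choices for illustration; they are not part of any particular MT platform.

```python
import re


def clean_parallel_corpus(pairs, max_length_ratio=3.0, flagged_terms=()):
    """Filter (source, target) segment pairs before feeding them to an engine.

    Drops pairs that are empty, badly mismatched in length, exact duplicates
    (ignoring case), or that contain a flagged term in the source segment.
    """
    flagged = {term.lower() for term in flagged_terms}
    seen = set()
    cleaned = []
    for source, target in pairs:
        # Normalize whitespace so trivially different copies compare equal.
        source = " ".join(source.split())
        target = " ".join(target.split())
        if not source or not target:
            continue
        # A very uneven word count often signals a misaligned pair.
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            continue
        # Skip segments containing slang or colloquial terms from the list.
        if flagged & set(re.findall(r"\w+", source.lower())):
            continue
        key = (source.lower(), target.lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append((source, target))
    return cleaned


if __name__ == "__main__":
    sample = [
        ("Press the  button.", "Appuyez sur le bouton."),
        ("Press the button.", "Appuyez sur le bouton."),
        ("You gotta restart it.", "Il faut le relancer."),
    ]
    for pair in clean_parallel_corpus(sample, flagged_terms={"gotta"}):
        print(pair)
```

A filter like this only catches mechanical problems. Whether the remaining data is representative of your domain still needs human judgment.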
How to Serve?
Last but not least, always keep in mind that some post-editing might be required. How much depends on the final aim of the translated text. Light post-editing might suffice for gisting purposes, while full post-editing would be required to attain publishable quality.
And for Dessert...
To sum up, many variables need to be taken into account when planning the deployment of your customized engine. The best way to know whether machine translation is working for you is to check regularly that the MT workflow is not taking more time than translating from scratch would.
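That time check can be made concrete with a few lines of code. The sketch below is only an illustration: the function name is hypothetical, and the from-scratch figure is an assumed translator throughput in words per hour that you would replace with your own measurements.

```python
def mt_saves_time(words, mt_hours, scratch_words_per_hour):
    """Compare time spent on the MT workflow with translating from scratch.

    words: number of words in the job.
    mt_hours: total hours spent on the MT workflow for this job, including
        engine preparation and post-editing (how you count them is up to you).
    scratch_words_per_hour: assumed throughput when translating from scratch.

    Returns (saves_time, hours_saved); hours_saved is negative when the MT
    workflow took longer than the from-scratch estimate.
    """
    scratch_hours = words / scratch_words_per_hour
    hours_saved = scratch_hours - mt_hours
    return hours_saved > 0, hours_saved


if __name__ == "__main__":
    # Illustrative figures only: 3 hours of preparation, 12 of post-editing.
    print(mt_saves_time(10000, 3 + 12, 500))
```

Running this check per project, rather than once, shows whether the engine keeps paying off as your content changes.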
Written by the SimulTrans Team
The SimulTrans team has been providing localization solutions for international businesses since 1984. Our team is a diverse, engaged, multinational group of industry-expert translators, reviewers, project managers, and localization engineers. Each team member is devoted to collaborating, locally and globally, to maintain and expand SimulTrans’ leadership in the language services sector.