3 Strategies to Translate DITA Content Written in the Heretto CCMS

Written by the SimulTrans Team | August 19, 2021

 

Companies are increasingly adopting the Heretto Component Content Management System (CCMS). Based on DITA, this solution allows technical documentation authors to write content once and reuse it across formats and purposes.

What is DITA?

The Darwin Information Typing Architecture (DITA) specification is an open standard defined and maintained by the Organization for the Advancement of Structured Information Standards (OASIS). DITA is based on XML: content is structured in discrete topics, which are assembled into deliverables through a map.

For example, one of SimulTrans’ customers makes cellular infrastructure products, such as the hardware for antenna towers and the software to manage them. They can write one topic about supported frequencies and include it in a planning guide, a hardware installation manual, and a software user guide. The same content can be searchable through an online knowledge base, published in PDF documents, and repurposed for interactive training.
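To make the reuse idea concrete, here is a minimal Python sketch that reads three small DITA maps and reports which topics are referenced by more than one of them. The map names, the file names, and the helper functions are hypothetical and exist only for illustration; real maps are usually nested and carry more attributes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def topic_hrefs(map_xml: str) -> list[str]:
    """Return the href of every topicref in a DITA map, in document order."""
    root = ET.fromstring(map_xml)
    return [ref.get("href") for ref in root.iter("topicref") if ref.get("href")]

def shared_topics(maps: dict[str, str]) -> dict[str, list[str]]:
    """Map each topic file to the deliverables that reuse it (two or more)."""
    users = defaultdict(list)
    for name, map_xml in maps.items():
        for href in topic_hrefs(map_xml):
            users[href].append(name)
    return {href: names for href, names in users.items() if len(names) > 1}

# Hypothetical maps for the three deliverables described above.
maps = {
    "planning-guide": '<map><topicref href="frequencies.dita"/>'
                      '<topicref href="site-planning.dita"/></map>',
    "installation-manual": '<map><topicref href="frequencies.dita"/>'
                           '<topicref href="mounting.dita"/></map>',
    "user-guide": '<map><topicref href="frequencies.dita"/></map>',
}

reused = shared_topics(maps)
print(reused)
```

In this sample, only the frequencies topic is shared, so it would need to be translated once for all three deliverables.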

DITA content is usually structured into different topic types, including three that are most common:

  • Concept topics provide background and context through conceptual descriptions
  • Task topics outline procedures to follow
  • Reference topics include detailed specifications and facts, usually in lists and tables
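Because each topic type uses its own root element, the type of a file can be read directly from the XML. The following is a small sketch under that assumption; the sample task and the `topic_type` helper are made up for illustration, and specialized topic types are simply reported as unknown.

```python
import xml.etree.ElementTree as ET

# Root element names of the common DITA topic types.
KNOWN_TYPES = {"concept", "task", "reference", "topic"}

def topic_type(xml_text: str) -> str:
    """Return the topic type based on the root element name."""
    root = ET.fromstring(xml_text)
    return root.tag if root.tag in KNOWN_TYPES else "unknown"

# Hypothetical task topic with a single step.
sample_task = """<task id="mount-antenna">
  <title>Mount the antenna</title>
  <taskbody>
    <steps>
      <step><cmd>Attach the bracket to the tower.</cmd></step>
    </steps>
  </taskbody>
</task>"""

print(topic_type(sample_task))
```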

How does DITA facilitate translation?

Documentation developed in DITA costs less to translate for two reasons:

  1. Content can be more easily reused, allowing translations to be better leveraged through the use of translation memory. Since identical source text is repurposed in multiple places, it need only be translated once. Traditional, non-DITA authoring methods make it easier for authors to write similar but not identical passages, which match translation memory only partially and drive up translation costs.
  2. Documentation does not require manual formatting. Unlike files published in InDesign, FrameMaker, Word, or other page layout applications, text need not be formatted in each target language. You can output the tagged DITA content using templates into whatever online or print formats you require. Template settings can take care of selecting language-appropriate fonts and accommodating text expansion.
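The effect of reuse on translation volume can be illustrated with a rough count. The sketch below compares the total word count across deliverables with the word count of unique segments only. The sample sentences and the `leverage_report` helper are hypothetical, and real translation memory tools also handle partial (fuzzy) matches, which this ignores.

```python
def count_words(segment: str) -> int:
    """Count whitespace-separated words in one segment."""
    return len(segment.split())

def leverage_report(segments: list[str]) -> dict[str, int]:
    """Compare the total word count with the words in unique segments only."""
    total = sum(count_words(s) for s in segments)
    unique = sum(count_words(s) for s in set(segments))
    return {"total_words": total, "unique_words": unique, "repeated_words": total - unique}

# Hypothetical sample text: one topic sentence reused in three deliverables.
topic = "The antenna supports the 700 MHz and 850 MHz bands."
planning_guide = [topic, "Plan the site layout before installation."]
installation_manual = [topic, "Mount the antenna on the tower."]
user_guide = [topic]

report = leverage_report(planning_guide + installation_manual + user_guide)
print(report)
```

Here 20 of the 42 words are exact repetitions, so only 22 words would need new translation.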

Heretto CCMS

SimulTrans works with several DITA authoring tools, including hybrid applications like FrameMaker. Most have similar output formats for localization, usually involving translating the XML-based DITA source directly or exporting it to XLIFF for translation.

Heretto offers three primary strategies for content translation. All rely on exporting content and importing translations using the convenient Heretto Localization Manager feature.

Source Package

The source package includes the DITA source files in XML format. SimulTrans finds this approach is usually the most effective because it provides all the content in a structure that offers more insight into the maps used to output the text.

SimulTrans uses custom translation management system filters to correctly parse the XML files and create target-language versions. This approach allows us to protect the XML tags and edit only the content. Heretto’s output largely complies with the DITA 1.3 structure, with the addition of a few custom tags.
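The idea of protecting tags while editing only the content can be sketched in a few lines of Python. This is not SimulTrans' filter, just a simplified illustration: it rewrites text nodes and leaves every element and attribute untouched. The `translate_text_nodes` name and the uppercase stand-in for translation are assumptions, and a production filter would treat a sentence with inline tags as one segment rather than as separate text nodes.

```python
import xml.etree.ElementTree as ET

def translate_text_nodes(xml_text: str, translate) -> str:
    """Apply `translate` to every non-empty text node; leave tags and attributes as-is."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        if elem.text and elem.text.strip():
            elem.text = translate(elem.text)
        if elem.tail and elem.tail.strip():
            elem.tail = translate(elem.tail)
    return ET.tostring(root, encoding="unicode")

# Hypothetical concept topic; str.upper stands in for a real translation step.
source = (
    '<concept id="freq"><title>Supported frequencies</title>'
    "<conbody><p>See the <b>planning guide</b> first.</p></conbody></concept>"
)
result = translate_text_nodes(source, str.upper)
print(result)
```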

After translation, it is essential to confirm the integrity of the XML files by checking them for consistency with the source and with the DITA schema. These checks help ensure that the import of the target-language files goes smoothly. This verification is best done outside of Heretto, as import errors can be hard to interpret in the Localization Manager.
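A basic version of the source consistency check can be scripted. The sketch below confirms that a translated file is well-formed and that its element and attribute structure matches the source, ignoring `xml:lang`, which legitimately changes. The function names and sample files are hypothetical. It does not validate against the DITA schema, which requires a validating parser and the DITA DTDs or RELAX NG grammars, and it is deliberately strict: it would also flag inline elements that a translator reordered on purpose.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the xml: prefix to this namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def structure(xml_text: str) -> list[tuple]:
    """List (tag, attributes) for every element in document order, minus xml:lang."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, tuple(sorted((k, v) for k, v in elem.attrib.items() if k != XML_LANG)))
        for elem in root.iter()
    ]

def check_target(source_xml: str, target_xml: str) -> list[str]:
    """Return a list of problems found in the target file; empty means it matches."""
    source = structure(source_xml)
    try:
        target = structure(target_xml)
    except ET.ParseError as err:
        return [f"target is not well-formed: {err}"]
    problems = []
    if len(source) != len(target):
        problems.append(f"element count differs: source {len(source)}, target {len(target)}")
    for index, (src, tgt) in enumerate(zip(source, target)):
        if src != tgt:
            problems.append(f"element {index}: source {src}, target {tgt}")
            break  # report only the first mismatch to keep the output short
    return problems

# Hypothetical source topic and three translated variants.
source = '<task id="t1"><title>Mount the antenna</title><taskbody><p>Use <b>two</b> bolts.</p></taskbody></task>'
good = '<task id="t1" xml:lang="de-DE"><title>Antenne montieren</title><taskbody><p><b>Zwei</b> Schrauben verwenden.</p></taskbody></task>'
unclosed = '<task id="t1"><title>Antenne montieren</title><taskbody><p>Zwei Schrauben verwenden.</taskbody></task>'
dropped_tag = '<task id="t1"><title>Antenne montieren</title><taskbody><p>Zwei Schrauben verwenden.</p></taskbody></task>'

for name, target in [("good", good), ("unclosed", unclosed), ("dropped_tag", dropped_tag)]:
    print(name, check_target(source, target))
```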

In addition to the source DITA files, you need to translate any linked media, which should be included in the export package. For example, images and training videos must be translated separately, with a version created for each target language and included in the package for reimport. Errors can result from filenames that contain extended characters, so it is usually best to keep the original file naming convention for the translated versions of media elements.
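A quick pre-import check on media filenames might look like the following sketch. It flags translated media whose names contain non-ASCII characters, names that do not exist in the source package, and source media with no translated counterpart. The file names and the `check_media_names` helper are hypothetical.

```python
def check_media_names(source_names: list[str], target_names: list[str]) -> dict[str, list[str]]:
    """Compare translated media filenames with the names used in the source package."""
    source = set(source_names)
    return {
        # Names with extended (non-ASCII) characters, which can cause import errors.
        "non_ascii": sorted(n for n in target_names if not n.isascii()),
        # Names that do not match any file in the source package.
        "renamed": sorted(n for n in target_names if n not in source),
        # Source media with no identically named translated version.
        "missing": sorted(source - set(target_names)),
    }

# Hypothetical media lists for a French package.
source_media = ["tower-diagram.png", "install-video.mp4"]
french_media = ["tower-diagram.png", "vidéo-installation.mp4"]

issues = check_media_names(source_media, french_media)
print(issues)
```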

XLIFF Package

XLIFF is a very common format used for translation of a wide variety of content types. Based on a different XML schema, it provides source- and target-language pairs.
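To show what source- and target-language pairs look like in practice, the sketch below reads the translation units from a small XLIFF file. It assumes XLIFF 1.2 and its standard namespace; the version Heretto actually exports may differ, and the sample content and `read_pairs` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2 (assumed here; other XLIFF versions use different ones).
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

def read_pairs(xliff_text: str) -> list[tuple]:
    """Return (id, source text, target text or None) for each trans-unit."""
    root = ET.fromstring(xliff_text)
    pairs = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        source = unit.find("x:source", NS)
        target = unit.find("x:target", NS)
        pairs.append((
            unit.get("id"),
            "".join(source.itertext()) if source is not None else None,
            "".join(target.itertext()) if target is not None else None,
        ))
    return pairs

# Hypothetical XLIFF file with one translated and one untranslated unit.
sample_xliff = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="freq.dita" source-language="en-US" target-language="de-DE" datatype="xml">
    <body>
      <trans-unit id="1">
        <source>Supported frequencies</source>
        <target>Unterstützte Frequenzen</target>
      </trans-unit>
      <trans-unit id="2">
        <source>Mount the antenna.</source>
      </trans-unit>
    </body>
  </file>
</xliff>"""

pairs = read_pairs(sample_xliff)
for unit_id, source_text, target_text in pairs:
    print(unit_id, source_text, "->", target_text)
```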

Heretto provides an XLIFF export, extracting the DITA content into bite-size segments that almost all translation tools can digest even without custom filters. This approach makes it easy for less technical translators to work on content quickly.

As with DITA XML, it is important to ensure that the XLIFF tags and file structure remain intact. Most tools protect them reliably. Since XLIFF files contain far fewer tag types than DITA XML, less verification is usually required.

The primary drawback of an XLIFF approach is that the files often lose context, with Heretto exporting the content in a different order than the comprehensive DITA structure. Translators also lose the ability to reorganize items to better suit the target languages. For example, it may be appropriate to slightly change the order of text elements or combine them in the DITA map, an option that does not exist when working in XLIFF.

Supplemental media such as graphics are not exported with the XLIFF, so these must be translated separately and reintegrated into the CCMS in language versions for correct output.

XTM Integration

One translation management system, XTM, offers a connector to Heretto. This is the only official integration for localization. Through this utility, you can select and route content for translation directly in the Localization Manager. The connector then creates an XTM project, and translators can begin working.

The connector was primarily designed for companies that use both Heretto and XTM internally. It works less efficiently for organizations that use Heretto and want to connect to their language service provider’s XTM instance, particularly where providing external access to Heretto raises security concerns.

The key to using the connector efficiently is designing XTM workflows that route the content for translation automatically. Companies with XTM subscriptions can connect their instances to their translation partners’ XTM instances, providing a conduit for translations to make a round trip from and back into Heretto.

Conclusion

All three strategies work well, and each has pros and cons as detailed above. Regardless of the technical solution you choose, the keys to successful translation of Heretto content are maintaining your DITA structure appropriately, maximizing reuse of topics, and working with translators who are familiar with the schema. Handling the files correctly is just one part of that familiarity; it is even more critical that translators understand the philosophy of topic-based authoring so they maintain the portability of each content element.