Anatomy of a Great Custom Machine Translation Engine


Anyone can create a custom MT engine by simply uploading data and training. However, much like in a kitchen, just throwing ingredients in a pot does not make a great meal. A good recipe, a skilled chef, and the right tools will give a much better result. Similarly, a great translation engine that consistently delivers substantially higher-quality translations than generic machine translation engines such as Google requires expertise and experience.

With over 14 years of experience creating tens of thousands of custom machine translation engines, the Omniscien team has refined a methodology and set of tools that provide unmatched translation quality. Many of the tools and processes are unique to Omniscien and are designed to make it fast and easy to create your own custom machine translation engines.

This webinar will go through each of the core processes that we follow and the tools that we use when we build a Professional Guided custom machine translation engine. They include data cleaning, data gathering, data synthesis, document alignment, sentence matching, quality measurement, and much more.

You will learn:

  • When a custom machine translation engine is needed.
  • How much data is needed for a custom machine translation engine (spoiler: millions of domain-specific sentences) and how to create that data when you do not have enough.
  • What tools are used to gather and process data.
  • How to control writing style and embed multiple styles and domains into a single machine translation engine.
  • How to automatically create thousands of in-domain bilingual glossary terms.
  • How to synthesize millions of bilingual sentences.

