
What is Custom Machine Translation?

Custom Machine Translation is the adaptation of a machine translation system so that it specializes in a specific domain or topic.

Rule-Based Machine Translation (RBMT), Statistical Machine Translation (SMT), Neural Machine Translation (NMT) and Deep Neural Machine Translation (Deep NMT) systems can all be customized. RBMT customization relies mainly on dictionaries and glossaries, whereas SMT, NMT and Deep NMT customization involves gathering and preparing the bilingual and monolingual data used for machine learning. Some Hybrid RBMT systems also offer statistical customization, but this is typically a statistical smoothing approach that attempts to repair a lower quality translation.
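As a rough illustration of what "preparing bilingual data" can involve at the most basic level, the sketch below filters a list of source/target sentence pairs. It is a generic example, not Omniscien's process: the function name `prepare_parallel_data`, the `max_length_ratio` parameter and its default of 3.0 are assumptions made purely for illustration.

```python
def prepare_parallel_data(pairs, max_length_ratio=3.0):
    """Drop empty, badly length-mismatched, and duplicate sentence pairs.

    `pairs` is an iterable of (source, target) strings. The threshold is an
    illustrative assumption, not a recommended value.
    """
    seen = set()
    kept = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        # Skip pairs where either side is empty.
        if not source or not target:
            continue
        # Skip pairs whose word counts differ too much to be plausible translations.
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            continue
        # Skip case-insensitive duplicates of pairs already kept.
        key = (source.lower(), target.lower())
        if key in seen:
            continue
        seen.add(key)
        kept.append((source, target))
    return kept


sample = [
    ("Hello world", "Bonjour le monde"),
    ("Hello world", "Bonjour le monde"),              # duplicate
    ("", "Vide"),                                     # empty source
    ("Yes", "Oui bien sûr absolument tout à fait"),   # length mismatch
]
cleaned = prepare_parallel_data(sample)
```

Filtering of this kind is only mechanical hygiene. It involves no linguistic judgment about the data and should not be read as an implementation of any particular customization approach.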

SMT, NMT and Deep NMT systems are typically customized using either the Dirty Data model or the Clean Data model. It is much easier to build a custom machine translation engine with the Dirty Data model, because it simply mixes data together and requires little human cognition, understanding or skill. The resulting engine is typically unpredictable and lower in quality than an engine fully customized using the Clean Data model. While a custom engine built on the Dirty Data model may be higher quality than generic machine translation output, it will still have many issues that limit its practical use. Many machine translation vendors offer “me too” solutions based on the Dirty Data model, where data can be uploaded for customizing but little real control is provided. Some of these vendors even claim to use the Clean Data model, which misleads their customers: in reality they perform only the most basic cleaning and usually do not meet even one of the four mandatory criteria of the Clean Data model.

Custom machine translation, when performed correctly, delivers notably higher quality translation output than generic machine translation. However, customization requires effort and skill, and the work involved should not be underestimated. Fully customizing a machine translation engine can be a complex task, and every customization is different.

Omniscien Technologies offers its customers full customization with expert Language Studio Linguist guidance throughout the process. Users keep complete control over the customization process and the desired outcome, while experts who have built thousands of successful custom engines execute the complex tasks and guide the process to deliver optimal results.