What is Custom Machine Translation?
Custom Machine Translation is the adaptation of a machine translation system to a specific domain or topic.
Rules-Based Machine Translation (RBMT), Statistical Machine Translation (SMT), Neural Machine Translation (NMT) and Deep Neural Machine Translation (Deep NMT) systems can all be customized. However, RBMT customization relies far more on dictionaries and glossaries, whereas SMT, NMT and Deep NMT customization involves gathering and preparing the bilingual and monolingual data used for machine learning. Some Hybrid RBMT systems also offer a degree of statistical customization, but this is typically a statistical smoothing approach that attempts to repair a lower quality translation.
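As a rough illustration of what preparing bilingual data can involve, the following Python sketch applies a few basic filters to source/target sentence pairs. The function name `prepare_pairs`, the choice of filters and the length-ratio threshold are all assumptions made for this example; they are not taken from any particular vendor's process and are not the criteria of the Clean Data model.

```python
from typing import Iterable, List, Tuple

SentencePair = Tuple[str, str]


def prepare_pairs(
    pairs: Iterable[SentencePair],
    max_length_ratio: float = 3.0,  # illustrative threshold (assumption)
) -> List[SentencePair]:
    """Apply basic, generic filters to (source, target) sentence pairs."""
    seen = set()
    kept: List[SentencePair] = []
    for source, target in pairs:
        # Collapse runs of whitespace and trim both sides.
        source = " ".join(source.split())
        target = " ".join(target.split())
        # Drop pairs where either side is empty.
        if not source or not target:
            continue
        # Drop pairs where the target is a copy of the source
        # (often a sign of untranslated text).
        if source == target:
            continue
        # Drop pairs whose word counts differ too much. Counting
        # whitespace-separated words is crude and does not suit
        # languages written without spaces.
        src_len, tgt_len = len(source.split()), len(target.split())
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_length_ratio:
            continue
        # Drop exact duplicates, keeping the first occurrence.
        if (source, target) in seen:
            continue
        seen.add((source, target))
        kept.append((source, target))
    return kept


raw = [
    ("The invoice is overdue.", "La facture est en retard."),
    ("The invoice is overdue.", "La facture est en retard."),
    ("Click  Save.", "Cliquez sur Enregistrer."),
    ("OK", "OK"),
    ("Hello", ""),
    ("Yes", "Oui, bien sûr, je suis tout à fait d'accord avec vous."),
]
cleaned = prepare_pairs(raw)
for source, target in cleaned:
    print(source, "|", target)
```

Filters of this kind only remove obvious noise. They say nothing about whether the remaining data suits the target domain, which is where most of the customization effort lies.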
Customized SMT, NMT and Deep NMT systems are typically built using either the Dirty Data model or the Clean Data model. It is much easier to build a custom machine translation engine with the Dirty Data model than with the Clean Data model, as the Dirty Data model simply mixes data together and requires little human cognition, understanding or skill. The resulting custom engine is typically unpredictable and of lower quality than a fully customized machine translation engine built using the Clean Data model. While a custom engine built on the Dirty Data model may produce higher quality output than generic machine translation, it will still have many issues that limit its practical use. Many machine translation vendors offer “me too” solutions based on the Dirty Data model, where data can be uploaded for customization but little real control is provided. Some of these vendors even claim to use the Clean Data model. This misleads their customers, because in reality they perform only the most basic cleaning and usually do not comply with even one of the four mandatory criteria of the Clean Data model.
Custom machine translation, when performed correctly, will deliver notably higher quality translation output than generic machine translation. However, customization requires effort and skill, and the work involved should not be underestimated. Fully customizing a machine translation engine can be a complex task, and every customization is different.
Omniscien Technologies offers its customers full customization with expert Language Studio Linguist guidance throughout the process. This gives users complete control over the customization process and the desired outcome, while experts who have built thousands of successful custom engines execute the complex tasks and guide the process to deliver optimal results.