What is Machine Translation?
The term Machine Translation (MT) refers to a process of using computer software to provide automated translation from one language to another.
The history of MT dates back to July 1949, when researchers at the Rockefeller Foundation put forward a proposal based on information theory and the successes of code breaking during the Second World War, and fueled by Cold War fears. Claims were made that computers would replace human translators within 5 years. But automated language translation proved to be a bigger challenge than was understood at the time, and many believe it remains one today.
Machine Translation Quality
Automated translation has often been perceived as low quality, based on outdated perceptions created by older translation technologies or by freely available generic translation tools that have not been customized for a specific purpose. Many technological advances made in recent years are slowly changing this perception. Recent advances in Machine Learning and its application to Machine Translation have contributed to astonishing translation results. However, like human translators, machines need domain expertise to provide high-quality output, which makes customization a requirement. Customized engines are the forte of Omniscien Technologies. Over the years we have delivered thousands of custom MT engines to our clients, who use Language Studio for near-human quality translations.
Many technical approaches have been developed to solve the challenge of automated language translation. Well-known approaches are Rules-Based Machine Translation (RBMT), Statistical Machine Translation (SMT), Syntax-Based Machine Translation (SBMT), and most recently, Neural (NMT) and Deep Neural Machine Translation (Deep NMT). All of these approaches have their strengths and weaknesses. The latest innovation is a hybrid machine translation model that combines these techniques to deliver greater quality and functionality.
Statistical Machine Translation (SMT)
SMT has been the dominant approach for several years, with many previous users of Rules-Based Machine Translation (RBMT) technology replacing their legacy deployments with SMT-based solutions. The SMT approach has been progressively refined, but still has many challenges. One of the most common issues reported by users is that terminology and writing style are inconsistent and unpredictable. Omniscien Technologies identified these issues and in 2006 pioneered a new approach to SMT called “Clean Data SMT”. It addresses and resolves a large number of the issues inherent in the traditional “Dirty Data SMT” approach, in which training data is mixed irrespective of quality and domain on the premise that the good data will rise to the top. That premise is a common misconception. To learn more about SMT, read our article “What is SMT?”
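To make the idea of selecting training data by quality and domain more concrete, here is a minimal sketch in Python. It is an illustration only and does not describe Omniscien's Clean Data SMT process: the `SegmentPair` record, the `domain` label, the `looks_clean` heuristics, and the length-ratio threshold are all hypothetical assumptions chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentPair:
    """One source/target sentence pair (hypothetical record for this sketch)."""
    source: str
    target: str
    domain: str  # hypothetical label, e.g. "legal" or "general"


def looks_clean(pair, max_length_ratio=2.5):
    """Apply simple, assumed quality heuristics to one sentence pair."""
    src = pair.source.strip()
    tgt = pair.target.strip()
    if not src or not tgt:
        return False  # one side is empty
    if src == tgt:
        return False  # target is an untranslated copy of the source
    # Very different lengths often indicate a misaligned pair.
    ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
    return ratio <= max_length_ratio


def select_training_data(pairs, domain):
    """Keep in-domain pairs that pass the heuristics, dropping duplicates."""
    seen = set()
    selected = []
    for pair in pairs:
        if pair.domain != domain or not looks_clean(pair):
            continue
        key = (pair.source.strip(), pair.target.strip())
        if key in seen:
            continue  # exact duplicate of a pair already kept
        seen.add(key)
        selected.append(pair)
    return selected
```

Calling `select_training_data(pairs, "legal")` on a mixed collection would return only the legal-domain pairs that pass the checks. A production pipeline would rely on far richer signals than these three heuristics; the sketch only shows the contrast with mixing all data irrespective of quality and domain.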
Neural Machine Translation (NMT) and Deep Neural Machine Translation (Deep NMT)
The latest technologies are Neural (NMT) and Deep Neural Machine Translation (Deep NMT). Both approaches are based on Machine Learning and Artificial Intelligence (AI) and use a large neural network. Deep NMT is an extension of NMT that processes multiple neural network layers instead of just one. While NMT and Deep NMT are a sizable step forward in translation quality and fluency and bring some notable benefits over other translation technologies, they also have some inherent limitations. Omniscien Technologies has combined SMT and Deep NMT in the latest release of its Language Studio platform, resulting in a unique hybrid Deep NMT, SMT, Syntax-Based and Rules-Based solution for unprecedented translation quality and control.
For more information about NMT and Deep NMT, read our blog posts “The State of Neural Machine Translation (NMT)” by Philipp Koehn and “Riding the Machine Translation Hype Cycle – From SMT to NMT to Deep NMT” by Dion Wiggins. For more extensive reading, see the publication “Neural Machine Translation” by Philipp Koehn, Center for Speech and Language Processing, Department of Computer Science, Johns Hopkins University.
Today, automated translation can solve a huge number of translation challenges and increase translation productivity, while making content that previously could not be translated due to cost, time, or size more accessible.
In all cases, machine translation quality is better when the engine has been customized for a specific domain.
How can Machine Translation be used?
Machine Translation is a powerful tool that can be used in a number of different ways:
- Fast translation to get the gist of the content
- Increasing the productivity of human translators by providing a first-pass translation that can be quickly edited to human translation quality
- Instant translation of content such as chat, email or customer support communications
- Translating large volumes of data that would take too long or be too expensive to translate with a human-only approach
- Translation of content where the business case is not strong enough to warrant the investment of human translation
and many more.
To find out more about how MT can increase productivity in industries such as Media, E-Commerce, and Intellectual Property, see our Case Studies.
Generic and Custom Machine Translation
When a Machine Translation system has not been customized and is not specialized in a specific domain, it is commonly referred to as “Generic” MT. Generic MT can often be useful for getting a general idea of a piece of text, but it is prone to grammar, syntax and other errors. It is not suitable for republishing content without considerable amounts of human editing. With the advances brought about by Neural MT and Deep NMT engines, the results even from generic engines have become notably better than those of the preceding SMT technologies.
However, Custom Machine Translation, when performed correctly, will deliver notably higher quality translation output than Generic MT. Custom MT is the adaptation of a machine translation system to specialize it in a specific domain or topic. A tailored Custom MT engine is optimized for a specific purpose. If trained with high-quality, domain-specific data, it will deliver much better results than a generic engine can produce.