Introduction to Machine Translation at Omniscien
At the heart of all of Omniscien’s tools and products is state-of-the-art machine translation technology. Over more than 14 years of development and refinement, the Omniscien team has continually innovated to deliver the highest-quality translation output and the most flexible machine translation platform.
Language Studio provides specialized machine translation engines for use in Media Studio and Workflow Studio that are optimized for a specific purpose, such as subtitle translation. The complexity and power of these purpose-optimized workflows and machine translation engines are hidden behind easy-to-use tools and user interfaces.
While machine translation is progressively improving, the Omniscien team recognizes that machine translation is not a replacement for professional human translation. Our technology is designed to augment human intelligence and increase the productivity of human translators by using a variety of technologies together. Our pre-processing, post-processing, and workflow technologies, which are optimized for specific tasks and document formats, further increase productivity, enforce styles and customer rules, and automate time-consuming tasks. This allows expensive human resources to focus on higher-value tasks while processing more content each day without any compromise on quality.
Modern Neural Machine Translation (NMT) generally provides more fluent and natural-sounding translations than the legacy Statistical Machine Translation (SMT) technologies. However, when content is out of domain or very short, SMT has been proven to provide more accurate translation results. For this reason, we created a hybrid machine translation technology approach that takes advantage of the best features of both SMT and NMT technologies.
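The reasoning above (short or out-of-domain content favors SMT, everything else favors NMT) can be illustrated with a minimal routing sketch. This is not Omniscien's hybrid implementation: the function name `choose_engine`, the token-count threshold, and the vocabulary-coverage heuristic for "out of domain" are all assumptions made for illustration.

```python
import re

def choose_engine(text, in_domain_vocab, min_tokens=4, min_coverage=0.5):
    """Pick "smt" or "nmt" for a segment (hypothetical heuristic).

    min_tokens and min_coverage are illustrative thresholds, not values
    taken from Language Studio.
    """
    tokens = re.findall(r"\w+", text.lower())
    # Very short segments (including empty ones) go to SMT.
    if len(tokens) < min_tokens:
        return "smt"
    # Treat low overlap with the engine's known vocabulary as "out of domain".
    coverage = sum(token in in_domain_vocab for token in tokens) / len(tokens)
    return "smt" if coverage < min_coverage else "nmt"
```

A real hybrid system would likely use far richer signals than token counts and vocabulary overlap; the sketch only shows where such a decision could sit in a workflow.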
Omniscien has learned what it takes to create customized machine translation engines that produce superior translation quality. We have built powerful tools that are proven to deliver exceptional results and make the process easier than ever before, even when you do not have any data of your own.
Different business use cases require access to translation technologies in different ways. Language Studio offers a wide variety of ways that users and applications can translate. This ranges from a REST API to workflow tools, games, chat, email, embedding in applications, and web portals.
Language Studio is available as three Platform Editions specifically designed to match different business needs.
- Secure Cloud: hosted by Omniscien
- Enterprise Translation Server: on-premises or private cloud
- Data Center Platform
Key Features and Benefits
Hybrid Statistical and Deep Neural Machine Translation
Scriptable Translation Workflow Rules
600+ Language Pairs
If we don’t yet support your language pair, talk to us about adding it for you.
Off-the-Shelf Industry MT Engines
Hundreds of pre-built industry domain machine translation engines covering 14 industry domains: Automotive; Banking and Finance; eDiscovery; Engineering and Manufacturing; General Purpose; Information Technology; Life Sciences; Military, Intelligence and Defence; News and Media; Patents and Legal; Politics and Government; Retail and eCommerce; Subtitles and Dialog; and Travel and Hospitality.
Multiple Domains & Genres in a Single MT Engine
Translation Confidence Scoring and Quality Estimates
Custom Machine Translation Engines
Advanced Data Mining, Data Synthesis, Data Creation and Data Preparation Tools
Data is the fuel that powers high-quality custom MT engines. Few organizations have sufficient data to really make a difference in translation quality and domain context. Language Studio provides a range of powerful bilingual glossary and sentence creation tools. Automated website mining, data synthesis, content cleaning, and matching are used to create millions of bilingual sentences to quickly build your custom MT engine.
Multiple Domains & Genres in a Single MT Engine
Language Studio custom MT engines have a unique feature that allows the translation genre, domain, and writing style (e.g. marketing or technical manuals) to be specified at the time of translation, so that the resulting translation is styled to match. A full range of genres, domains, and styles can be built into a single machine translation engine.
Professional Guided Custom MT Training
A Professional Guided Custom MT Engine delivers the highest possible translation quality. Experts who have participated in customizing thousands of machine translation engines will engage with you to understand your goals and then will design a personalized Engine Customization Plan.
The entire process is project managed by the Omniscien team. Only minimal input from the customer is required. Customization takes a little longer than a Standard Custom MT Engine because it is a semi-automated process that includes deep data analysis, data synthesis, bilingual data mining, and many other features that fine-tune the data and the engine to meet your business use case.
Do-It-Yourself Standard MT Training
A Standard Custom MT Engine is a simple way to quickly get a basic level of customization if you are in a hurry or simply want to add your data to Omniscien Industry Domain Data for a rapid customization.
Select the Industry Domain Data that you want to include in your engine, upload your translation memories and glossaries, and then submit to train. Processing is fully automated and a custom engine will be produced in as little as a few hours.
Anyone Can Create a Custom MT Engine... but....
There are many machine translation providers and even open-source tools that make it easy to create a custom machine translation engine. Some make it as simple as uploading a file and clicking the “train” button. We jokingly call this do-it-yourself approach “Upload and pray”. This approach will certainly give you an MT engine but even with a lot of data, the resulting engine will likely disappoint.
Anyone can create a custom MT engine. But it takes experience and expertise to create a great custom MT engine that is optimized for purpose. Much like in a kitchen, just throwing ingredients in a pot does not make a great meal. A good recipe, a skilled chef, and the right tools will give a much better result.
The key to creating a high-quality machine translation engine is a combination of the data that it is built on and pre-/post-processing that is optimized for your specific purpose. The video in the right-hand bar provides a good overview of what is involved.
With over 14 years of experience creating tens of thousands of custom machine translation engines, the Omniscien team has refined a methodology and set of tools that provide unmatched translation quality. Many of the tools and processes are unique to Omniscien and are designed to make it fast and easy to create your own custom machine translation engines.
Language Studio offers custom MT engines in two modes. The Standard Custom MT Engine option provides a do-it-yourself approach that allows you to upload your data and simply click the “Train” button. However, while you can use this approach to customize your own MT engine, it is better suited to updating an existing engine that has been professionally customized. We recommend that the initial customization be performed using Language Studio’s Professional Guided Custom MT Engine option to ensure the highest possible quality.
A professional guided customization offers the following benefits and tools:
- An expert professional from Omniscien who will guide you through the process, minimizing your effort and ensuring that you get the highest quality custom machine translation engine possible.
- A deep analysis of the data that you have available for training.
- Expert selection of the best industry data to merge with your data.
- Automated glossary extraction and creation tools.
- Automated data synthesis to create synthetic data to complement your data.
- Data manufacturing that automatically finds similar bilingual data.
- Writing style and terminology control.
- Metric conversion (e.g. inches to centimeters, miles to kilometers, Fahrenheit to Celsius).
- Document matching and alignment.
- Sentence extraction and alignment.
- Comprehensive data cleaning.
- Pre-/Post-processing workflow optimization.
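As an illustration of the metric conversion item in the list above, the following is a minimal sketch of a rule-based conversion step that could run before or after translation. The function name `convert_units`, the rule table, and the one-decimal output format are assumptions made for this example and do not describe Language Studio's actual rules. Only the conversion factors are standard (1 inch = 2.54 cm, 1 mile = 1.609344 km, °C = (°F − 32) × 5/9).

```python
import re

# Hypothetical rule table: source unit -> (conversion function, target unit).
CONVERSIONS = {
    "inches": (lambda v: v * 2.54, "centimeters"),
    "miles": (lambda v: v * 1.609344, "kilometers"),
    "°F": (lambda v: (v - 32) * 5 / 9, "°C"),
}

# A number (optionally negative or decimal), optional whitespace, then a unit.
_PATTERN = re.compile(r"(-?\d+(?:\.\d+)?)\s*(inches|miles|°F)")

def convert_units(text: str) -> str:
    """Rewrite imperial measurements in text as metric, to one decimal place."""
    def _replace(match: re.Match) -> str:
        value = float(match.group(1))
        convert, target = CONVERSIONS[match.group(2)]
        # Degree symbols attach directly to the number; word units get a space.
        separator = "" if target.startswith("°") else " "
        return f"{convert(value):.1f}{separator}{target}"
    return _PATTERN.sub(_replace, text)
```

For example, `convert_units("The pipe is 10 inches long.")` returns `"The pipe is 25.4 centimeters long."`. A production rule set would also need to handle singular forms, abbreviations, and locale-specific number formatting.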
Language Studio developer tools provide a comprehensive set of tools to create, assemble, manage, measure, and optimize custom MT engines.
Watch the Omniscien Symposium Conference Video – “Secrets to Customizing a High-Quality Neural Machine Translation Engine”
When do you need a custom machine translation engine?
Generic machine translation systems such as Google have the goal of translating anything, anytime, anywhere, for anyone so that they can understand information. Such engines work reasonably well on the most common text but fail when there is a cross-domain terminology conflict, producing out-of-context, lower-quality, and often comically or embarrassingly wrong translations.
Consider the following examples:
“I went to the bank.”
There is insufficient context to determine if someone went to the ATM (Banking and Finance domain), banked their car into a turn (Automotive domain), or swam to the bank of a river.
I went to the bank of the river.
I went to the bank to get some cash.
I drove around the corner and I went to the bank of the turn.
“You have caught the virus.”
There is insufficient context to determine if someone is sick (Life Sciences / Medical domain) or if their computer has been infected with a software virus (Information Technology domain).
I just got your COVID-19 test results back and you have caught the virus.
You have caught the virus from an infected file that you downloaded.
Without context, the most likely translation will be the most common form. This may not match the context of the text being translated. Industry domain machine translation engines offer a middle ground between generic machine translation systems and custom machine translation engines by providing the context of many useful domains.
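The role of context in the “bank” examples above can be sketched with a toy keyword-based domain guesser. This is only an illustration of the idea, not how Language Studio or any MT engine selects a domain: the cue words, the `guess_domain` function, and the choice to file the river sense under General Purpose are all assumptions.

```python
import re

# Hypothetical cue words per domain, chosen to match the example sentences.
DOMAIN_CUES = {
    "Banking and Finance": {"cash", "atm", "account", "loan"},
    "Automotive": {"drove", "turn", "corner", "car"},
    "General Purpose": {"river", "swam"},
}

def guess_domain(sentence: str):
    """Return the domain with the most cue words, or None if there are none."""
    words = set(re.findall(r"[a-z0-9]+", sentence.lower()))
    scores = {domain: len(words & cues) for domain, cues in DOMAIN_CUES.items()}
    best = max(scores, key=scores.get)
    # With no cue words at all, there is insufficient context to decide.
    return best if scores[best] > 0 else None
```

Here `guess_domain("I went to the bank.")` returns `None`, mirroring the insufficient-context case, while the longer example sentences each resolve to a single domain.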
Depending on the level of control over writing style and vocabulary that your business use case requires, an off-the-shelf industry domain engine may be sufficient. When a greater level of control is needed, a custom machine translation engine should be considered.
Off-the-Shelf Industry Domain MT Engines
Translation quality is improved with context. Pre-built industry engines can be used off the shelf to translate immediately, or their data can be used as a base to build your own custom MT engines.
Industry domains include Automotive; Banking and Finance; eDiscovery; Engineering and Manufacturing; General Purpose; Information Technology; Life Sciences; Military, Intelligence and Defence; News and Media; Patents and Legal; Politics and Government; Retail and eCommerce; Subtitles and Dialog; and Travel and Hospitality.
Custom Machine Translation Engines
While industry domains will help with context, the writing style will also vary within an industry domain. For example, in the context of automotive, marketing content would have a very different writing style to technical engineering manuals.
Custom machine translation engines deliver notably higher translation quality than generic or industry domain translation engines. Runtime customization features such as glossaries provide only limited control. When more control of writing style and vocabulary is needed, a custom machine translation engine built on data matched to the purpose of the translations will deliver the highest quality and require the least human effort before publishing.
Simply upload your existing translations and click “Train” for a Standard Customization, or work with the Omniscien team on a Professional Guided Custom MT Engine, where we will synthesize, gather, and create data for you.