Anatomy of a Great Custom Machine Translation Engine
Anyone can create a custom MT engine by simply uploading data and training. However, much like in a kitchen, just throwing ingredients in a pot does not make a great meal. A good recipe, a skilled chef, and the right tools give a much better result. Similarly, a great translation engine that consistently delivers substantially higher-quality translations than generic machine translation engines such as Google requires expertise and experience.
With over 14 years of experience creating tens of thousands of custom machine translation engines, the Omniscien team has refined a methodology and set of tools that provide unmatched translation quality. Many of the tools and processes are unique to Omniscien and are designed to make it fast and easy to create your own custom machine translation engines.
This webinar will go through each of the core processes that we follow and the tools that we use when we build a Professional Guided custom machine translation engine. They include data cleaning, data gathering, data synthesis, document alignment, sentence matching, quality measurement, and much more.
You will learn:
- When a custom machine translation engine is needed.
- How much data is needed for a custom machine translation engine (spoiler: millions of domain-specific sentences) and how to create that data when you do not have enough.
- What tools are used to gather and process data.
- How to control writing style and embed multiple styles and domains into a single machine translation engine.
- How to automatically create thousands of in-domain bilingual glossary terms.
- How to synthesize millions of bilingual sentences.
About Webinar Week
The Omniscien team has been hard at work developing the latest versions of our market-leading machine translation, media and subtitle processing, and workflow automation products. We are excited to announce a series of new products and product updates, which are currently being finalized and in beta testing ahead of their upcoming release.
Join the Omniscien team for 3 presentations: 2 new product preview webinars and a comprehensive deep dive into the latest advances in data gathering, data synthesis, data cleaning, and best practices for customizing a high-quality translation engine.
Webinar Week Presentations