
What is Syntax-Based Machine Translation (SBMT)?

Syntax-Based Machine Translation techniques aim to incorporate an explicit representation of syntax into statistical machine translation systems. Instead of translating single words or strings of words (as in phrase-based MT), syntax-based translation translates syntactic units, i.e. (partial) parse trees of sentences or utterances.
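To make the idea of translating syntactic units concrete, here is a minimal Python sketch. It is a toy illustration, not how any production SBMT system works: the parse tree, the `REORDER` rule table, the `LEXICON`, and the `translate` function are all hypothetical names invented for this example. It assumes the source sentence has already been parsed, and it shows a single rule that reorders the children of a noun phrase (English adjective-noun to Spanish noun-adjective) before the words themselves are translated.

```python
# Toy sketch only: the tree, rules, and lexicon below are invented for
# illustration and do not come from any real MT system.

# A node is (label, child, child, ...); a preterminal is (label, "word").

# Hypothetical reordering rules over syntactic units:
# (parent label, child labels) -> order of the children in the target language.
REORDER = {
    ("NP", ("DT", "JJ", "NN")): (0, 2, 1),  # adjective follows the noun
}

# Hypothetical word-level lexicon (English -> Spanish).
LEXICON = {"the": "el", "red": "rojo", "car": "coche"}


def translate(node):
    """Return the list of target words for a parse-tree node."""
    tag, *children = node
    # Preterminal: translate the word, leaving unknown words unchanged.
    if len(children) == 1 and isinstance(children[0], str):
        word = children[0]
        return [LEXICON.get(word, word)]
    # Internal node: look up a reordering rule for this syntactic unit,
    # falling back to the source order when no rule matches.
    child_labels = tuple(child[0] for child in children)
    order = REORDER.get((tag, child_labels), tuple(range(len(children))))
    words = []
    for index in order:
        words.extend(translate(children[index]))
    return words


source_tree = ("NP", ("DT", "the"), ("JJ", "red"), ("NN", "car"))
print(" ".join(translate(source_tree)))  # prints "el coche rojo"
```

The point of the sketch is that the reordering decision is attached to a syntactic unit (the noun phrase) rather than to a particular string of words, so the same rule would apply to any determiner-adjective-noun phrase. Real systems learn such rules, with probabilities, from parsed parallel data.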

The idea of syntax-based translation is quite old in MT, though its statistical counterpart did not take off until the advent of strong stochastic parsers in the 1990s. Examples of this approach include DOP-based MT (DOP stands for Data-Oriented Parsing) and, more recently, synchronous context-free grammars. One of the challenges of the syntax-based approach is translation speed: improvements in translation quality have been noted with syntax-based translation, but it is significantly slower than other approaches.

Amr Ahmed and Greg Hanneman of the Language Technologies Institute at Carnegie Mellon University wrote a very good overview of Syntax-Based Statistical Machine Translation.