Rule-Based Machine Translation (RBMT) systems were the first commercial machine translation systems. They are based on linguistic rules that determine how words are reordered and how their meanings change depending on context. RBMT technology applies large collections of linguistic rules in three phases: analysis, transfer, and generation. These rules are developed by human language experts and programmers who have invested extensive effort in understanding and mapping the rules between two languages. RBMT also relies on manually built translation lexicons, some of which can be edited and refined by users to improve translation quality.
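The three phases can be illustrated with a deliberately tiny sketch. The lexicon entries, the single reordering rule, and the English-to-Spanish language pair below are all invented for demonstration; a production RBMT system encodes thousands of rules hand-built by linguists.

```python
# Toy sketch of the three RBMT phases: analysis, transfer, generation.
# All lexicon entries and rules are illustrative assumptions, not a real system.

# A hand-built bilingual lexicon: part of speech plus Spanish equivalent.
LEXICON = {
    "the":   {"pos": "DET",  "es": "la"},
    "red":   {"pos": "ADJ",  "es": "roja"},
    "house": {"pos": "NOUN", "es": "casa"},
}

def analyze(sentence):
    """Analysis: tag each source word with its part of speech."""
    return [(w, LEXICON[w]["pos"]) for w in sentence.lower().split()]

def transfer(tagged):
    """Transfer: apply a structural rule of the target language.
    Spanish adjectives typically follow the noun, so swap ADJ + NOUN."""
    words = list(tagged)
    for i in range(len(words) - 1):
        if words[i][1] == "ADJ" and words[i + 1][1] == "NOUN":
            words[i], words[i + 1] = words[i + 1], words[i]
    return words

def generate(tagged):
    """Generation: emit the target-language surface forms."""
    return " ".join(LEXICON[w]["es"] for w, _ in tagged)

def translate(sentence):
    return generate(transfer(analyze(sentence)))

print(translate("the red house"))  # -> la casa roja
```

Even this toy version shows why RBMT output is predictable: the same input always triggers the same rules, and the rules are inspectable and editable.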
RBMT allows some control through lexicon and user-dictionary refinements and is relatively predictable in its output. However, these refinements can be very time-consuming to implement and maintain, and in some cases they can lower translation quality because of term ambiguity. Because translations are produced by rules, the output often reads as more “machine like” in writing style, and although translations can be understandable, they are often not fluent. This is frequently referred to as “gist” quality: the substance of the translation is reasonably clear, but considerable post-editing work is required to adapt the translation output to a specific target audience and writing style.
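The ambiguity problem with user-dictionary refinements can be sketched in a few lines. The entries below (an English-to-French pair and the priority scheme) are illustrative assumptions: a blanket term override fires on the word form alone, so it helps one sense of an ambiguous word while hurting another.

```python
# Toy sketch of a user-dictionary override in an RBMT lexicon lookup,
# and how it can backfire on ambiguous terms. Entries are invented.

BASE_LEXICON = {"monitor": "surveiller"}  # default entry: verb sense ("to monitor")
USER_DICT    = {"monitor": "moniteur"}    # user override: hardware noun sense

def lookup(word):
    # User-dictionary entries take priority over the built-in lexicon.
    return USER_DICT.get(word, BASE_LEXICON.get(word, word))

# Helpful:  "turn off the monitor"  -> the hardware sense is now used.
# Harmful:  "please monitor the logs" -> also gets the noun translation,
# because the override ignores context entirely.
print(lookup("monitor"))  # -> moniteur (for every occurrence, right or wrong)
```

This is the trade-off the paragraph above describes: the refinement is easy to express but applies everywhere, so maintaining quality across contexts takes ongoing manual effort.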
RBMT developers have recently attempted to address some of these limitations by augmenting their core RBMT technology with techniques from Statistical Machine Translation (SMT), marketing the resulting products as a Hybrid MT model. There are a number of different “Hybrid MT” models, and each should be understood in terms of its own benefits and limitations.