"Stack" decoder

Instructions

To begin, click a source-language word to create the empty hypothesis. Clicking a source-language word also reveals a list of translation candidates for that word.

You then build new hypotheses by selecting an existing one and choosing a word to extend it with. Each hypothesis contains the sequence of translated words (stored implicitly if dynamic programming is on), its model score, and a coverage vector denoting which source-language words have been translated.
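
As a rough illustration of the state each hypothesis carries, here is a minimal Python sketch; the class name, fields, and helper method are assumptions for illustration, not the page's actual code:

    from dataclasses import dataclass
    from typing import FrozenSet, Optional

    @dataclass(frozen=True)
    class Hypothesis:
        word: Optional[str]              # target word added by the latest extension
        score: float                     # cumulative model score
        coverage: FrozenSet[int]         # indices of source words translated so far
        predecessor: Optional["Hypothesis"] = None

        def translation(self):
            """Recover the translated word sequence by walking back through predecessors."""
            words = []
            h = self
            while h is not None and h.word is not None:
                words.append(h.word)
                h = h.predecessor
            return list(reversed(words))

    # The empty hypothesis covers nothing and has score 0.
    EMPTY = Hypothesis(word=None, score=0.0, coverage=frozenset())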

If a word is highlighted in red, it is not available to extend the selected hypothesis (or no hypothesis is selected). This can be due to the distortion constraints, or because the selected hypothesis has already covered a translation of that word. When you highlight a hypothesis, the sequence of hypotheses that were used to produce it is also highlighted.
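
The following sketch shows the kind of check behind the red highlighting; the function name and the exact distortion convention (measuring the jump from the position just after the last covered source word) are assumptions for illustration:

    def can_extend(coverage, src_index, distortion_limit):
        """Return True if the source word at src_index may extend a hypothesis
        with the given coverage set, under a simple distortion limit."""
        if src_index in coverage:
            return False                  # that source word is already translated
        last_covered = max(coverage) if coverage else -1
        return abs(src_index - (last_covered + 1)) <= distortion_limit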

The algorithm can be automated by clicking the automate button once the page has loaded. Experimenting with the distortion limit and stack size demonstrates the tradeoff between speed (measured by the number of hypotheses popped) and accuracy (measured by the model score of the best translation at the end).
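
The automated search can be pictured as the loop below. This is a hedged sketch, not the page's actual implementation: it reuses the Hypothesis class and can_extend check from the earlier snippets and assumes a hypothetical helper translation_options(i) that returns (target word, score increment) pairs for source word i. The stack size caps how many hypotheses survive in each stack, and the pop counter measures the work done:

    def decode(sentence_length, translation_options, distortion_limit, stack_size):
        # stacks[k] holds hypotheses that cover exactly k source words
        stacks = [[] for _ in range(sentence_length + 1)]
        stacks[0].append(EMPTY)
        pops = 0
        for k in range(sentence_length):
            # keep only the best stack_size hypotheses in this stack (pruning)
            best_k = sorted(stacks[k], key=lambda h: h.score, reverse=True)[:stack_size]
            for hyp in best_k:
                pops += 1
                for i in range(sentence_length):
                    if not can_extend(hyp.coverage, i, distortion_limit):
                        continue
                    for word, step_score in translation_options(i):
                        stacks[k + 1].append(Hypothesis(word,
                                                        hyp.score + step_score,
                                                        hyp.coverage | {i},
                                                        hyp))
        best = max(stacks[sentence_length], key=lambda h: h.score)
        return best, pops

Lowering the stack size or tightening the distortion limit reduces the number of hypotheses popped, but risks pruning away the highest-scoring translation.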

More information

Machine translation is the scientific endeavor that investigates automated methods for translating sentences from one human language to another. This page implements a simple word-based decoder for visualization purposes.

"Decoder" is the jargon term for an algorithm that performs the translation task. Decoding algorithms typically assemble translation hypotheses by translating the words and phrases of the source sentence. The term "stack decoding" refers to the fact that hypotheses for the translation are maintained in stacks (which are actually priority queues). Each stack groups together hypotheses that represent translations of the same number of words of the source-language sentence. Within a stack, hypotheses are sorted in decreasing order of the score assigned by a model (which in this case combines a language model and a word-based translation model).
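
Concretely, the stacks can be organized as in the sketch below, assuming some hypothesis object and its score are available. Python's heapq is a min-heap, so scores are negated, and a counter breaks ties so hypothesis objects never need to be compared directly:

    import heapq
    import itertools

    _tie = itertools.count()

    def make_stacks(sentence_length):
        # stacks[k] holds hypotheses covering exactly k source words
        return [[] for _ in range(sentence_length + 1)]

    def push(stacks, num_covered, score, hyp):
        heapq.heappush(stacks[num_covered], (-score, next(_tie), hyp))

    def pop_best(stacks, num_covered):
        # the highest-scoring hypothesis covering num_covered words
        neg_score, _, hyp = heapq.heappop(stacks[num_covered])
        return -neg_score, hyp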

A good (non-free) introduction to machine translation is Philipp Koehn's book Statistical Machine Translation. A good (free) introduction is Adam Lopez's ACM article of the same title.