Phrase Aligner

by Michalis Troullinos, November 2013, 44 pages.

FIMU-RS-2013-2. Available as PostScript and PDF.

Abstract:

The PRESEMT (Pattern REcognition-based Statistically Enhanced MT) project aims to produce a flexible and adaptable machine translation (MT) system based on a language-independent method whose principles ensure easy portability to new language pairs. This report describes the Phrase Aligner Module (PAM) of the PRESEMT system. PAM processes bilingual corpora by aligning text at the word and phrase level within a language pair. It operates offline, processing the set of parallel sentences to determine how phrases are transformed from the source language (SL) to the target language (TL).