Corpus-based Rules for Czech Verb Discontinuous Constituents

by Eva Žáčková, Karel Pala. This is an adapted version of the paper accepted for printing in the Proceedings of TSD'99. August 1999, 6 pages.

In this paper we present a method for extracting general structures of verb groups from a tagged and fully disambiguated corpus, and for subsequently using these structures to build a formal grammar in the Prolog DCG style. Our goal is to apply them as rules for the analysis of Czech verb groups in non-disambiguated, grammatically tagged Czech corpus texts. We also address the problem of recognizing discontinuous verb constituents in Czech and present the statistical data obtained.

