Natural language processing

Natural language processing (NLP) covers a substantial part of the field of artificial intelligence: human-computer interaction, advanced information processing, automated reasoning, and machine translation, whether between human languages or between human languages and logical formalisms serving as "computer" languages.

  1. Formal structure of natural language
  2. Knowledge representation and reasoning
  3. Pragmatic aspects of communication
  4. Machine translation

Formal structure of natural language

Annotation
The formal description of natural language is often based on the levels of linguistic analysis (morphological, syntactic and semantic). The candidate will get acquainted with the fundamental theoretical approaches and concepts related to the particular levels of analysis (finite automata, formal grammars, logical formalisms, etc.) and also with the design of standard tools (morphological analyzers, parsers).
Keywords
Basic units of the morphological level, algorithmic description of (Czech) morphology, tools for morphological analysis (tagger, lemmatizer, guesser), formal grammars for natural language, rules and structures, grammar types (CFG, CCG, HPSG, LTAG, LFG), (verbal) valency frames (surface and deep), parsers and syntactic analyzers (statistical, rule-based, chart parsing, partial parsing/chunking).
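Illustration
As one possible illustration of the chart parsing named in the keywords, the following is a minimal sketch of a CKY recognizer for a context-free grammar in Chomsky normal form. The toy grammar, the lexicon and the function name cky_recognize are assumptions made for this example only; they are not taken from the study material.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Binary rules map a pair of right-hand-side symbols to possible left-hand sides.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
# Lexical rules map a word to its possible preterminal symbols.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cky_recognize(tokens, start="S"):
    """Return True if the toy grammar derives the token sequence."""
    n = len(tokens)
    if n == 0:
        return False
    # chart[(i, j)] holds the nonterminals that span tokens[i:j].
    chart = defaultdict(set)
    for i, token in enumerate(tokens):
        chart[(i, i + 1)] = set(LEXICON.get(token, ()))
    # Fill the chart for increasingly wide spans.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            # Try every split point k of the span [i, j).
            for k in range(i + 1, j):
                for left in chart[(i, k)]:
                    for right in chart[(k, j)]:
                        chart[(i, j)] |= BINARY_RULES.get((left, right), set())
    return start in chart[(0, n)]


print(cky_recognize("the dog sees the cat".split()))
print(cky_recognize("dog the sees".split()))
```

The sketch only decides whether a sentence is in the language of the grammar; a parser would additionally store back-pointers in the chart to recover the tree(s).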
Basic study material
Examiners
doc. RNDr. Aleš Horák, Ph.D., doc. Mgr. Pavel Rychlý, Ph.D.
Other recommended literature

Knowledge representation and reasoning

Annotation
Representation of meaning and inference of new knowledge are fundamental areas in the field of computer language processing. The candidate will get acquainted with lexical and logical semantics and with techniques and tools used to capture the meaning of words and phrases (definition of meaning, electronic dictionary systems, semantic networks, ontologies). Logical analysis of natural language sentences involves mastering logical formalisms and the tools that use them (normal translation algorithm).
Keywords
Lexical meaning of words and phrases (collocations), semantic networks, frames, ontologies, word sense disambiguation (WSD), word sense induction, semantic classification of verbs, compositionality, sense and reference, concept of definition and analysis, semantic representations of sentences based on predicate logic and intensional logic, semantic analysis algorithm - normal translation algorithm, logical and common-sense inference (reasoning).
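Illustration
Word sense disambiguation can be illustrated with a simplified Lesk-style approach, which picks the sense whose dictionary gloss shares the most words with the context. The sketch below is a minimal version under stated assumptions: the sense inventory SENSES, its glosses, the stop-word list and the function names are hypothetical and stand in for a real electronic dictionary or semantic network.

```python
# Hypothetical stop-word list and sense inventory, for illustration only.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "at", "to",
             "and", "or", "that", "is", "for"}

SENSES = {
    "bank": {
        "bank_finance": "an institution that accepts deposits of money and gives loans",
        "bank_river": "sloping land at the side of a river or lake",
    }
}


def content_words(text):
    """Lowercase, split on whitespace and drop stop words."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def simplified_lesk(word, context):
    """Return the sense of `word` whose gloss overlaps most with the context.

    Ties are resolved in favor of the sense listed first in SENSES.
    """
    context_words = content_words(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context_words & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(simplified_lesk("bank", "she sat on the bank of the river"))
print(simplified_lesk("bank", "he went to the bank to get money for loans"))
```

Counting raw word overlap is deliberately naive; it is meant only to show the idea of comparing a context with sense definitions.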
Basic study material
Examiners
doc. RNDr. Aleš Horák, Ph.D., doc. Mgr. Pavel Rychlý, Ph.D.
Other recommended literature
  • M. Duží et al., Procedural Semantics for Hyperintensional Logic: Foundations and Applications of Transparent Intensional Logic, Springer Science & Business Media, 2010.
  • D. Jurafsky and J. H. Martin, Speech and Language Processing, 3rd ed. draft, 2023.
  • P. Materna, Conceptual Systems, LOGOS Verlag Berlin, 2004.

Pragmatic aspects of communication

Annotation
Pragmatic aspects of communication cover both theoretical and practical analysis of the communication situation. The candidate will get acquainted with external and internal pragmatics, discourse analysis and with techniques and tools used for the analysis of textual relationships (algorithms for recognizing anaphors, discourse segmentation). In addition, this topic includes dialogue management and analysis of emotional attitudes of the language user.
Keywords
Internal pragmatics, external pragmatics, communication situations, discourse analysis, anaphora, anaphoric relations and their recognition in the text, segments in discourse and their recognition, dialogue systems, sentiment analysis.
Basic study material
Examiners
doc. RNDr. Aleš Horák, Ph.D., doc. Mgr. Pavel Rychlý, Ph.D.
Other recommended literature

Machine translation

Annotation
Machine translation is an application area in which the theoretical knowledge and tools acquired and developed in natural language processing as a whole are verified and tested. Automatic translation of phrases, sentences or documents includes text analysis/encoding and text generation/decoding. Any error in the process is usually clearly visible in the result. The candidate will understand the structure of a machine translation system, its weak points and possible solutions to the problems that arise.
Keywords
Approaches to machine translation - rule-based, statistical, neural; rule-based machine translation - analysis, transfer and synthesis; statistical machine translation - role of parallel corpora, language models; phrase-based and example-based translation; neural machine translation; sentence and word alignment; handling out-of-vocabulary words; evaluation of machine translation results - metrics.
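Illustration
Many automatic evaluation metrics compare the n-grams of a system translation with those of a reference translation. As an example, the sketch below computes the modified (clipped) n-gram precision, which is one component of the BLEU metric; it is not a full BLEU implementation (no brevity penalty, no combination of n-gram orders, a single reference only). The function names and the example sentences are assumptions made for this illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference.

    Each candidate n-gram is credited at most as many times as it
    occurs in the reference, so repeating a correct word does not help.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


reference = "the cat is on the mat".split()
candidate = "the cat sat on the mat".split()
print(modified_precision(candidate, reference, 1))  # 5 of 6 unigrams match
print(modified_precision(candidate, reference, 2))  # 3 of 5 bigrams match
```

Clipping is what distinguishes this from plain precision: a degenerate candidate consisting of "the" repeated seven times scores 2/7 on unigrams rather than 7/7.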
Basic study material
Examiners
doc. Mgr. Pavel Rychlý, Ph.D., doc. RNDr. Aleš Horák, Ph.D.
Other recommended literature
  • P. Koehn, Statistical machine translation, Cambridge University Press, 2010.