Colloquium of Faculty of Informatics

The Informatics Colloquium takes place on Tuesday at 14:30 during the term time. The aim of the colloquia is to introduce state of the art research from all areas of computer science to the wide audience of the Faculty of Informatics.

Time and place

Tuesday 14:30–15:30, lecture hall A217, Faculty of Informatics Building
(informal discussion with the speaker in A220 since around 14:00)

Spring 2024 - schedule overview

20/2 Tomáš Vojnar (FIT VUT) Counting Automata for Efficient Regex Matching and ReDoS Generation
27/2 Karel Kubíček (ETH Zurich) Automated Analysis and Enforcement of Consent Compliance
5/3 Juraj Somorovsky (University Padeborn) Lessons learned from the recent TLS attacks
12/3 Zdeněk Záhora (FF MU) A Decade of Teaching Digital Games, Game Design Principles and Connecting Local Game Industry Ecosystem With Academia
19/3 Coffee break with the faculty leaders Informal meeting with the dean and vice-deans.
26/3 Alberto Griggio (Fondazione Bruno Kessler) Formal verification in industry: a case study in automotive
2/4 Matěj Myška (PrF MU) Copyright in the context of generative AI
9/4 Milan Češka (FIT VUT) Herding Markov Cats: Effective Reasoning over Families of Probabilistic Systems
16/4 PhD Fest
23/4 Laura Sanita (Università Bocconi) On the Number of Degenerate Simplex Pivots
30/4 Bruno Rossi (FI MU) Software Reliability Engineering: from Software Reliability Models to Software Resilience - habilitation lecture
7/5 Cancelled
14/5 Luděk Matyska (FI MU) Large research infrastructures: Their role, importance and challenges
21/5 Libor Štěpánek (CJV MU) Mandatory Collegial Observations in Higher Education Setting: Idea, Implementation, and Experience (Teaching Lab lecture)

On Tuesday April 16, there will be PhD Fest - a series of talks given by PhD students.

Tomáš Vojnar
Counting Automata for Efficient Regex Matching and ReDoS Generation

February 20, 2024, 14:30, lecture hall A217

We start from a class of non-deterministic counting automata (CAs) with bounded counters that can be used to represent regular expressions (regexes) with bounded quantifiers, such as (ab){1,100}. Such regexes are quite common in practice (e.g., in rules used for detecting attacks in network traffic, when validating user inputs, when detecting information leakage, in XML schemas, and many other applications), with the quantifier bounds being sometimes tens or hundreds of thousands or even millions. We show that a large class of CAs arising from real-life regexes can be determinized using a novel kind of automata called counting-set automata (CsAs). These automata are equipped with registers that can hold sets of bounded integers and can be manipulated by a limited portfolio of constant-time operations. The size of the CsAs that one can obtain by determinization of CAs derived from regexes with bounded quantifiers does not depend on the quantifier bounds used in the regexes (while the size of the DFA is exponential to them and the size of a deterministic CA is linear). Subsequently, we show that CsAs can be used to obtain a regex matcher that can easily cope with regexes with bounded quantifiers where state-of-the-art matchers struggle. Further, building on our experience with how difficult handling of regexes with bounded quantifiers is for current highly-efficient and industrial-strength nonbacktracking regex matchers (such as grep, Google’s RE2, or even Intel’s Hyperscan), we sketch methods for generating input texts that can cause denial-of-service (DoS) attacks on these matchers – i.e., so-called, ReDoS (regex DoS) attacks. We illustrate the findings by extensive experiments with real-life regex matchers, both purely software-based as well as hardware-accelerated.

The presented results are joint work with Lenka Holíková, Lukáš Holík, Ivan Homoliak, Ondřej Lengál, and Juraj Síč from Brno University of Technology, and Olli Saarikivi and Margus Veanes from Microsoft Research, Redmond, USA.

Karel Kubíček
Automated Analysis and Enforcement of Consent Compliance

February 27, 2024, 14:30, lecture hall A217

This talk explores the adherence of widespread data collection practices to privacy regulations, specifically focusing on cookies and emails governed by the ePrivacy Directive and GDPR. We develop and employ machine learning and crawling methods to automate the detection of violations in both domains. Our analysis reveals an alarming prevalence of non-compliance in cookie notices (up to 95%) and significant rates of invalid consents to marketing emails (37%). We propose to address these issues by automated enforcement. For instance, our browser extension CookieBlock utilizes machine learning for deleting unwanted cookies on the client side, efficiently enforcing the user consent and addressing the observed violations. Finally, we propose to investigate the reasons for such large non-compliance levels: first, by a notification study raising awareness of the violations; second, by studying the legal intentions of open-source developers; and third, by inspecting the economic incentives for non-compliance.

Juraj Somorovsky
Lessons learned from the recent TLS attacks

March 5, 2024, 14:30, lecture hall A217

Transport Layer Security (TLS) is arguably the most important cryptographic protocol. Whether web, email, phone calls, chat, or VPN - there is hardly a type of communication which cannot be secured using TLS. Due to its importance, the protocol became target of many famous attacks, such as CRIME, DROWN, Logjam, or padding oracles. These attacks have devastating impact and allow, for example, to leak sensitive information such as cryptographic keys or confidential dog pictures. This talk gives an overview of our research on TLS attacks. Our attacks affect major TLS libraries and servers, and exploit different protocol properties ranging from outdated cryptographic algorithms to complex state machines and dangerous cryptographic shortcuts. With these examples, we shed light on various protocol misspecifications and problems.

Zdeněk Záhora
A Decade of Teaching Digital Games, Game Design Principles and Connecting Local Game Industry Ecosystem With Academia

March 12, 2024, 14:30, lecture hall A217

The aim of the talk is to sum up a decade of experience in between the local game industry (game design, narrative design, production) and academia (teaching, history, NGOs, research) in a conmprehensible form suited for anyone interested in the broad field of game studies. During the talk I will discuss few problems: (a) the subjectivity of game design goal, (b) the ambivalence of student team based projects in day-to-day teaching, (c) the trap of game design genre experience non-transivity, (d) the issue of non-existing game design ontology.

Coffee break with the faculty leaders
Informal meeting with the dean and vice-deans.

March 19, 2024, 14:00, KYPO

Alberto Griggio
Formal verification in industry: a case study in automotive

March 26, 2024, 14:30, lecture hall A217

Automated driving functions are among the most critical software components to develop. Before deployment in series vehicles, it has to be shown that the functions drive safely and in compliance with traffic rules. Despite the coverage that can be reached with very large amounts of test drives, corner cases remain possible and potential logical errors must be found as early as possible. Furthermore, the development is subject to time-to-delivery constraints due to the highly competitive market. We describe an approach to improve the development of an actual industrial behavior planner for the Automated Driving Alliance between Bosch and Cariad. The original process landscape for verification and validation is extended with formal methods. The idea is to integrate automated extraction mechanisms that, starting from the C++ code of the planner, generate a higher-level model of the underlying logic. This model, composed in closed loop with expressive environment descriptions, can be exhaustively analyzed with automated formal verification techniques. This results, in case of safety violations, in traces that can be re-executed in system simulators to guide the search for errors. The approach was exemplarily deployed in series development, and successfully found relevant issues in intermediate versions of the planner at development time.

Matěj Myška
Copyright in the context of generative AI

April 2, 2024, 14:30, lecture hall A217

"Generative artificial intelligence (AI) presents a significant challenge to copyright law, raising fundamental questions about authorship, ownership, and infringement. This talk will explore the evolving landscape of copyright in the context of generative AI, addressing key principles, priorities, and practical considerations. It will examine the tension between AI as a mere tool and as an author, the implications for copyrightability, and the potential for infringement. The discussion will also encompass the need for collaboration, adaptability, and the preservation of human creativity in the face of mechanistic outputs. By delving into these complex issues, the talk aims to provide insights into the intersection of generative AI and copyright law, offering guidance for stakeholders and policymakers in navigating this rapidly evolving field." This abstract was generated by when prompted "Please create an abstract for an academic talk entitled "Copyright Aspects of Generative AI"". Is this output copyrighted? Who is the author? Are there any pertaining rights to this output? Who could exercise them and how is this regulated contractually in the Terms and Conditions of the most popular providers of the AI services? Is the training of the AI system copyright infringement? In my talk I will focus exactly on these questions within the EU and CZ regulatory framework. The cases currently handled by US courts will be also discussed.

Milan Češka
Herding Markov Cats: Effective Reasoning over Families of Probabilistic Systems

April 9, 2024, 14:30, lecture hall A217

Markov models provide a fundamental formalism to deal with uncertainty and to support automated decision making. A classical problem is to verify whether a given model satisfies the specification or to synthesise a policy ensuring the desired behaviour. However, at design time, one often needs to consider a family of models describing many variations of a system under design. In this talk, we present a novel paradigm allowing for effective reasoning about large families of Markov models. It builds on symbolic representation of the family leveraging structural similarities among the models and on two exploration strategies: i) deductive reasoning via a suitable abstraction and ii) inductive reasoning using counter-examples and their generalisation. We demonstrate that this approach provides a competitive alternative for a wide range of important problems, including the synthesis of finite-state probabilistic programs, controller synthesis for partially observable systems, and the synthesis of compact decision trees. Moreover, it can naturally integrate orthogonal techniques such as belief exploration or reinforcement learning techniques, thus paving the way towards automated construction of robust, verifiable and explainable systems. Within the talk, we will present several case studies to showcase our approach implemented in the tool called~PAYNT.

PhD fest - Presentations given by Ph.D. students of FI MU

April 16, 2024, 14:30, lecture hall A217
Bedřich Said - Enhancing Sensor Data Reliability through Unsupervised Semantic-Blind Analysis

The reliability of sensors used in real-time systems is critical for the correct behavior of the system. There are several approaches for detection or correction of failures. In this work, I focus on detection of the failures from the measured data. The input data are k-dimensional vectors located in n-dimensional space. For example, a set of RGB colors of pixels are these k-dimensional vectors located in a 2D space of the image. The input data do not have to be aligned into a grid. In this research, I am looking at the input data without any other knowledge or known parameters (semantic-blind). The failure detection method tries to describe the measured data and check if this description fits into the description provided by the user. If the description differs from the awaited results, we mark the data as possibly corrupted. This means that we need an external description of the set of sensors and their measurements. If the user does not provide any description, we can use an automatically generated description from historical data. The described input data usually have different properties in different regions, so we cannot use one description for the whole dataset. This means that we need a tool for segmentation of the measured data. The segmentation tool must be able to work with any input data without specification of any parameters. I present a new segmentation algorithm designed for this purpose.

Eva Výtvarová - Topological network analysis of human brain function

Network analysis is a data-driven technique and an excellent tool for studying natural systems; it requires interdisciplinary and often big data processing approaches. The human brain is such a system in both structure and function. Network analysis provides ways to measure the importance of nodes, detect communities, and describe global topology. Dynamic behavior can be addressed as well. Using this framework of complex networks, coordinated co-activation of the human brain at rest and during tasks has been thoroughly studied as functional connectivity (Bullmore and Sporns, 2009), which has been shown to be altered in many conditions, including psychiatric illnesses, substance abuse, stress, or brain diseases. In this talk, I will present our contributions to the field, concentrating on topological functional connectivity changes across diverse neurodegenerative diseases.

Laura Sanita
On the Number of Degenerate Simplex Pivots

April 23, 2024, 14:30, lecture hall A217

The simplex algorithm is one the most popular algorithms to solve linear programs (LP). Starting at an extreme point solution of an LP, it performs a sequence of basis exchanges (called pivots) that allow to move to a better extreme point along an improving edge-direction of the underlying polyhedron. A key issue in the simplex algorithm performance is degeneracy, which may lead to a (potentially long) sequence of basis exchanges which do not change the current extreme point solution. In this paper, we prove that it is always possible to limit the number of consecutive degenerate pivots that the simplex algorithm performs to n−m−1, where n is the number of variables and m is the number of equality constraints of a given LP in standard form.

Bruno Rossi
Software Reliability Engineering: from Software Reliability Models to Software Resilience

April 30, 2024, 14:30, lecture hall A217

In this habilitation lecture, we will be looking at a historical overview of software reliability engineering. While reliability is a key quality of software systems, in recent years there has been a focus shift towards software resilience. In this lecture, we will delve into three main areas. First, we will discuss software reliability growth models, historical models to predict software failures based on testing efforts. Next, we will discuss software quality models proposed over time to control and reduce software defects, leading to fewer failures. Finally, we will explore current trends for building resilient software systems capable of withstanding failures and providing self-healing capabilities.


May 7, 2024, 14:30, lecture hall A217

Luděk Matyska
Large research infrastructures: Their role, importance and challenges

May 14, 2023, 14:30, lecture hall A217

This presentation delves into the fundamental aspects of Large Research Infrastructures (LRIs) and their critical role in advancing modern science. Initially recognized as large-scale facilities housing cutting-edge equipment, LRIs have evolved to encompass widely distributed communities with specialized resources. This broad scope enables LRIs to address complex research inquiries and facilitate diverse scientific initiatives and projects, establishing them as central hubs for large-scale exploration. LRIs significantly contribute to exceptional research by providing access to state-of-the-art resources, promoting interdisciplinary collaboration, and facilitating data-intensive investigations. Moreover, this presentation examines the socio-economic impacts of LRIs, including their contributions to regional development, talent retention, and industry partnerships. The evolving landscape of LRIs is surveyed, encompassing their organizational structures and governance at both European (ESFRI) and national levels. Challenges such as funding sustainability, operational efficiency, and equitable access are discussed, shedding light on the complexities facing LRIs in their pursuit of scientific excellence and societal impact.

Libor Štěpánek
Mandatory Collegial Observations in Higher Education Setting: Idea, Implementation, and Experience

May 21, 2023, 14:30, lecture hall A217

This interactive talk aims to present and share experience with the introduction and organization of mandatory collegial observations within the context of Higher Education institutions. Drawing on almost a decade of practice, the initial idea, early-stage implementation, its later evolution as well as impact of observations on teachers´ work will be addressed and discussed.

Past colloquia