Colloquium of the Faculty of Informatics

The Informatics Colloquium takes place on Tuesdays at 14:30 during term time. The aim of the colloquium is to introduce state-of-the-art research from all areas of computer science to a wide audience at the Faculty of Informatics.

Time and place

Tuesday 14:30–15:30, lecture hall A217, Faculty of Informatics
(speaker available for informal discussion from around 14:00 in room A220)



Autumn 2023 - schedule overview

26/9 Ivan Zelinka (VSB - Technical University of Ostrava) Visualization and analysis of malware through fractal geometry
3/10 Jiří Šimša (Google) Data Pipelines for Machine Learning
10/10 Robert Ganian (TU Wien) Mapping the Complexity-Theoretic Landscapes of Artificial Intelligence
17/10 Ernest Cachia (University of Malta) Tools for improving data accessibility in Bioinformatics
24/10 Zdeněk Dvořák (Charles University) Flow-critical graphs
31/10 Jakub Gajarský (University of Warsaw) Designing efficient graph algorithms using logic
7/11 Louis Esperet (CNRS Grenoble) TBA
14/11 PhD Fest
21/11 Peer Kröger (University of Kiel) TBA
28/11 Phil Newton (Swansea University) Evidence-Based Learning, Teaching and Assessment
5/12 Lukáš Sekanina (Brno University of Technology) Evolutionary Design for Computer Engineering
12/12 Miroslav Svítek (Czech Technical University in Prague) TBA

On Tuesday, November 14, there will be a series of two talks given by our PhD students.


Ivan Zelinka
Visualization and analysis of malware through the lens of fractal geometry

Tuesday 26 September 2023, 14:30, lecture hall A217

To date, a large number of research papers have been written on malware identification, classification into different families, and the distinction between malware and goodware. These works were based on captured malware samples and sought to analyze malware and goodware using various techniques, including artificial intelligence. Some of them also approached malware analysis through visualization: they typically convert malware samples, capturing the structure of the malware, into images that are then subjected to image processing. Visualizations of this type usually result in images with black-and-white granularity, to which artificial intelligence methods for classification are then applied. In this lecture, we propose a new and unconventional approach to malware visualization based on fractal geometry, in which visually interesting images are subsequently used to classify malware and goodware. Our approach opens up a vast topic for future discussion and provides many new directions for research in malware analysis and classification, as discussed in the conclusion. Above all, however, it opens up a new area of computer code visualization using fractal geometry, which offers a theoretically infinite number of different representations of the code, and thus also the "uniqueness" of the display. The results of the fractal conversion and the subsequent classification experiments are based on a database of 6,589,997 goodware samples, 827,853 potentially unwanted applications, and 4,174,203 malware samples provided by ESET. This lecture is not a comprehensive study presenting the results of comparative experiments; rather, it aims to show a new direction in visualization using fractal geometry and its possible use in malware analysis.


Jiří Šimša
Data Pipelines for Machine Learning

Tuesday 3 October 2023, 14:30, lecture hall A217

In this talk we present Google's infrastructure for machine learning (ML) data pipelines. The talk is divided into two parts:

1) tf.data overview -- data pipelines for ML jobs are often challenging to implement efficiently as they require reading large volumes of data, applying complex transformations, and transferring data to hardware accelerators while overlapping computation and communication to achieve optimal performance. We present tf.data, a framework for building and executing efficient data pipelines for ML jobs. The tf.data API provides operators which can be parameterized with user-defined computation, composed, and reused across different machine learning domains. These abstractions allow users to focus on the application logic of data processing, while tf.data's runtime ensures that pipelines run efficiently.

2) tf.data service overview -- traditionally, data pipelines of ML jobs execute on the same host as the ML computation. Data processing can, however, become a bottleneck of the ML computation if there are insufficient resources (e.g. CPU and memory bandwidth) to process data fast enough. This can slow down the ML computation and waste valuable and scarce ML hardware (e.g. GPUs and TPUs) used by the ML computation. We present tf.data service, a disaggregated data processing service built on top of tf.data, along with (1) empirical evidence based on production workloads for the need for disaggregation, as well as a quantitative evaluation of the impact disaggregation has on the performance and cost of production workloads, (2) benefits of disaggregation beyond horizontal scaling, and (3) an analysis of tf.data service's adoption at Google, the lessons learned while building and deploying the system, and potential future lines of research opened up by our work.


Robert Ganian
Mapping the Complexity-Theoretic Landscapes of Artificial Intelligence

Tuesday 10 October 2023, 14:30, lecture hall A217

Over the past few decades, there has been a concentrated and highly successful scientific effort aimed at identifying the boundaries of theoretical tractability for classical computational problems such as Boolean Satisfiability, Model Checking, Constraint Satisfaction and a variety of fundamental graph problems. Indeed, today we can use the parameterized refinement of complexity theory along with a variety of other cutting-edge tools to draw detailed complexity-theoretic landscapes capturing when these problems are tractable and when they are, at least under well-established assumptions, "hard". However, the rapid ascent of Artificial Intelligence has led to the identification of entirely new kinds of important computational problems, and our understanding of when these can be solved efficiently is in many cases still in its infancy.

In this high-level talk, we will explore selected new developments in mapping the landscapes of tractability for fundamental problems in several subfields of Artificial Intelligence, including recommender systems and machine learning. The talk will include an introduction to the foundations of parameterized complexity theory and will cover new algorithms and lower bounds for matrix completion problems as well as the recently introduced parameterized refinement of PAC-learning.


Ernest Cachia
Tools for improving data accessibility in Bioinformatics

Tuesday 17 October 2023, 14:30, lecture hall A217

Our research in bioinformatics focuses primarily on proteins. We have developed programs to predict protein function (related paper in progress) and to provide a unified view of proteins by integrating several databases into a one-stop-shop solution (SADIP). SADIP is being revised to optimise the architecture, and we are writing a paper on it. The idea behind SADIP is to offer a portal that provides as much information as possible about a protein, such as any identified domains (with data drawn from SCOP and CATH), disordered regions, moonlighting, etc. We also have a prototype, still in need of further improvement, which analyses scientific literature and uses ontologies to mark up the retrieved information in order to map links between proteins, genes, diseases and drug interactions. Finally, our research also focuses on the FAIR assessment of resources (the paper is published, and the tool is available at https://autofair.research.um.edu.mt/portal/home/). We have strong collaborations with the Centre for Molecular Medicine & Biobanking, who are also spearheading Bioinformatics-related projects at UM.

Our future direction involves optimising and extending our current projects and applying our expertise to problems in the biomedical sphere. An example of one such project completed in the past is a tool to track the progress of Alzheimer's disease through an analysis of MRI images. This is an automated process that aids physicians in their diagnostic process. We are keen to collaborate in areas where datasets are available and where we can formulate a research problem in reasonable detail. We can work in several scenarios: through small exploratory projects with a group of undergraduate students (3-4 students, with the caveat that the problem needs to be very well defined), with an undergraduate research student, or with students at Master's level. This can be discussed on a case-by-case basis.


Zdeněk Dvořák
Flow-critical graphs

Tuesday 24 October 2023, 14:30, lecture hall A217

Many important results in graph coloring were obtained through the study of critical graphs, i.e., minimal obstructions to having a certain chromatic number. By a famous result of Tutte, nowhere-zero flows are dual to coloring, and it is natural to ask whether progress on some of the many open questions concerning nowhere-zero flows could be made through the study of suitably defined flow-critical graphs. The talk will survey recent results and open questions on this topic.


Jakub Gajarský
Designing efficient graph algorithms using logic

Tuesday 31 October 2023, 14:30, lecture hall A217

We will focus on a logic-based approach to designing efficient algorithms for many basic graph-theoretic problems (such as k-dominating set, k-independent set, subgraph isomorphism, and many of their variants). The basic idea is simple: instead of manually designing the algorithm for the problem we are trying to solve, we derive the algorithm for the problem automatically from its definition (essentially, we can have an algorithm for producing algorithms).

This area of research has been intensively studied over the past 25 years and has recently undergone major developments. First, we will give a high-level overview of this research area, and then we will focus on a new result that tells us when we can improve upon the existing state-of-the-art results in this field, such as the seminal result of Grohe, Kreutzer and Siebertz. Along the way, we will give a brief introduction to the basic notions of the structural theory of sparse graphs, with the hope of making them accessible to a general computer science audience.


Louis Esperet
TBA

Tuesday 7 November 2023, 14:30, lecture hall A217

TBA


PhD Fest - Presentations by local PhD students

Tuesday 14 November 2023, 14:30, lecture hall A217

Oldřich Pecák

TBA

Dávid Halász
Trust Building via Adaptive Safety in Autonomous Ecosystems

The evolving landscape of autonomous cyber-physical systems is progressing towards cooperative ecosystems. These complex structures, characterized by dynamic interactions among various autonomous systems, offer heightened autonomy and adaptability but pose substantial challenges in ensuring safety. The swift development of autonomous driving underlines the urgency to address these issues. Existing safety assurance methods, while effective on an individual level, struggle to encompass the complexities of dynamic ecosystems. To bridge this gap, this lecture advocates for adaptive safety mechanisms informed by trust and trustworthiness between ecosystem members and proposes a method ensuring adaptive safety.


Peer Kröger
TBA

Tuesday 21 November 2023, 14:30, lecture hall A217

TBA


Phil Newton
Evidence-Based Learning, Teaching and Assessment

Tuesday 28 November 2023, 14:30, lecture hall A217

In this session we will cover the basic principles of how humans learn, and how those principles can be translated into teaching and assessment strategies for academics and learning strategies for students. We will review some common teaching strategies and consider the evidence on whether or not they are effective, and how best to use them (or not!).


Lukáš Sekanina
Evolutionary Design for Computer Engineering

Tuesday 5 December 2023, 14:30, lecture hall A217

Evolutionary design, i.e., the use of evolutionary algorithms for the automated creation of programs, electronic circuits, antennas, robots, and other objects, has become a fruitful approach in computer science and engineering over the last two decades. This talk surveys the key ingredients of evolutionary design methods and presents several techniques for improving the scalability of this approach. Examples of evolved solutions (such as approximate arithmetic circuits, CNN architectures, and image filters) that show unique properties compared to conventional designs will be presented.


Miroslav Svítek
TBA

Tuesday 12 December 2023, 14:30, lecture hall A217

TBA


Past colloquia