Colloquium program with abstracts for the Autumn 2011 semester

27. 9. 2011

Mgr. Vojtěch Malínek, Institute of Czech Literature, Academy of Sciences of the Czech Republic, Prague

The Retrobi System

Abstract: The Retrobi system is being developed to present the data of the card catalogue of the Retrospective Bibliography of Czech Literary Studies (Retrospektivní bibliografie české literární vědy), which, with approximately 1.75 million cards, is among the largest in the Czech Republic. The system handles the processing of input data from scanning through to publication in a web application: in particular, automatic detection of blank pages, joining of records that span several cards, and enrichment of the scanned images with their OCR transcriptions, together with the associated migration processes and control mechanisms. Beyond basic presentation of the catalogue's image data through simple browsing, the web application offers full-text search over the OCR transcriptions; online correction of the published text data by registered users, including an administrative interface for reviewing those corrections and bulk-editing functions; and the ability to create and further process arbitrary user-defined searches within the system.

4. 10. 2011

RNDr. Tomáš Brázdil, Ph.D., FI MU

How to Make Optimal Decisions in a Random Environment

Abstract: In practice we are often forced to make repeated decisions whose consequences we can estimate only approximately. Examples can be found in many areas, from management through the control of industrial processes to biological experiments. It turns out that the quality of decisions can be substantially improved by using mathematical modelling and analysis. Markov decision processes are the basic formalism for modelling precisely those decisions that exhibit elements of quantified uncertainty, i.e. where we can only estimate the probabilities of the possible consequences. A simple example is betting at roulette, where we do not know the resulting gain in advance but can estimate the probabilities of its possible values. An optimal decision can be made only if the values or costs of our actions (decisions) are defined. In this lecture I will focus on decision processes in which the goal is to maximize a certain value of actions over a long-term horizon. The lecture will include a presentation of the latest results from the theory of Markov decision processes whose actions are valued by multi-dimensional vectors of real numbers, which allows optimization with respect to several criteria.
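As an illustration only, the kind of model the abstract describes can be sketched in a few lines of Python. The two-state maintenance example, its probabilities and rewards, and the use of a discount factor `gamma` are all assumptions made for this sketch; they are not taken from the talk, which concerns long-run and multi-dimensional objectives rather than simple discounting.

```python
from typing import Dict, List, Tuple

# Hypothetical two-state maintenance MDP (an assumption, not from the talk).
# MDP[state][action] is a list of (probability, next_state, reward) outcomes.
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

MDP: Transitions = {
    "working": {
        "run": [(0.9, "working", 10.0), (0.1, "broken", 10.0)],
        "service": [(1.0, "working", 6.0)],
    },
    "broken": {
        "repair": [(1.0, "working", -20.0)],
        "wait": [(1.0, "broken", 0.0)],
    },
}


def q_value(mdp: Transitions, values: Dict[str, float],
            state: str, action: str, gamma: float) -> float:
    """Expected reward of an action plus the discounted value of what follows."""
    return sum(p * (reward + gamma * values[nxt])
               for p, nxt, reward in mdp[state][action])


def value_iteration(mdp: Transitions, gamma: float = 0.9, tol: float = 1e-9):
    """Repeat the Bellman optimality update until the values stop changing."""
    values = {state: 0.0 for state in mdp}
    while True:
        new_values = {
            state: max(q_value(mdp, values, state, a, gamma) for a in mdp[state])
            for state in mdp
        }
        delta = max(abs(new_values[s] - values[s]) for s in mdp)
        values = new_values
        if delta < tol:
            break
    # The greedy policy picks, in each state, the action with the best Q-value.
    policy = {
        state: max(mdp[state], key=lambda a: q_value(mdp, values, state, a, gamma))
        for state in mdp
    }
    return values, policy


if __name__ == "__main__":
    values, policy = value_iteration(MDP)
    print(policy)
    print(values)
```

Under these assumed numbers the sketch prefers running the machine and repairing it when it breaks; changing the rewards or probabilities changes the policy, which is the point of defining the values of actions before optimizing.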

11. 10. 2011

doc. RNDr. Tomáš Pitner, Ph.D., FI MU

Imagine Cup: A Talent Competition and the Successes of the FI Team on the World Stage

Abstract: The Microsoft Imagine Cup (www.imaginecup.com) is the world’s premier student technology competition. It provides an opportunity for students to use their creativity, passion, and knowledge of technology to help solve global challenges. It is held annually in several categories, mostly for teams, but it also offers opportunities for talented individuals. The team Celebrio Software from the Lasaris research lab, Faculty of Informatics, succeeded in this year's world finals in New York, placing 7th to 18th among 67 teams in the finals of the prestigious Software Design category. The colloquium will try to bring a bit of the competition's goals, profile, and spirit to the Faculty in order to inspire others to follow Celebrio. In particular, first-hand experience of the evaluation process and criteria, as well as examples of other successful projects and trends, will be presented.

(The presentation will be held jointly with the Celebrio team members, in Czech or English.)

18. 10. 2011

prof. RNDr. Jiří Zlatuška, CSc., FI MU

Research Evaluation in Informatics: Approaches and Problems

Abstract: Compared with approaches in older, more established scientific disciplines, workable approaches to evaluating research in informatics have some distinctive features. These are reflected in the special role of conference papers in publishing new results, and in field-specific features of research that is not a purely theoretical discipline. The Czech research evaluation methodology is unique, and its negative effects on the science system as a whole are documented not only by domestic critics but now also by the results of the international audit of research and development evaluation in the Czech Republic. That audit specifically assesses the negative effects of equating evaluation with (or substituting it for) decisions on the allocation of funding. The differences and methodological starting points of evaluation in informatics, which come from the US National Research Council, the Computing Research Association, and Informatics Europe, provide a set of recommendations that can be used both to identify unsuitable features of an evaluation system and to specify the requirements that a workable system should meet. Larger documented evaluations from abroad can serve as case studies of applied practice in this area.

25. 10. 2011

Prof. Dr. Ramin Yahyapour, GWDG Göttingen, Germany

Resource Management in Grid and Cloud Systems

Abstract: While Grids have become a common production infrastructure for several scientific research disciplines, Cloud computing has gained a broad customer base for mainstream commercial applications. This talk addresses practical and theoretical scheduling problems for Grid systems and current work on supporting service level management for cloud infrastructures. An outlook will be given on application scenarios for utilizing virtualization technologies in scientific infrastructures.

1. 11. 2011

Doc. Mgr. Vít Vondrák, Ph.D., VŠB-TU Ostrava, IT4Innovations Centre of Excellence

Development of Scalable Algorithms for Solving Very Demanding Engineering Problems

Abstract: FETI (Finite Element Tearing and Interconnecting), a domain decomposition method first introduced by Farhat and Roux, has proven to be one of the most successful methods for the parallel solution of linear problems described by elliptic partial differential equations. Its main feature is the decomposition of the domain into non-overlapping subdomains that are "glued" together by Lagrange multipliers, so that after elimination of the primal variables the original problem reduces to a small, relatively well-conditioned quadratic programming problem with linear constraints, which is then solved iteratively.

The presentation will introduce an efficient, massively parallel implementation of our variant of the FETI domain decomposition method, which we call Total FETI. Its efficiency will be demonstrated on complex engineering problems such as problems with material or geometric nonlinearities and optimal design problems. Finally, a completely new variant of the FETI method will be presented, which we call H-FETI (Hybrid FETI) and which allows parallel implementation on hundreds of thousands of cores.
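For orientation, the reduction to a quadratic program mentioned in the abstract can be written out as follows. This is a sketch in commonly used FETI notation, which is an assumption here and not necessarily the notation of the talk: K is the block-diagonal stiffness matrix of the torn subdomains, f the load vector, B the matrix that enforces continuity across subdomain interfaces, R a matrix whose columns span the kernel of K, and K^+ a generalized inverse of K.

```latex
% Assumed notation (not from the abstract): K, f, B, R, K^{+} as described above.
\begin{align*}
  % Primal problem on the torn domain, with gluing constraints
  &\min_{u}\ \tfrac{1}{2}\, u^{T} K u - f^{T} u
      \quad \text{subject to } B u = 0, \\
  % Stationarity of the Lagrangian and solvability condition for singular K
  &K u = f - B^{T} \lambda, \qquad R^{T} \bigl( f - B^{T} \lambda \bigr) = 0, \\
  % Dual problem after eliminating the primal variables u
  &\min_{\lambda}\ \tfrac{1}{2}\, \lambda^{T} F \lambda - \lambda^{T} d
      \quad \text{subject to } G \lambda = e, \\
  &F = B K^{+} B^{T}, \quad d = B K^{+} f, \quad
   G = R^{T} B^{T}, \quad e = R^{T} f .
\end{align*}
```

The dual problem has one unknown per interface constraint rather than one per degree of freedom, which is why it is much smaller than the original system.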

8. 11. 2011

doc. RNDr. Petr Sojka, Ph.D., FI MU

The Art of Mathematics Retrieval

Abstract: The design and architecture of MIaS (Math Indexer and Searcher), a system for mathematics retrieval, are presented, and design decisions are discussed. We argue for an approach based on Presentation MathML using a similarity of math subformulae. The system was implemented as a math-aware search engine based on the state-of-the-art system Apache Lucene and is used in the European Digital Mathematics Library (EuDML).

Scalability issues were checked against more than 400,000 arXiv documents with 158 million mathematical formulae. Almost three billion MathML subformulae were indexed using a Solr-compatible Lucene.

15. 11. 2011

Prof. RNDr. Radim Bělohlávek, DSc., PřF UP Olomouc

Formal concept analysis of data with fuzzy attributes: recent developments and related topics

Abstract: Formal concept analysis (FCA) is a method of data analysis with roots in traditional, Port-Royal logic, and applications in various areas including software engineering, information retrieval, homeland security, and psychology. At the core of FCA are the mathematics and algorithms for relational data, in particular for closure structures, Galois connections, and finite partially ordered sets. In the basic setting, FCA works with binary data. The talk will provide an overview of an extension of FCA toward data with fuzzy (graded, ordinal) attributes whose foundations have been developed by the speaker and his group over the past ten years. The talk will survey the basic structures behind FCA of data with fuzzy attributes, algorithms, and relationships to FCA of data with binary attributes. In addition, connections to some recent topics such as factor analysis of relational data and the relational model of data over domains with similarities will be presented.
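To make the basic binary setting concrete, here is a minimal brute-force sketch that enumerates all formal concepts of a small object-attribute table. The context `CONTEXT` and the function names are hypothetical, chosen for illustration; the sketch covers only binary attributes, not the fuzzy extension the talk is about, and it enumerates all object subsets, so it is suitable only for tiny examples.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical binary context: each object maps to the attributes it has.
CONTEXT: Dict[str, Set[str]] = {
    "frog": {"aquatic", "terrestrial"},
    "fish": {"aquatic"},
    "dog": {"terrestrial", "mammal"},
}


def common_attributes(objects: Iterable[str], context: Dict[str, Set[str]],
                      all_attributes: FrozenSet[str]) -> FrozenSet[str]:
    """Attributes shared by every object in the given set (the 'up' operator)."""
    shared = set(all_attributes)
    for obj in objects:
        shared &= context[obj]
    return frozenset(shared)


def objects_having(attributes: FrozenSet[str],
                   context: Dict[str, Set[str]]) -> FrozenSet[str]:
    """Objects that have every attribute in the given set (the 'down' operator)."""
    return frozenset(obj for obj, attrs in context.items() if attributes <= attrs)


def formal_concepts(
    context: Dict[str, Set[str]],
) -> Set[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Return all (extent, intent) pairs closed under the two operators."""
    all_attributes = frozenset().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)  # closure of the subset
            concepts.add((extent, intent))
    return concepts


if __name__ == "__main__":
    for extent, intent in sorted(formal_concepts(CONTEXT),
                                 key=lambda c: (len(c[0]), sorted(c[0]))):
        print(sorted(extent), sorted(intent))
```

The pair of operators used here forms the Galois connection mentioned in the abstract, and the resulting concepts, ordered by inclusion of extents, form a finite partially ordered set (the concept lattice).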

22. 11. 2011

Prof. Václav Rajlich, Wayne State University, Detroit

Current trends in development and teaching of software engineering

Abstract: This lecture reviews the challenges and constraints that the instructor of a software engineering course faces. It argues that the best introduction to the software engineering discipline is training in the role of a developer in a directed iterative process (DIP), where the most common task is software change (SC). In the course projects, students practice their skills by working on medium-sized open-source software systems, while the instructor supplies all the remaining DIP roles. A comprehensive overview of SC phases, including refactoring, concept location, impact analysis, and unit testing, is the core of the course. At the end, the course briefly reviews the rest of the software engineering discipline.

The results show that this organization of the course gives students a more realistic experience than traditional software engineering courses. The course has been taught repeatedly at Wayne State University and the students have expressed a high level of satisfaction. The resources required by such a course are comparable to other computer science courses. A new textbook supporting this approach is introduced [1].

[1] Vaclav Rajlich, Software Engineering: The Current Practice, CRC Press, 2011

29. 11. 2011

Dr. Reinhold Huber-Mörk, Austrian Institute of Technology, Vienna

Automatic coin classification and identification

Abstract: We investigate object recognition and classification in a setting with a large number of classes, as well as recognition and identification of individual objects of high similarity. Real-world data sets were obtained for the classification and identification tasks. The classification task considered is the discrimination of modern coins into several hundred different classes. Identification is investigated for hand-made ancient coins. Intra-class variance due to wear and abrasion, set against small inter-class variance, makes the classification of modern coins challenging. For ancient coins the intra-class variance makes the identification task possible, as the appearance of individual hand-struck coins is unique. We will present methods for coin image classification and identification, and results for large real-world data sets of modern and ancient coins.

6. 12. 2011

Prof. Herbert Edelsbrunner, IST Austria, Vienna

Alexander duality for functions

Abstract: Consider a decomposition of the (n+1)-sphere into spaces U and V whose intersection is an n-manifold, M. Alexander duality relates the homology of U with that of V, and using the Mayer-Vietoris exact sequence, we get a relation between the homology of M and U. This talk presents extensions of this classical version of Alexander duality to real-valued functions. One of the results is the following:

Let A be a compact set in R^{n+1}, let its boundary dA be an n-manifold, and let f: R^{n+1} --> R be a smooth function without critical points whose restriction to dA is tame. Then the persistence diagram of f restricted to dA is the disjoint union of the persistence diagram of f restricted to A and the reflection of this diagram.

Joint work with Michael Kerber.

13. 12. 2011

RNDr. Jan Pomikálek, Ph.D., FI MU

doc. PhDr. Karel Pala, CSc., FI MU

Web corpus in one click

Abstract: Text corpora have a wide range of applications in natural language processing. A source of data for corpora that has become very popular in recent years is the web. However, there are many challenges related to web corpus creation, such as web crawling, character encoding detection, language identification, removing junk, and de-duplication. We believe we have found adequate solutions for all these problems. We have also developed software tools which make it possible to create a web corpus in a fully automated way for any language with a reasonable representation on the web and on Wikipedia in particular. Our latest experiments show that for "large" languages (such as English or Spanish) we can collect as much as one billion words of clean text without duplicates in a day using a single powerful server. We can also easily create corpora for less-resourced languages, such as Tajik, though, of course, at a much lower speed. The web corpus creation can be automated to the point where the only required input is the Wikipedia code of the target language. In the presentation, we will describe our processing pipeline, discuss how some of the biggest challenges are addressed, and show our preliminary results.