A List by Author: David Novák
- e-mail:
- xnovak8(a)fi.muni.cz
- telephone:
- +420549495062
Employing Subsequence Matching in Audio Data Processing
by
Petr Volny,
David Novák,
Pavel Zezula,
September 2011, 29 pages.
FIMU-RS-2011-04.
Available as Postscript,
PDF.
Abstract:
We give an overview of current problems in audio retrieval and time-series subsequence matching. We discuss the use of subsequence matching approaches in audio data processing, especially in the area of automatic speech recognition (ASR), and we aim at improving the performance of the retrieval process. To overcome problems known from the time-series area, such as implementation bias and data bias, we present a Subsequence Matching Framework as a tool for fast prototyping, building, and testing of similarity-search subsequence matching applications. The framework is built on top of MESSIF (Metric Similarity Search Implementation Framework), and thus the subsequence matching algorithms can exploit advanced similarity indexes to significantly increase their query processing performance. To prove our concept, we provide a design of a query-by-example spoken term detection application that uses phonetic posteriorgrams and a subsequence matching approach.
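As a rough illustration of the subsequence matching task mentioned in the abstract above, the following minimal sketch scans a series with a sliding window and ranks the windows by Euclidean distance to the query. It is not the Subsequence Matching Framework or the MESSIF API; the function names `euclidean` and `best_subsequence_matches` and the choice of Euclidean distance are assumptions made only for this example, and no index is used.

```python
import math
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_subsequence_matches(
    series: Sequence[float], query: Sequence[float], k: int = 1
) -> List[Tuple[float, int]]:
    """Return the k closest windows as (distance, start offset) pairs.

    Naive sequential scan (hypothetical helper, not taken from the report):
    every window of len(query) consecutive values is compared to the query.
    """
    w = len(query)
    if w == 0 or w > len(series):
        return []
    scored = [
        (euclidean(series[i:i + w], query), i)
        for i in range(len(series) - w + 1)
    ]
    scored.sort()  # smallest distance first, ties broken by offset
    return scored[:k]


if __name__ == "__main__":
    signal = [0, 1, 2, 3, 2, 1, 0]
    print(best_subsequence_matches(signal, [2, 3, 2], k=2))
```

A sequential scan like this compares the query with every window; the abstract's point is that a metric index can avoid most of these comparisons.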
LOBS: Load Balancing for Similarity Peer-to-Peer Structures
by
David Novák,
Pavel Zezula,
June 2007, 36 pages.
FIMU-RS-2007-04.
Available as Postscript,
PDF.
Abstract:
Real-life experience with similarity search shows that this task is
both difficult and very expensive in terms of processing time. Peer-to-peer
structures seem to be a suitable solution for content-based retrieval in huge
data collections. In these systems, the computational load generated by query
traffic is highly skewed, which degrades the search performance. Since no
current load-balancing techniques are designed for this task, we propose LOBS,
a novel and general system for load balancing in peer-to-peer structures with
time-consuming searching. LOBS is based on the following principles: measuring the
computational load of the peers, separation of the logical and the physical
level of the system, and detailed analysis of the load source to exploit either
data relocation or data replication.
This report contains a detailed description of the fundamentals and specific
algorithms of LOBS, a theoretical analysis of its behaviour, and results of
extensive experiments we conducted using a prototype implementation of LOBS.
We tested LOBS on the peer-to-peer structure M-Chord with varying
numbers of peers. We used a real-life dataset and query sets of various
distributions. The results show that LOBS is able to cope with any
query distribution and that it improves both the utilization of resources and
the system performance of query processing. The costs of balancing are
reasonable compared to the level of imbalance and are very small if the system has
time to adapt to the query traffic. The behaviour of LOBS is independent
of the size of the network.
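The principles named in the abstract above (measuring peer load, separating logical nodes from physical peers, and choosing between relocation and replication according to the load source) can be sketched roughly as follows. This is not the LOBS algorithm from the report: the `Peer` class, the `suggest_action` function, the `overload_factor` threshold, and the decision rule itself are all hypothetical and serve only to make the idea concrete.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Peer:
    """A physical peer hosting several logical nodes (hypothetical model)."""
    name: str
    node_loads: Dict[str, float]  # logical node id -> measured load

    @property
    def load(self) -> float:
        # Load of the peer is the sum of the loads of its logical nodes.
        return sum(self.node_loads.values())


def suggest_action(peer: Peer, average_load: float,
                   overload_factor: float = 2.0) -> str:
    """Suggest a balancing action for one peer.

    Assumed rule, for illustration only:
    - peer not overloaded                      -> "none"
    - a single logical node exceeds the limit  -> "replicate" that node
    - load is spread over several nodes        -> "relocate" one of them
    """
    limit = overload_factor * average_load
    if peer.load <= limit:
        return "none"
    hottest = max(peer.node_loads.values())
    if hottest > limit:
        return "replicate"
    return "relocate"


if __name__ == "__main__":
    peers = [
        Peer("p1", {"a": 1.0, "b": 1.0}),
        Peer("p2", {"a": 3.0, "b": 3.0}),
        Peer("p3", {"a": 9.0}),
    ]
    for p in peers:
        print(p.name, suggest_action(p, average_load=2.0))
```

In this sketch, relocation helps when a peer is overloaded because it hosts too many moderately busy nodes, while replication is suggested when one node is hot on its own and moving it would only move the problem.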