Nice phrases for papers
General
- Preliminaries and Related Work
- aims to advance the state of the art in ...
- From another perspective, sth. can be viewed as ...
Information Fusion
- Multimodal information fusion aims at jointly interpreting multiple sources of information that represent the same underlying "concept". The main goal is the extraction of information.
- By fusing information, one aims at: 1) being more accurate in the discovery of the "concept" (each individual stream may be incomplete); and 2) being more robust in the discovery of the "concept" (each individual stream may be distorted, e.g., noisy).
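A minimal late-fusion sketch (my own illustration, not from any cited paper; the streams, scores, and equal weights are hypothetical) showing both goals: averaging per-stream concept scores lets complementary streams fill each other's gaps and dampens per-stream noise.

    import numpy as np

    def late_fusion(stream_scores, weights=None):
        """Fuse per-stream class-probability vectors by weighted averaging.

        stream_scores: list of 1-D arrays, one per modality (e.g., visual,
        textual), each a distribution over the same set of concept labels.
        """
        scores = np.asarray(stream_scores, dtype=float)
        if weights is None:
            weights = np.ones(len(scores)) / len(scores)
        fused = np.average(scores, axis=0, weights=weights)
        return fused / fused.sum()

    # Toy example: the visual stream is ambiguous and the textual stream is
    # noisy, but their fusion clearly recovers the correct concept (index 0).
    visual  = np.array([0.40, 0.35, 0.25])
    textual = np.array([0.50, 0.10, 0.40])
    print(late_fusion([visual, textual]))  # [0.45, 0.225, 0.325]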
Reasons for annotation
- web content structuring and retrieval
- Although considerable research effort has been devoted to concept and event detection in public and private images, the task remains difficult because the number of possible elements that can be depicted is boundless, and their visual appearance can furthermore vary along numerous dimensions.
- Image annotation techniques automatically assign metadata, usually keywords, to images, which makes it easier to index and maintain large image collections; the problem has therefore been studied actively over the last few decades, particularly in image retrieval.
- The ever-growing collection of Internet images densely samples real-world objects, scenes, etc., and is commonly accompanied by rich metadata such as textual descriptions and user comments. Such image data has the potential to serve as a knowledge source for large-scale image applications. Facilitated by this publicly available and ever-increasing loosely annotated image data on the Internet, we propose a scalable data-driven solution for annotating and retrieving Web-scale image data.
- Image classification and image annotation are two classical problems in computer vision. Given an image, image classification tells us what the theme of the image is (its high-level semantic meaning), while image annotation tells us what objects are inside the image and what their properties are (tags describing the image components).
- Although it has been studied extensively for many years, automatic image annotation remains a challenging problem. Recently, data-driven approaches have demonstrated great success in image auto-annotation. Such approaches leverage abundant partially annotated web images to annotate an uncaptioned image: given the uncaptioned image as a query, they first retrieve a group of closely similar images and then mine meaningful phrases from the surrounding texts of the search results. Since the surrounding texts are generally noisy, effectively mining meaningful phrases is crucial to the success of such approaches.
- Due to the large number of web images, it is crucial to develop techniques that quickly guide users to the images they are interested in, and image search is one such technique. Among various types of querying methods [15], [43], the query-by-keyword (QBK) image search scheme dominates commercial search engines such as Google, Bing, and Yahoo!.
A key hindering factor for QBK is that a large number of web images have no textual descriptions, or have texts that do not describe their content. To solve this problem, researchers have worked for a decade on automatic image annotation, which generates texts that describe the semantics of an image.
The key challenge of image annotation is the semantic gap between visual features and semantic concepts. No semantic metrics have been successfully defined on top of existing visual features such as color, texture, and local features. To tackle the semantic gap problem, various techniques have been proposed. Traditional approaches generally work on small-scale databases with limited vocabularies [20], [28], [37], whereas the recent advent of the Web has inspired research on learning from partially and noisily labeled web images, among which search-based annotation is a remarkable branch [1], [5], [40], [41].
The basic idea of search-based annotation is as follows: imagine there is a well-annotated, unlimited image database in which, for any uncaptioned image, we can find its duplicates. An image can then be annotated simply by propagating the texts from its duplicates. In the more realistic case of a limited database, we can instead search for closely similar images (of which duplicates are a special case), extract texts from the textual descriptions of the image search results, and use the most salient texts to annotate the query image. (from Duplicate-Search-Based Image Annotation Using Web-Scale Data)
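A minimal sketch of this search-based scheme, under my own simplifying assumptions (precomputed feature vectors, Euclidean nearest-neighbour search, word frequency as the saliency measure; none of these names come from the cited paper): retrieve the visually most similar captioned images and propagate the most salient words from their descriptions.

    from collections import Counter
    import numpy as np

    def annotate_by_search(query_feat, db_feats, db_texts, k=50, n_tags=5):
        """Annotate an uncaptioned query image by propagating salient words
        from its k visually nearest neighbours in a captioned database.

        query_feat: 1-D visual feature vector of the query image
        db_feats:   (N, D) feature matrix for N captioned database images
        db_texts:   list of N description strings (noisy web texts)
        """
        # Retrieve the k most similar images; a duplicate is simply the
        # special case of distance zero.
        dists = np.linalg.norm(db_feats - query_feat, axis=1)
        neighbours = np.argsort(dists)[:k]

        # Mine salient texts: here just the most frequent words across the
        # neighbours' descriptions; real systems use tf-idf or phrase mining
        # to cope with the noisy surrounding texts.
        words = Counter()
        for i in neighbours:
            words.update(db_texts[i].lower().split())
        return [w for w, _ in words.most_common(n_tags)]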
Annotations - state of the art
- Recent work on CBIR can be divided into specialized annotators that limit their domain (Vailaya, Jain, and Zhang 1998; Szummer and Picard 1998) and generalized annotators (Müller et al. 2004; Li and Wang 2008) that sacrifice precision for broad applicability. In general annotators, an ensemble of specialized annotators could provide precision while expanding the applicable domains. The challenge lies in automatically selecting the appropriate specialized annotators.
- The success of media-sharing websites has led to the availability of large collections of images tagged with human-provided labels.
Conclusions
- Possible extensions of this work include the exploration of how...