+The second decisive part of the source retrieval task is choosing which of the search engine results to download.\r
+This process is represented in figure~\ref{fig:source_retr_process} as `Selecting'.\r
+Nowadays, downloading is a very cheap and quick operation in practice. Disk space may become a consideration\r
+if the original downloaded documents need to be stored. The main cost lies in post-processing the documents.\r
+The Internet hosts a wide range of file formats, which must be\r
+converted into plain text for text alignment. This can be time-consuming and computationally expensive. For example, plain text\r
+is hard to extract from many PDF documents, so one needs to resort to optical character recognition methods.\r
+\r
+ChatNoir offers snippets for the documents it discovers. Snippet generation is considered a costless\r
+operation. The purpose of a snippet is to give a quick glance at a small extract of the resulting page.\r
+The extract is at most 500 characters long and is a portion of the document around the given keywords.\r
+On the basis of the snippet, we needed to decide whether or not to actually download the result.\r
+\r
+Since the snippet is relatively small and can be a discontinuous part of the text, the\r
+text alignment methods described in section~\ref{text_alignment} were insufficient for\r
+\r
+\r