X-Git-Url: https://www.fi.muni.cz/~kas/git//home/kas/public_html/git/?a=blobdiff_plain;f=pan13-paper%2Fpan13-notebook.tex;fp=pan13-paper%2Fpan13-notebook.tex;h=2afc33564df960487dfdeae0b258a7fae3c2fff9;hb=ebba97ad24be305e65ceb7cfdbb34d54d9a6bfba;hp=8adaa7fbe8b4bb50701a04f4498a2289c6fa0aa5;hpb=9e3bea6abbc34854e6fc92ba08c2200290e685cd;p=pan13-paper.git

diff --git a/pan13-paper/pan13-notebook.tex b/pan13-paper/pan13-notebook.tex
index 8adaa7f..2afc335 100755
--- a/pan13-paper/pan13-notebook.tex
+++ b/pan13-paper/pan13-notebook.tex
@@ -30,11 +30,11 @@ with the described modifications, and further improvement is still possible.
 
 \section{Introduction}
 In PAN 2013 competition on plagiarism detection we participated in both the Source Retrieval
-and the Text Alignment subtask. In both tasks we adapted methodology used in PAN 2012\footnote{%
+and the Text Alignment subtasks. In both tasks we adapted methodology used in PAN 2012\footnote{%
 See \cite{pan2012} for an overview of PAN 2012 plagiarism detection campaign.}
 \cite{suchomel_kas_12}.
 Section~\ref{source_retr} describes querying approach for source retrieval, where we used three different types of queries. We present a new type of query based on text paragraphs.
-The query execution were controled by its type and by preliminary similarities
+The query execution was controlled by its type and by preliminary similarities
 discovered during the searches.
 Section~\ref{text_alignment} describes our approach for the text alignment
 (pairwise comparison) subtask. We briefly introduce our system,
@@ -48,8 +48,8 @@ the results achieved and further development.
 
 
 \section{Conclusions}
-We introduces querying strategy with snippet similarity measure which approved to be
-competitive. In source retrieval subtask the strategy performed with the second best ratio
+We introduced querying strategy with snippet similarity measure. %which approved to be competitive. 
+In source retrieval subtask the strategy performed with the second best ratio
 of recall to the number of used queries.
 We focused our queries on selected parts of text and on parts
 with no discovered external similarities.