Ilhan Karić, Zanin Vejzović



Quasilinear-Time Search and Comparison for Sequential Data

pdf PDF



This paper proposes a new algorithm for the evaluation of similarity between two sequences in quasilinear time. It describes the theoretical, practical and implementational aspects of the algorithm. The proposed method is a new approach dedicated to the computation of sequential similarity in contrast to other methods like the Jaccard Index which although designed for the computation of similarity of sets have been frequently used on sequences. The method is generalizable and applicable to any form of sequential data of a finite alphabet (binary files, DNA sequences, natural language etc.)



Sequence Similarity, Comparison, Contextual Similarity, Quasilinear-Time Complexity



[1] X1. Konrad Rieck, Pavel Laskov, “Linear-Time Computation of Similarity Measures for Sequential Data”, Journal of Machine Learning Research 9 (2008) 23-48 pp. 1

[2] S. T. Piantadosi, “Zipf’s word frequency law in natural language: A critical review and future directions”, Psychonomic Bulletin & Review, vol. 21, 2014, pp. 1112-1130 

Cite this paper

Ilhan Karić, Zanin Vejzović. (2017) Quasilinear-Time Search and Comparison for Sequential Data. International Journal of Computers, 2, 161-165


Copyright © 2017 Author(s) retain the copyright of this article.
This article is published under the terms of the Creative Commons Attribution License 4.0