Speeding Up String Pattern Matching by Text Compression : The Dawn of a New Era

この論文をさがす

抄録

This paper describes our recent studies onstring pattern matching in compressed textsmainly from practical viewpoints.The aim is to speed up the string pattern matching task in comparison with an ordinary search over the original texts.We have successfully developed (1) an AC type algorithmfor searching in Huffman encoded files and (2) a KMP typealgorithm and (3) a BM type algorithm for searchingin files compressed by the so-called byte pair encoding (BPE).Each of the algorithms reduces the search timeat nearly the same rate as the compression ratio.Surprisingly the BM type algorithm runs over BPE compressed filesabout $1.2$--$3.0$ times faster thanthe exact match routines of the software package {?tt agrep} which is known as the fastest pattern matching tool.

This paper describes our recent studies onstring pattern matching in compressed textsmainly from practical viewpoints.The aim is to speed up the string pattern matching task,in comparison with an ordinary search over the original texts.We have successfully developed (1) an AC type algorithmfor searching in Huffman encoded files, and (2) a KMP typealgorithm and (3) a BM type algorithm for searchingin files compressed by the so-called byte pair encoding (BPE).Each of the algorithms reduces the search timeat nearly the same rate as the compression ratio.Surprisingly, the BM type algorithm runs over BPE compressed filesabout $1.2$--$3.0$ times faster thanthe exact match routines of the software package {\tt agrep}, which is known as the fastest pattern matching tool.

収録刊行物

被引用文献 (6)*注記

もっと見る

参考文献 (35)*注記

もっと見る

詳細情報 詳細情報について

問題の指摘

ページトップへ