Accuracy Evaluation of FFT-based Randomized Algorithms for String Matching with Mismatches (不一致を許す文字列照合のためのFFTを用いた確率的アルゴリズムの精度評価)

Tetsuya Nakatoh (中藤 哲也), Kensuke Baba (馬場 謙介), Daisuke Ikeda (池田 大輔), Masao Mori (森 雅生), Sachio Hirokawa (廣川 佐千男)

Bibliographic information

Alternative titles

フイッチオユルスモジレツショウゴウノタメノ FFT オモチイタカクリツテキアルゴリズムノセイドヒョウカ
Accuracy Evaluation of FFT-based Randomized Algorithms for String Matching with Mismatches

Abstract

String matching, the problem of finding a given pattern in a text, has a wide range of applications, including Web information retrieval and searching DNA sequences for specific patterns. Approximate string matching in which substitution is the only edit allowed on the pattern is called string matching with mismatches. Because it computes a match score at every position of the text, it is more expensive than exact string matching, which only locates the positions of exact occurrences. Several fast randomized algorithms based on the fast Fourier transform (FFT) have been proposed for this problem. Each introduces mappings that convert symbols into numbers, and the algorithms differ in the total number of mappings and in the accuracy of the resulting estimates, depending on how the mappings are generated. The algorithm we proposed [10] uses the theoretically minimum number of mappings and yields the most accurate estimates among the algorithms proposed so far. In this paper, we experimentally compare the accuracy of its estimates with that of the algorithm of Atallah et al. [1] and confirm that the proposed algorithm is more accurate.
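The general idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, and it is not claimed to reproduce the exact construction of Atallah et al.; it only shows the common pattern of mapping symbols to numbers, correlating text and pattern with an FFT, and averaging over several random mappings. The sketch assumes each symbol is mapped to an independent, uniformly random root of unity, so that a matching pair contributes exactly 1 to the correlation and a mismatching pair contributes 0 in expectation. All function names (`fft`, `correlate`, `estimate_match_counts`, `exact_match_counts`) and the parameter `num_mappings` are hypothetical names chosen for this illustration.

```python
import cmath
import random


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two (unnormalized)."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def correlate(t_vals, p_vals):
    """Return c[i] = sum_j t_vals[i + j] * conj(p_vals[j]) for every alignment i."""
    n, m = len(t_vals), len(p_vals)
    size = 1
    while size < n + m - 1:
        size *= 2
    a = list(t_vals) + [0j] * (size - n)
    # Reversing the conjugated pattern turns the correlation into a convolution.
    b = [v.conjugate() for v in reversed(p_vals)] + [0j] * (size - m)
    fa, fb = fft(a), fft(b)
    conv = fft([x * y for x, y in zip(fa, fb)], invert=True)
    return [conv[i + m - 1] / size for i in range(n - m + 1)]


def estimate_match_counts(text, pattern, num_mappings, rng):
    """Average the correlation over random symbol-to-root-of-unity mappings."""
    alphabet = sorted(set(text) | set(pattern))
    q = max(2, len(alphabet))  # assumption: roots of unity of order q >= 2
    totals = [0.0] * (len(text) - len(pattern) + 1)
    for _ in range(num_mappings):
        phi = {ch: cmath.exp(2j * cmath.pi * rng.randrange(q) / q)
               for ch in alphabet}
        c = correlate([phi[ch] for ch in text], [phi[ch] for ch in pattern])
        for i, v in enumerate(c):
            totals[i] += v.real
    return [s / num_mappings for s in totals]


def exact_match_counts(text, pattern):
    """Naive O(nm) reference: number of matching positions at each alignment."""
    m = len(pattern)
    return [sum(x == y for x, y in zip(text[i:i + m], pattern))
            for i in range(len(text) - m + 1)]


if __name__ == "__main__":
    text, pattern = "abracadabra", "abra"
    print(exact_match_counts(text, pattern))
    print(estimate_match_counts(text, pattern, 50, random.Random(0)))
```

In this sketch, alignments where the pattern matches completely always yield the full score, whatever mappings are drawn, while the estimates at other alignments fluctuate around the true count. Increasing `num_mappings` reduces that fluctuation at the cost of more FFTs, which is the trade-off between the number of mappings and the accuracy of the estimates that the abstract describes.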

Published in

IPSJ Transactions on Databases (TOD) 2 (4), 24-31, 2009-12-24

Information Processing Society of Japan (IPSJ)

Details

CRID: 1050564287852571136

NII Article ID: 110007990064

NII Bibliographic ID: AA11464847

ISSN: 18827799; 18827772; 03875806

HANDLE: 2324/1271101

NDL Bibliographic ID: 024306326

Web Site: http://id.nii.ac.jp/1001/00067273/; https://ndlsearch.ndl.go.jp/books/R000000004-I024306326

Text language code: ja

Resource type: article

Data source types

IRDB
NDL
CiNii Articles
KAKEN
