Dimensionality Reduction of Vector Space Information Retrieval Model Based on Random Projection (ランダム・プロジェクションによるベクトル空間情報検索モデルの次元削減)

Bibliographic Information

Alternative titles
  • Dimensionality Reduction of Vector Space Information Retrieval Model Based on Random Projection
  • ランダム プロジェクション ニ ヨル ベクトル クウカン ジョウホウ ケンサク モデル ノ ジゲン サクゲン

Abstract

The vector space model is a conventional information retrieval model in which text documents are represented as high-dimensional, sparse vectors that use words as features. These vectors require a large amount of computing resources, and they make it difficult to capture the underlying concepts that the terms refer to. In this paper, we present an information retrieval technique that addresses these problems by using random projection to map document vectors into a low-dimensional space. To evaluate its efficiency, we report the results of retrieval experiments on the MEDLINE test collection. The experiments show that the proposed method is faster than LSI (Latent Semantic Indexing) and achieves retrieval performance close to that of LSI. In addition, we propose using concept vectors produced by a spherical k-means algorithm as the vectors that random projection needs for dimensionality reduction. The evaluation shows that these concept vectors capture the underlying concepts of the corpus effectively.
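
The two ideas named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract does not specify how the projection matrix is built, so the Gaussian matrix, the toy term-document matrix, and the function names `random_projection_matrix`, `spherical_kmeans`, and `cosine_rank` are all assumptions made for illustration. The first variant projects with a random Gaussian matrix; the second uses unit-norm cluster centroids from a simple spherical k-means as the rows of the projection matrix.

```python
import numpy as np

def random_projection_matrix(n_terms, k, seed=0):
    """k x n_terms Gaussian matrix (an assumed choice), scaled by 1/sqrt(k)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((k, n_terms)) / np.sqrt(k)

def spherical_kmeans(X, k, n_iter=10, seed=0):
    """Cluster unit-normalised columns of X by cosine similarity.

    X is n_terms x n_docs. Returns a k x n_terms array whose rows are
    unit-norm centroids ("concept vectors" in this sketch).
    """
    rng = np.random.default_rng(seed)
    Xn = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    # Initialise centroids with k distinct documents.
    C = Xn[:, rng.choice(X.shape[1], size=k, replace=False)].T.copy()
    for _ in range(n_iter):
        labels = np.argmax(C @ Xn, axis=0)  # nearest centroid by cosine
        for j in range(k):
            members = Xn[:, labels == j]
            if members.shape[1] > 0:  # keep the old centroid if a cluster empties
                c = members.sum(axis=1)
                C[j] = c / (np.linalg.norm(c) + 1e-12)
    return C

def cosine_rank(query_vec, doc_matrix):
    """Return document indices sorted by descending cosine similarity."""
    q = query_vec / (np.linalg.norm(query_vec) + 1e-12)
    D = doc_matrix / (np.linalg.norm(doc_matrix, axis=0, keepdims=True) + 1e-12)
    return np.argsort(-(q @ D))

# Toy sparse term-document matrix (hypothetical data, not MEDLINE).
rng = np.random.default_rng(1)
n_terms, n_docs = 2000, 50
X = (rng.random((n_terms, n_docs)) < 0.02).astype(float)
query = X[:, 7].copy()  # use document 7 itself as the query

# Variant 1: Gaussian random projection to 200 dimensions.
R = random_projection_matrix(n_terms, k=200)
X_low = R @ X
ranking = cosine_rank(R @ query, X_low)

# Variant 2: concept vectors as the projection matrix (5 dimensions).
C = spherical_kmeans(X, k=5)
X_concept = C @ X
```

In both variants the query is projected with the same matrix as the documents, and ranking is done by cosine similarity in the reduced space. Unlike LSI, neither variant computes a singular value decomposition.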

Journal

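  • Journal of Natural Language Processing (自然言語処理)

    Journal of Natural Language Processing 8 (1), 5-19, 2001

    The Association for Natural Language Processing (一般社団法人 言語処理学会)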