Dimensionality Reduction of the Vector Space Information Retrieval Model Using Simple PCA

Bibliographic details

Alternative titles
  • Dimensionality Reduction of Vector Space Model for Information Retrieval using Simple Principal Component Analysis
  • Simple PCA o mochiita bekutoru kūkan jōhō kensaku moderu no jigen sakugen

Abstract

In this paper, we propose using Simple Principal Component Analysis (SPCA) for dimensionality reduction of the vector space information retrieval model. SPCA is a fast, data-oriented algorithm that does not require computing the variance-covariance matrix. Because SPCA estimates principal components iteratively, we also propose a criterion for determining convergence; with this criterion, the optimum number of iterations for each principal component can be determined. Experiments show that the SPCA-based method offers an improvement over the conventional SVD-based method despite requiring only a small amount of computation. This advantage can be attributed to SPCA's iterative procedure, which is similar to clustering methods such as k-means clustering. In addition, the proposed method, which orthogonalizes the basis vectors, achieved much higher accuracy than the conventional random projection method based on k-means clustering.
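
As an illustration of the kind of procedure the abstract describes, the following is a minimal sketch of an iterative, covariance-free estimate of orthogonal principal components. It is not the authors' implementation: the abstract specifies neither the update rule nor the convergence criterion, so the update used here (a power-iteration-style step computed directly from the data), the cosine-based stopping test, and the names simple_pca, max_iter, and tol are all assumptions made for illustration.

```python
import numpy as np


def simple_pca(X, n_components, max_iter=200, tol=1e-8, seed=0):
    """Estimate principal components iteratively from the rows of X.

    Hypothetical sketch: the update rule and the stopping test are
    assumptions, not the rule or criterion proposed in the paper.
    X has shape (n_documents, n_terms).
    """
    rng = np.random.default_rng(seed)
    R = X - X.mean(axis=0)              # centered data, deflated as we go
    basis = []
    for _ in range(n_components):
        a = rng.standard_normal(R.shape[1])
        a /= np.linalg.norm(a)
        for _ in range(max_iter):
            y = R @ a                   # project every document onto a
            a_new = R.T @ y             # data-driven update, no covariance matrix
            norm = np.linalg.norm(a_new)
            if norm == 0.0:             # nothing left to explain
                break
            a_new /= norm
            # Assumed stopping test: successive estimates are nearly parallel.
            converged = 1.0 - abs(a_new @ a) < tol
            a = a_new
            if converged:
                break
        # Orthogonalize against earlier components (Gram-Schmidt).
        for b in basis:
            a = a - (a @ b) * b
        a /= np.linalg.norm(a)
        basis.append(a)
        # Deflate: remove the found direction from every document vector.
        R = R - np.outer(R @ a, a)
    return np.array(basis)              # shape (n_components, n_terms)
```

Given a document-term matrix X, a reduced representation could then be obtained as (X - X.mean(axis=0)) @ simple_pca(X, k).T for a chosen number of dimensions k. Each iteration only multiplies the data matrix by a vector, which is why no variance-covariance matrix has to be formed.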
