Dimensionality Reduction of Vector Space Model for Information Retrieval using Simple Principal Component Analysis
- Kuroiwa Shingo, Faculty of Engineering, The University of Tokushima
- Tsuge Satoru, Faculty of Engineering, The University of Tokushima
- Shishibori Masami, Faculty of Engineering, The University of Tokushima
- Fuji Ren, Faculty of Engineering, The University of Tokushima
- Kita Kenji, Center for Advanced Information Technology, The University of Tokushima
Bibliographic Information
- Other Title
- Simple PCAを用いたベクトル空間情報検索モデルの次元削減
- Simple PCA オ モチイタ ベクトル クウカン ジョウホウ ケンサク モデル ノ ジゲン サクゲン
Description
In this paper, we propose using Simple Principal Component Analysis (SPCA) for dimensionality reduction of the vector space model for information retrieval. SPCA is a fast, data-oriented algorithm that does not require computing the variance-covariance matrix. Because SPCA estimates the principal components iteratively, we also propose a criterion for determining convergence, which can be used to find the optimum number of iterations for each principal component. Experimentally, we show that the SPCA-based method improves on the conventional SVD-based method despite its low computational cost. This advantage of SPCA can be attributed to its iterative procedure, which is similar to clustering methods such as k-means clustering. In addition, the proposed method, which orthogonalizes the basis vectors, achieved much higher accuracy than the conventional random projection method based on k-means clustering.
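The iterative idea summarized in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the update rule here is a plain linear one (equivalent to power iteration on the data without forming the covariance matrix), the convergence test is a simple cosine check between successive estimates rather than the criterion proposed in the paper, and the function name `simple_pca` and the parameters `max_iter`, `tol`, and `seed` are assumptions made for illustration.

```python
import numpy as np

def simple_pca(X, n_components, max_iter=100, tol=1e-6, seed=0):
    """Estimate principal components iteratively, one at a time.

    Illustrative sketch only: linear update rule and cosine-based
    stopping test are assumptions, not the paper's exact method.
    """
    rng = np.random.default_rng(seed)
    R = X - X.mean(axis=0)              # centered data (residual matrix)
    components = []
    for _ in range(n_components):
        a = rng.standard_normal(R.shape[1])
        a /= np.linalg.norm(a)
        for _ in range(max_iter):
            y = R @ a                   # projection of each row onto a
            a_new = R.T @ y             # sum_i y_i * x_i, no covariance matrix
            for c in components:        # keep the basis vectors orthogonal
                a_new -= (a_new @ c) * c
            norm = np.linalg.norm(a_new)
            if norm == 0.0:             # nothing left to explain
                break
            a_new /= norm
            converged = 1.0 - abs(a_new @ a) < tol
            a = a_new
            if converged:
                break
        components.append(a)
        R = R - np.outer(R @ a, a)      # remove the found direction from the data
    return np.vstack(components)
```

Under these assumptions, a document-term matrix `X` (rows are documents) could be reduced with `W = simple_pca(X, k)` followed by `Z = (X - X.mean(axis=0)) @ W.T`. Each iteration costs two matrix-vector products, which is where the saving over building a full variance-covariance matrix would come from.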
Journal
- IEEJ Transactions on Electronics, Information and Systems
IEEJ Transactions on Electronics, Information and Systems 125 (11), 1773-1779, 2005
The Institute of Electrical Engineers of Japan
Details
- CRID: 1390282679580608000
- NII Article ID: 130000089848
- NII Book ID: AN10065950
- ISSN: 13488155, 03854221
- NDL BIB ID: 7694181
- Text Lang: ja
- Data Source: JaLC, NDL Search, Crossref, CiNii Articles, KAKEN, OpenAIRE
- Abstract License Flag: Disallowed