Dimensionality Reduction of Vector Space Information Retrieval Model Based on Random Projection
- SASAKI MINORU, Graduate School of Engineering, University of Tokushima
- KITA KENJI, Faculty of Engineering, University of Tokushima
Bibliographic Information
- Other Title
- ランダム・プロジェクションによるベクトル空間情報検索モデルの次元削減
- ランダム プロジェクション ニ ヨル ベクトル クウカン ジョウホウ ケンサク モデル ノ ジゲン サクゲン
Description
The vector space model is a conventional information retrieval model in which text documents are represented as sparse, high-dimensional vectors, with words serving as features. These vectors consume substantial computing resources, and they make it difficult to capture the underlying concepts that the terms refer to. In this paper, we present an information retrieval model that addresses both problems by using random projection to map document vectors into a low-dimensional space. To evaluate the method, we report retrieval experiments on the MEDLINE test collection. The experiments show that the proposed method is faster than LSI (Latent Semantic Indexing) while achieving retrieval performance close to that of LSI. In addition, we propose using concept vectors produced by the spherical k-means algorithm as the vectors that random projection requires for dimensionality reduction. The evaluation shows that these concept vectors capture the underlying concepts of the corpus effectively.
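The idea in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian projection matrix, the toy term-by-document matrix, and the function names `random_projection_matrix`, `spherical_kmeans`, and `rank_by_cosine` are all assumptions made for illustration. Using the concept vectors directly as the rows of the projection matrix is one reading of the description, not a confirmed detail of the paper.

```python
import numpy as np

def random_projection_matrix(n_terms, k, seed=0):
    """Assumed Gaussian projection: k x n_terms, scaled by 1/sqrt(k)."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((k, n_terms)) / np.sqrt(k)

def spherical_kmeans(X, k, n_iter=20, seed=0):
    """Cluster unit-length document columns by cosine similarity.

    Returns an (n_terms, k) matrix whose columns are unit-length
    concept vectors (normalized cluster centroids).
    """
    rng = np.random.default_rng(seed)
    docs = X / (np.linalg.norm(X, axis=0, keepdims=True) + 1e-12)
    # Initialise concept vectors from k distinct documents.
    C = docs[:, rng.choice(docs.shape[1], size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each document to its most similar concept vector.
        labels = np.argmax(C.T @ docs, axis=0)
        for j in range(k):
            members = docs[:, labels == j]
            if members.shape[1] > 0:
                centroid = members.sum(axis=1)
                C[:, j] = centroid / (np.linalg.norm(centroid) + 1e-12)
    return C

def rank_by_cosine(query, docs):
    """Return document indices sorted by decreasing cosine similarity."""
    q = query / (np.linalg.norm(query) + 1e-12)
    d = docs / (np.linalg.norm(docs, axis=0, keepdims=True) + 1e-12)
    return np.argsort(-(q @ d))

# Toy sparse term-by-document matrix (hypothetical data, not MEDLINE).
rng = np.random.default_rng(42)
n_terms, n_docs, k = 2000, 50, 100
X = rng.random((n_terms, n_docs)) * (rng.random((n_terms, n_docs)) < 0.02)

# Random projection: 2000-dimensional documents become 100-dimensional.
R = random_projection_matrix(n_terms, k)
reduced = R @ X

# A query identical to document 7 is projected the same way and ranked.
ranking = rank_by_cosine(R @ X[:, 7], reduced)

# Concept vectors from spherical k-means used as the projection instead.
C = spherical_kmeans(X, k=10)
concept_reduced = C.T @ X
```

In this sketch the reduced matrices have one column per document, so retrieval compares `k`-dimensional vectors rather than vectors over the full vocabulary.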
Journal
- Journal of Natural Language Processing 8 (1), 5-19, 2001
- The Association for Natural Language Processing
Details
- CRID: 1390282679452831488
- NII Article ID: 10008830167
- NII Book ID: AN10472659
- DOI: 10.5715/jnlp.8.5
- ISSN: 21858314, 13407619
- NDL BIB ID: 5634280
- Text Lang: ja
- Data Source: JaLC, NDL Search, Crossref, CiNii Articles
- Abstract License Flag: Disallowed