Two-dimensional clustering for text categorization

Yuji Matsumoto, Hiroya Takamura

doi:10.3115/1118853.1118881

Description

We propose a new method to improve the accuracy of Text Categorization using two-dimensional clustering. In a number of previous probabilistic approaches, texts in the same category are implicitly assumed to be generated from an identical distribution. We empirically show that this assumption is not accurate, and propose a new framework based on two-dimensional clustering to alleviate this problem. In our method, training texts are clustered so that the assumption is more likely to be true, and at the same time, features are also clustered in order to tackle the data sparseness problem. We conduct some experiments to validate the proposed two-dimensional clustering method.

Published in

proceeding of the 6th conference on Natural language learning - COLING-02, vol. 20, pp. 1-7, 2002-01-01

Association for Computational Linguistics (ACL)

Details

CRID: 1870020693211414016

DOI: 10.3115/1118853.1118881

Data source type

OpenAIRE
