TOPIC EXTRACTION AND SOCIAL PROBLEM DETECTION BASED ON DOCUMENT CLUSTERING
- Hashimoto Taiichi (Tokyo Institute of Technology)
- Murakami Koji (Tokyo Institute of Technology)
- Inui Takashi (Tokyo Institute of Technology)
- Utsumi Kazuo (Tokyo Institute of Technology)
- Ishikawa Masamichi (Tokyo Institute of Technology)
Bibliographic Information
- Other Title
- 文書クラスタリングによるトピック抽出および課題発見
- ブンショ クラスタリング ニ ヨル トピック チュウシュツ オヨビ カダイ ハッケン
Abstract
A method was developed for extracting important topics from clusters of text documents on many subjects retrieved from the Nikkei newspaper. The hierarchical clustering algorithm UPGMA was used to generate a tree structure of clusters according to the similarity of document vectors defined by the nouns appearing in the documents. Document clustering proved to be closely related to the process of detecting societal problems: it classifies similar documents into topical groups and structures those groups according to their contents. The method was evaluated by applying it to the subject of organizational hazards caused by Japanese industries during 1990-2005.
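The clustering step described in the abstract can be illustrated with a minimal sketch. It assumes that nouns have already been extracted from each document, that document vectors are raw noun counts, and that similarity is cosine similarity; the abstract does not state the weighting or the similarity measure, so these choices, the function names `cosine_distance` and `upgma`, and the toy documents are all illustrative assumptions rather than the authors' implementation.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two sparse noun-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return 1.0 - dot / norm if norm else 1.0


def upgma(docs):
    """Cluster a non-empty list of noun lists by average linkage (UPGMA).

    Returns (tree, merges): tree is a nested tuple of document indices,
    merges lists (left_subtree, right_subtree, distance) in merge order.
    """
    vectors = [Counter(d) for d in docs]
    # Active clusters: id -> (subtree, number of documents in it).
    clusters = {i: (i, 1) for i in range(len(docs))}
    dist = {frozenset((i, j)): cosine_distance(vectors[i], vectors[j])
            for i, j in combinations(range(len(docs)), 2)}
    merges = []
    next_id = len(docs)
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        pair = min((frozenset(p) for p in combinations(clusters, 2)),
                   key=lambda p: dist[p])
        a, b = sorted(pair)
        (tree_a, n_a), (tree_b, n_b) = clusters.pop(a), clusters.pop(b)
        merges.append((tree_a, tree_b, dist[pair]))
        # UPGMA update: size-weighted average of the two old distances.
        for c in clusters:
            dist[frozenset((next_id, c))] = (
                n_a * dist[frozenset((a, c))] + n_b * dist[frozenset((b, c))]
            ) / (n_a + n_b)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        next_id += 1
    (tree, _), = clusters.values()
    return tree, merges


# Hypothetical toy corpus: each document is its list of extracted nouns.
docs = [
    ["recall", "defect", "automobile"],
    ["recall", "defect", "automobile", "inspection"],
    ["leak", "data", "customer"],
    ["leak", "data", "customer", "bank"],
]
tree, merges = upgma(docs)
print(tree)
```

In this toy corpus the two recall-related documents and the two leak-related documents merge first, and the resulting topical groups join only at the top of the tree, which mirrors the grouping and structuring of topics that the abstract describes.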
Journal
- SOCIOTECHNICA
SOCIOTECHNICA 5 216-226, 2008
Sociotechnology Research Network