Classification of Word Sense Disambiguation Errors Using a Clustering Method
-
- Shinnou Hiroyuki
- Department of Computer and Information Sciences, Ibaraki University
-
- Murata Masaki
- Department of Information and Electronics, Tottori University
-
- Shirai Kiyoaki
- School of Information Science, Japan Advanced Institute of Science and Technology
-
- Fukumoto Fumiyo
- Interdisciplinary Graduate School, University of Yamanashi
-
- Fujita Sanae
- NTT Communication Science Laboratories
-
- Sasaki Minoru
- Department of Computer and Information Sciences, Ibaraki University
-
- Komiya Kanako
- Department of Computer and Information Sciences, Ibaraki University
-
- Inui Takashi
- Graduate School of SIE, University of Tsukuba
Bibliographic Information
- Other Title
-
- クラスタリングを利用した語義曖昧性解消の誤り原因のタイプ分け
- クラスタリング オ リヨウ シタ ゴギアイマイセイ カイショウ ノ アヤマリ ゲンイン ノ タイプ ワケ
Abstract
As a first step in analyzing word sense disambiguation (WSD) errors, we generally need to investigate the causes of the errors and classify them. For this purpose, seven analysts each classified the error data from their own standpoint. Next, we attempted to merge the results of these analyses. However, merging them through discussion was difficult because the results differed significantly. Therefore, we used a clustering method to merge them automatically to a certain degree. Consequently, we classified WSD errors into nine types and found that the three main types cover 90% of all WSD errors. Moreover, by defining the similarity between two classifications and comparing the merged error types with each analyst's result, we showed that the merged error types are representative of the seven analysts' results and form a standard classification.
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 22 (5), 319-362, 2015
The Association for Natural Language Processing
Details
-
- CRID
- 1390282679453429760
-
- NII Article ID
- 130005131982
-
- NII Book ID
- AN10472659
-
- ISSN
- 21858314
- 13407619
-
- NDL BIB ID
- 027013360
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- Crossref
- CiNii Articles
- KAKEN
-
- Abstract License Flag
- Disallowed