A Method of Extracting Related Words Using Standardized Mutual Information

説明

Techniques of automatic extraction of related words are of great importance in many applications such as query expansion and automatic thesaurus construction. In this paper, a method of extracting related words is proposed basing on the statistical information about the co-occurrences of words from huge corpora. The mutual information is one of such statistical measures and has been used for application mainly in natural language processing. A drawback is, however, the mutual information depends mainly on frequencies of words. To overcome this difficulty, we propose as a new measure a normalize deviation of mutual information. We also reveal a correspondence between word ambiguity and related words using word relation graphs constructed using this measure.

詳細情報 詳細情報について

問題の指摘

ページトップへ