An Efficient Method of Determining Field Association Terms of Compound Words (複合語の分野連想語の効率的決定法)

Bibliographic Information

Alternative titles
  • An Efficient Method of Determining Field Association Terms of Compound Words
  • フクゴウゴ ノ ブンヤ レンソウゴ ノ コウリツテキ ケッテイホウ

Abstract

Although much research on text classification relies on term information from the whole text, humans can recognize the field of a text by finding a small number of specific words in it. In this paper, a term that can be directly related to the field of a text is called a field association (FA) term. Single-word FA terms can be collected because their number is finite, but selecting useful compound FA terms from the huge number of combinations of single-word FA terms is difficult. For FA terms, five association levels are defined and two kinds of ranks, based on stability and inheritance, are presented. Using the levels and ranks removes a large share of the redundant candidates for compound FA terms. In simulations on Japanese text files covering 180 fields, the 88,782 candidates for compound FA terms are reduced to 8,405, about 9% of the original number, with recall above 0.77 and precision above 0.90. In field determination experiments using FA terms on 264 text fragments, the presented method attains an accuracy of more than 90%, about 30% higher than when only single-word FA terms are used.
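The idea of determining a field from a few FA terms, and the candidate reduction reported in the abstract, can be illustrated with a minimal Python sketch. The abstract does not give the definitions of the five association levels or the stability and inheritance ranks, so none of that is modeled here. The dictionary `FA_TERMS`, the functions `determine_field` and `candidate_reduction_ratio`, and the simple vote counting are all hypothetical illustrations and not the paper's algorithm.

```python
from collections import Counter
from typing import Dict, Optional

# Hypothetical toy dictionary mapping FA terms to fields (not from the paper).
# It mixes single-word terms and compound terms.
FA_TERMS: Dict[str, str] = {
    "home run": "baseball",
    "pitcher": "baseball",
    "penalty kick": "soccer",
    "goalkeeper": "soccer",
    "election": "politics",
}


def candidate_reduction_ratio(before: int, after: int) -> float:
    """Fraction of compound FA term candidates that remain after pruning."""
    return after / before


def determine_field(text: str, fa_terms: Dict[str, str]) -> Optional[str]:
    """Pick the field whose FA terms occur most often in the text.

    Assumption: plain substring counting with one vote per occurrence.
    Returns None when no FA term occurs at all.
    """
    lowered = text.lower()
    votes: Counter = Counter()
    for term, field in fa_terms.items():
        votes[field] += lowered.count(term)
    if not votes or max(votes.values()) == 0:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Figures from the abstract: 88,782 candidates reduced to 8,405.
    print(f"{candidate_reduction_ratio(88782, 8405):.1%}")
    print(determine_field("The pitcher gave up a home run in the ninth.", FA_TERMS))
```

The ratio check uses only the two counts stated in the abstract. Everything else is a toy stand-in meant to show why a handful of field-specific terms can be enough to label a short text fragment.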

Journal

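  • 自然言語処理 (Journal of Natural Language Processing)

    自然言語処理 (Journal of Natural Language Processing) 7 (2), 3-26, 2000

    一般社団法人 言語処理学会 (The Association for Natural Language Processing)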

Cited by (2)

References (33)
