Automatic Term Recognition Using the Corpora of the Different Academic Areas

  • KUBO Junko
    Graduate School of Library, Information and Media Studies, University of Tsukuba
  • TSUJI Keita
    Graduate School of Library, Information and Media Studies, University of Tsukuba
  • SUGIMOTO Shigeo
    Graduate School of Library, Information and Media Studies, University of Tsukuba

Bibliographic Information

Other Title
  • 異なる学問分野のコーパスを利用した専門用語抽出手法の提案 (A Proposal of a Technical Term Extraction Method Using Corpora of Different Academic Fields)
  • コトナル ガクモン ブンヤ ノ コーパス オ リヨウ シタ センモン ヨウゴ チュウシュツ シュホウ ノ テイアン

Description

In this paper, we propose a method for automatic term recognition (ATR) that uses the statistical difference between the relative frequency of a term in a target domain corpus and its relative frequency in corpora of other domains. Terms of the target domain appear more frequently in the target domain corpus than in corpora of other domains, and exploiting this characteristic should improve extraction performance. Most ATR methods proposed so far use only the target domain corpus and do not take this characteristic into account. For the extraction experiment, we used the abstracts of Women's Studies International Forum as the target domain corpus and the abstracts of academic journals from 39 domains as the non-target domain corpus. We examined the extraction performance and found that our method outperformed existing ATR methods. Through experiments using journals selected at random from the 39 domains, we also confirmed that the size of the other-domain corpus can be reduced. In particular, we found that a corpus consisting of journals similar to the target domain yields almost as high extraction performance as the corpus consisting of the 39 journals.
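
The core idea, comparing a term's relative frequency in the target corpus with its relative frequency elsewhere, can be illustrated with a minimal sketch. The abstract does not state which statistic the authors use, so the plain difference of relative frequencies below is an assumption made for illustration, as are the function names and the toy token lists; it is not the paper's actual scoring function.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Map each token to its share of all tokens in the corpus."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {token: count / total for token, count in counts.items()}


def rank_candidates(target_tokens, other_tokens):
    """Rank target-corpus tokens by (target rel. freq. - other rel. freq.).

    Assumption: a simple difference stands in for the unspecified
    statistic described in the abstract.
    """
    target_rf = relative_frequencies(target_tokens)
    other_rf = relative_frequencies(other_tokens)
    scores = {
        token: rf - other_rf.get(token, 0.0)
        for token, rf in target_rf.items()
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy corpora (hypothetical data, not taken from the paper).
target = "gender studies examine gender roles in the family".split()
other = "the engine converts the fuel in the chamber".split()

ranking = rank_candidates(target, other)
for token, score in ranking:
    print(f"{token}\t{score:+.3f}")
```

In this toy example a domain word such as "gender" rises to the top, while a function word such as "the", which is frequent in both corpora, falls to the bottom. That contrast is what a method restricted to the target corpus alone cannot exploit.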
