Bibliographic Information
- Alternative titles
- The Application of Decision Trees to Segmentation of Long Japanese Sentences
- ケッテイギ ニ ヨル ニホンゴ チョウブン ノ タンブン ブンカツ
Abstract
It is well known that directly parsing a long Japanese sentence that contains many conjunctive clauses is extremely difficult. It is therefore preferable to segment such a sentence into shorter, simpler ones before parsing. Several sentence segmentation methods have been reported. However, because these conventional methods rely on handmade segmentation patterns or rules, it is difficult to keep the patterns consistent and to decide the optimal order in which to apply the rules. This paper proposes a new sentence segmentation method that uses a decision tree, which automatically acquires optimal segmentation patterns and the optimal order of their application from a corpus, taking both linguistic phenomena and their occurrence frequencies into account. A decision tree for sentence segmentation was generated and evaluated on the EDR corpus. For 400 evaluation sentences, precision and recall were both 84%, and 77% of the sentences were segmented correctly. Pruning was also confirmed to reduce the tree size significantly without degrading performance.
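The three figures reported in the abstract (precision, recall, and the share of correctly segmented sentences) can be illustrated with a minimal sketch. It assumes that each sentence is represented by the set of positions at which it is cut, that precision and recall are computed over those positions, and that a sentence counts as correctly segmented only when its predicted set matches the reference exactly. These definitions, the function name `evaluate_segmentation`, and the toy data are illustrative assumptions; the abstract does not spell out the paper's exact scoring procedure.

```python
from typing import List, Set, Tuple


def evaluate_segmentation(
    gold: List[Set[int]], predicted: List[Set[int]]
) -> Tuple[float, float, float]:
    """Return (precision, recall, sentence_accuracy) over segmentation points.

    Assumed definitions (not taken from the paper):
      precision         = correct predicted points / all predicted points
      recall            = correct predicted points / all reference points
      sentence_accuracy = sentences whose predicted points match exactly
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same sentences")

    true_positives = 0
    predicted_total = 0
    gold_total = 0
    exact_matches = 0

    for gold_points, predicted_points in zip(gold, predicted):
        # Points present in both sets are correctly predicted cuts.
        true_positives += len(gold_points & predicted_points)
        predicted_total += len(predicted_points)
        gold_total += len(gold_points)
        if gold_points == predicted_points:
            exact_matches += 1

    # Guard against division by zero when no points exist at all.
    precision = true_positives / predicted_total if predicted_total else 0.0
    recall = true_positives / gold_total if gold_total else 0.0
    sentence_accuracy = exact_matches / len(gold) if gold else 0.0
    return precision, recall, sentence_accuracy


# Hypothetical toy data: one set of cut positions per sentence.
gold = [{2, 5}, {3}, set()]
predicted = [{2, 5}, {4}, set()]
precision, recall, sentence_accuracy = evaluate_segmentation(gold, predicted)
```

Precision and recall can coincide, as they do in the abstract, whenever the number of predicted points equals the number of reference points, because both then share the same numerator and denominator.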
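Journal
- Journal of Natural Language Processing (自然言語処理)
Journal of Natural Language Processing 7 (1), 13-30, 2000
The Association for Natural Language Processing (一般社団法人 言語処理学会)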
Details
- CRID: 1390282679451175296
- NII Article ID: 10021991375
- NII Book ID: AN10472659
- ISSN: 2185-8314, 1340-7619
- NDL BIB ID: 4962088
- Text language code: ja
- Data source types: JaLC, NDL, Crossref, CiNii Articles
- Abstract license flag: Disallowed