Bibliographic information
- Alternative titles
-
- Efficient Grammar Induction Algorithm with Parse Forests from Real Corpora
- コウブン モリ オ モチイタ ジツ コーパス カラ ノ ダイキボ ナ ブンミャク ジユウ ブンポウ ノ コウソク ガクシュウホウ
Abstract
The task of inducing grammar structures has received a great deal of attention. Researchers have studied it for different reasons: to use grammar induction as the first stage in building large treebanks, or to build better language models. However, grammar induction is inherently computationally expensive. To overcome this, some grammar induction algorithms add new production rules incrementally, refining the grammar while keeping computational complexity low. In this paper, we propose a new, efficient grammar induction algorithm. Although it resembles algorithms that learn a grammar incrementally, it uses the graphical EM algorithm instead of the Inside-Outside algorithm. We report the results of learning experiments in terms of learning speed. The results show that our algorithm learns a grammar in constant time regardless of the size of the grammar. Because our algorithm decreases syntactic ambiguity at each step, it reduces the time required for learning. This constant-time learning has a considerable effect on learning time for larger grammars. We also report an evaluation of criteria for choosing nonterminals. Our algorithm refines the grammar based on a nonterminal at each step. Since several criteria could be used to decide which nonterminal is best, we evaluate them through learning experiments.
Published in
-
- 人工知能学会論文誌 (Transactions of the Japanese Society for Artificial Intelligence)
-
人工知能学会論文誌 19 360-367, 2004
一般社団法人 人工知能学会 (The Japanese Society for Artificial Intelligence)
Details
-
- CRID
- 1390282680083202944
-
- NII Article ID
- 10014164661
-
- NII Bibliographic ID
- AA11579226
-
- ISSN
- 13468030
- 13460714
-
- NDL Bibliographic ID
- 7264260
-
- Text language code
- ja
-
- Data source type
-
- JaLC
- NDL
- Crossref
- CiNii Articles
-
- Abstract license flag
- Use not permitted