SMT with Handmade Phrase Table
-
- Murakami Jin’ichi
- Faculty of Engineering, Tottori University
-
- Kagami Ryouta
- Faculty of Engineering, Tottori University
-
- Tokuhisa Masato
- Faculty of Engineering, Tottori University
-
- Ikehara Satoru
- Faculty of Engineering, Tottori University
Bibliographic Information
- Other Title
-
- 統計翻訳における人手で作成された大規模フレーズテーブルの効果
- トウケイ ホンヤク ニ オケル ヒトデ デ サクセイ サレタ ダイキボ フレーズテーブル ノ コウカ
Abstract
Statistical machine translation (SMT) has recently become a widely used approach to machine translation. SMT uses a translation model and a language model that are estimated automatically from a large set of translation sentence pairs. The translation model gives the probability that a foreign string is the translation of a native string, and it is normally represented as a phrase table. Because the phrase table is built automatically, it has high coverage but low reliability. On the other hand, many translation word pairs have been compiled by hand, especially for Japanese-English translation. These handmade word pairs have low coverage but high reliability. We therefore added the handmade translation word pairs to the automatically built phrase table. In this paper, we used 130,000 translation word pairs and the phrase table extended with these pairs. In the experiments, the proposed method obtained a BLEU score of 13.4% for simple sentences and 8.5% for complex sentences, whereas the baseline system scored 12.5% for simple sentences and 7.7% for complex sentences. We also conducted an ABX test. For simple sentences, 5 sentences were judged better with the baseline and 23 with the proposed method. For complex sentences, 15 sentences were judged better with the baseline and 35 with the proposed method. These results show the effectiveness of the proposed method.
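The core step described in the abstract, adding handmade word pairs to an automatically built phrase table, might be sketched as follows. This is only an illustration and not the authors' implementation: the in-memory table layout, the function name `add_handmade_pairs`, the example entries, and above all the fixed probability `handmade_prob` given to handmade pairs are assumptions, since the abstract does not say how the added pairs were scored.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical in-memory phrase table: (source, target) -> translation probability.
PhraseTable = Dict[Tuple[str, str], float]


def add_handmade_pairs(
    phrase_table: PhraseTable,
    handmade_pairs: Iterable[Tuple[str, str]],
    handmade_prob: float = 1.0,  # assumed score; the abstract does not specify one
) -> PhraseTable:
    """Return a new table containing the automatic entries plus the handmade pairs.

    If a handmade pair is already in the table, the larger probability is kept.
    """
    merged = dict(phrase_table)  # copy so the automatic table is left unchanged
    for src, tgt in handmade_pairs:
        key = (src, tgt)
        merged[key] = max(merged.get(key, 0.0), handmade_prob)
    return merged


# Toy data, invented for illustration only.
auto_table = {("犬", "dog"): 0.6, ("犬", "hound"): 0.1}
handmade = [("犬", "dog"), ("猫", "cat")]

merged = add_handmade_pairs(auto_table, handmade)
```

In this sketch the automatic entries supply coverage, while the handmade entries are trusted more through their higher assumed probability. A real system would also have to decide how to fill the remaining score fields of each phrase table entry.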
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 17 (4), 4_155-4_175, 2010
The Association for Natural Language Processing
Details
-
- CRID
- 1390001204476357632
-
- NII Article ID
- 10027016541
- 130004566406
-
- NII Book ID
- AN10472659
-
- ISSN
- 21858314
- 13407619
-
- NDL BIB ID
- 10772055
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- NDL
- Crossref
- CiNii Articles
- KAKEN
-
- Abstract License Flag
- Disallowed