Unsupervised Word Alignment Using Frequency Constraint in Posterior Regularized EM
-
- Kamigaito Hidetaka
- Department of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology
-
- Watanabe Taro
- Google Inc. This work was mostly done while the second author was affiliated with National Institute of Information and Communications Technology
-
- Takamura Hiroya
- Tokyo Institute of Technology, Precision and Intelligence Laboratory
-
- Okumura Manabu
- Tokyo Institute of Technology, Precision and Intelligence Laboratory
-
- Sumita Eiichiro
- National Institute of Information and Communications Technology
Description
<p> Generative word alignment models, such as the IBM Models, are restricted to one-to-many alignments and cannot explicitly represent many-to-many relationships in bilingual texts. The problem is partially solved either by introducing heuristics or by imposing agreement constraints that make the word alignments in the two directions agree with each other. However, such a constraint cannot take into account the grammatical differences between the languages in a pair. In particular, function words are not trivial to align for grammatically different language pairs, such as Japanese and English. In this paper, we focus on the posterior regularization framework (Ganchev, Graca, Gillenwater, and Taskar 2010), which can force two directional word alignment models to agree with each other during training, and propose new constraints that take into account the difference between function words and content words. We distinguish function words from content words by word frequency, in the same way as Setiawan, Kan, and Li (2007). Experimental results show that our proposed constraints achieved better alignment quality, measured by AER and F-measure, on the French-English Hansard task and the Japanese-English Kyoto Free Translation Task (KFTT). In translation evaluations, we achieved statistically significant gains in BLEU scores on the Japanese-English NTCIR10 task and the Spanish-English WMT06 task. </p>
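The abstract says function words and content words are told apart by word frequency. The sketch below illustrates one simple way such a split could be made: it treats the most frequent word types in a corpus as function words. The function name `split_by_frequency`, the `top_n` cutoff, and the toy corpus are assumptions made for illustration only; the abstract does not state the actual criterion or threshold used in the paper.

```python
from collections import Counter


def split_by_frequency(corpus, top_n):
    """Split the vocabulary into (function_words, content_words).

    Assumption for illustration: the top_n most frequent word types
    are treated as function words, and every other type as a content
    word. The cutoff used in the paper is not given in the abstract.
    """
    # Count every token across all tokenized sentences.
    counts = Counter(token for sentence in corpus for token in sentence)
    # The most frequent types are labeled as function words.
    function_words = {word for word, _ in counts.most_common(top_n)}
    # All remaining types are labeled as content words.
    content_words = set(counts) - function_words
    return function_words, content_words


# Toy corpus (hypothetical data, not taken from the paper).
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "rug"],
]
function_words, content_words = split_by_frequency(corpus, top_n=1)
print(sorted(function_words))
print(sorted(content_words))
```

In a real setting the cutoff would be tuned per language, since the share of the vocabulary that behaves like function words differs between, for example, Japanese and English.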
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 23 (4), 327-351, 2016
The Association for Natural Language Processing
Details
-
- CRID
- 1390282679452872832
-
- NII Article ID
- 130005250396
-
- NII Book ID
- AN10472659
-
- ISSN
- 2185-8314
- 1340-7619
-
- NDL BIB ID
- 027651769
-
- Text Lang
- en
-
- Data Source
-
- JaLC
- NDL Search
- Crossref
- CiNii Articles
- OpenAIRE
-
- Abstract License Flag
- Disallowed