Encoder-Decoder Attention ≠ Word Alignment: Axiomatic Method of Learning Word Alignments for Neural Machine Translation


Description

The encoder-decoder attention matrix has been regarded as a (soft) word-alignment model for conventional neural machine translation (NMT) models such as RNN-based models. However, we show empirically that this does not hold for the Transformer. By comparing the Transformer with the RNN-based NMT model, we identify two inherent differences and accordingly present two methods for capturing word alignments in the Transformer. Furthermore, rather than focusing on the Transformer alone, we present three axioms for an attention mechanism that captures word alignments and propose a new attention mechanism based on these axioms, termed the axiomatic attention mechanism (AAM), which is applicable to any NMT model. The AAM depends on a perturbation function, and we apply several perturbation functions to the AAM, including a novel function based on the masked language model (Devlin, Chang, Lee, and Toutanova 2019). Using the AAM to guide the training of an NMT model improves both the translation performance and the word alignments learned by the model. Our research sheds light on the interpretation of sequence-to-sequence models in neural machine translation.
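The abstract does not give the concrete formulation of the AAM, but the underlying perturbation idea can be illustrated with a minimal sketch: perturb (here, mask) one source word at a time and align each target token to the source position whose perturbation changes that token's score the most. The names below (perturbation_alignment, score_target, toy_score) are hypothetical placeholders, not the paper's implementation; toy_score stands in for an NMT model's per-token target probabilities.

    # Minimal sketch of perturbation-based word alignment (assumed formulation,
    # not the paper's AAM): mask each source token and measure the effect on
    # the target-side scores.
    import numpy as np

    def perturbation_alignment(src_tokens, tgt_tokens, score_target, mask_token="<mask>"):
        """Align each target token to the source token whose masking
        changes that target token's score the most."""
        base = score_target(src_tokens, tgt_tokens)            # shape: (len(tgt),)
        influence = np.zeros((len(tgt_tokens), len(src_tokens)))
        for i in range(len(src_tokens)):
            perturbed = list(src_tokens)
            perturbed[i] = mask_token                           # perturbation: mask one source word
            influence[:, i] = np.abs(base - score_target(perturbed, tgt_tokens))
        return influence.argmax(axis=1)                         # one source index per target token

    # Toy scorer standing in for an NMT model: a target token scores high
    # if its lowercase form appears among the source tokens.
    def toy_score(src_tokens, tgt_tokens):
        src_set = {t.lower() for t in src_tokens}
        return np.array([1.0 if t.lower() in src_set else 0.1 for t in tgt_tokens])

    if __name__ == "__main__":
        src = ["Das", "Haus", "ist", "klein"]
        tgt = ["Das", "Haus", "ist", "klein"]   # identity toy example
        print(perturbation_alignment(src, tgt, toy_score))      # -> [0 1 2 3]

In the paper, the perturbation function is the component that varies (including the masked-language-model-based variant), while the AAM itself is used as a training signal for the NMT model rather than as a post-hoc extraction step like this sketch.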

Published in

  • 自然言語処理 (Journal of Natural Language Processing)

    自然言語処理 27 (3), 531-552, 2020-09-15

    The Association for Natural Language Processing (一般社団法人 言語処理学会)
