Translating Ancient Chinese with Neural Machine Translation: Examining the Temporal Difference Between Training and Evaluation Data

段 文傑 (Wenjie Duan), 王 鴻飛 (Hongfei Wang), 岡 照晃 (Teruaki Oka), 小町 守 (Mamoru Komachi), 古宮 嘉那子 (Kanako Komiya)

Vocabulary and grammar are likely to change over time. We therefore hypothesized that the performance of a translation model for ancient Chinese degrades as the temporal difference between its training data and its evaluation data grows. To test this, we trained neural machine translation models on Chinese parallel corpora spanning different time periods and measured their performance in translating ancient Chinese into modern Chinese. We also examined the effectiveness of using a pre-trained model as contextual embeddings. The results indicate that the further apart the time periods of the training and evaluation data, the lower the translation performance. In addition, the way the pre-trained model was used in this study did not improve translation performance.
