Monolingual Phrase Alignment Based on Word Embedding

  • Yoshinaka Masato
    Graduate School of Information Science and Technology, Osaka University
  • Kajiwara Tomoyuki
    Graduate School of Science and Engineering, Ehime University
  • Arase Yuki
    Graduate School of Information Science and Technology, Osaka University

Bibliographic Information

Other Title
  • 単語分散表現に基づく単一言語内フレーズアラインメント

Abstract

We present a monolingual phrase aligner based on word embeddings. In monolingual phrase alignment, an aligner identifies the set of phrasal paraphrases in a sentence pair. Previous methods require large-scale lexica or high-quality parsers, which makes them difficult to apply to languages other than English. In contrast, the proposed method uses only a pre-trained word embedding model and thus relies solely on raw monolingual corpora. It first obtains word alignments from the pre-trained word embeddings and extends them to phrase alignments with a heuristic. It then composes phrase representations from the word embeddings and searches for a set of consistent phrase alignments on a lattice of phrase alignment candidates. Experimental results on an English dataset show that our method outperforms the previous phrase aligner. We also constructed a Japanese dataset for analysis and confirmed that our method works for languages other than English.
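
As a rough illustration of the first stage described in the abstract, the following sketch aligns words by cosine similarity between embeddings and composes a phrase vector by averaging. It is not the authors' implementation: the abstract does not specify the alignment criterion, the composition function, or the lattice search, so the mutual-nearest-neighbour rule, the `threshold` value, the averaging in `phrase_vector`, and the toy vectors in `EMB` are all assumptions made for this example.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def align_words(src, tgt, emb, threshold=0.5):
    """Return (i, j) index pairs that are mutual nearest neighbours.

    Assumption: mutual best match plus a similarity threshold stands in
    for whatever word alignment criterion the paper actually uses.
    """
    sim = np.array([[cosine(emb[s], emb[t]) for t in tgt] for s in src])
    pairs = []
    for i in range(len(src)):
        j = int(sim[i].argmax())
        # Keep the pair only if src[i] is also the best match for tgt[j].
        if int(sim[:, j].argmax()) == i and sim[i, j] >= threshold:
            pairs.append((i, j))
    return pairs


def phrase_vector(tokens, emb):
    """Compose a phrase representation by averaging word vectors (assumed)."""
    return np.mean([emb[t] for t in tokens], axis=0)


# Toy 3-dimensional embeddings, invented for illustration only.
EMB = {
    "the": np.array([1.0, 0.0, 0.0]),
    "a": np.array([0.9, 0.1, 0.0]),
    "car": np.array([0.0, 1.0, 0.0]),
    "automobile": np.array([0.0, 0.9, 0.1]),
    "stopped": np.array([0.0, 0.0, 1.0]),
    "halted": np.array([0.1, 0.0, 0.9]),
}

src = ["the", "car", "stopped"]
tgt = ["a", "automobile", "halted"]

pairs = align_words(src, tgt, EMB)
print(pairs)  # [(0, 0), (1, 1), (2, 2)]

# Similarity of two candidate phrases built from the aligned words.
phrase_sim = cosine(phrase_vector(src[:2], EMB), phrase_vector(tgt[:2], EMB))
```

In practice the toy dictionary would be replaced by vectors from a pre-trained embedding model, and the word pairs would seed the phrase alignment candidates over which the consistent set is searched.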
