Monolingual Phrase Alignment Based on Word Embedding
-
- Yoshinaka Masato
- Graduate School of Information Science and Technology, Osaka University
-
- Kajiwara Tomoyuki
- Graduate School of Science and Engineering, Ehime University
-
- Arase Yuki
- Graduate School of Information Science and Technology, Osaka University
Bibliographic Information
- Other Title
-
- 単語分散表現に基づく単一言語内フレーズアラインメント
Abstract
We present a word embedding-based monolingual phrase aligner. In monolingual phrase alignment, an aligner identifies the set of phrasal paraphrases in a sentence pair. Previous methods required large-scale lexica or high-quality parsers, which makes them difficult to apply to languages other than English. In contrast, the proposed method uses only a pre-trained word embedding model and therefore relies solely on raw monolingual corpora. Our method first obtains word alignments from pre-trained word embeddings and extends them to phrase alignments using a heuristic approach. It then composes phrase representations from the word embeddings and searches for a set of consistent phrase alignments on a lattice of phrase alignment candidates. Experimental results on an English dataset show that our method outperforms the previous phrase aligner. We also constructed a Japanese dataset for analysis, confirming that our method works with languages other than English.
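The first and third steps named in the abstract (word alignment from embeddings, and composing a phrase representation from word embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy 3-dimensional vectors, the greedy one-best linking, the similarity threshold of 0.8, and the averaging used as composition are all assumptions made for illustration, and the heuristic phrase extension and lattice search are omitted.

```python
import math

# Hypothetical toy vectors standing in for a pre-trained embedding model.
TOY_EMBEDDINGS = {
    "buy": [0.9, 0.1, 0.0],
    "purchase": [0.85, 0.15, 0.05],
    "car": [0.1, 0.9, 0.0],
    "automobile": [0.12, 0.88, 0.05],
    "a": [0.0, 0.1, 0.9],
    "an": [0.05, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align_words(source, target, emb, threshold=0.8):
    """Link each source word to its most similar target word.

    A link is kept only if the similarity reaches `threshold`
    (an assumed value, not taken from the paper).
    """
    links = []
    for i, s in enumerate(source):
        best_j, best_sim = None, threshold
        for j, t in enumerate(target):
            if s in emb and t in emb:
                sim = cosine(emb[s], emb[t])
                if sim >= best_sim:
                    best_j, best_sim = j, sim
        if best_j is not None:
            links.append((i, best_j))
    return links


def phrase_vector(words, emb):
    """Compose a phrase vector by averaging word vectors (one simple choice)."""
    vecs = [emb[w] for w in words if w in emb]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]


source = ["buy", "a", "car"]
target = ["purchase", "an", "automobile"]
links = align_words(source, target, TOY_EMBEDDINGS)
phrase_sim = cosine(
    phrase_vector(["buy", "car"], TOY_EMBEDDINGS),
    phrase_vector(["purchase", "automobile"], TOY_EMBEDDINGS),
)
```

Because the sketch needs only word vectors, it reflects the abstract's point that no lexicon or parser is required; a real system would load vectors trained on raw monolingual text in place of `TOY_EMBEDDINGS`.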
Journal
-
- Journal of Natural Language Processing
-
Journal of Natural Language Processing 28 (2), 508-531, 2021
The Association for Natural Language Processing
Details
-
- CRID
- 1390569845479543680
-
- NII Article ID
- 130008052584
-
- ISSN
- 21858314
- 13407619
-
- Text Lang
- ja
-
- Data Source
-
- JaLC
- Crossref
- CiNii Articles
-
- Abstract License Flag
- Disallowed