ハブの抑制によるコンパラブルコーパスからの対訳抽出精度の改善

Yutaro Shigeto, Ikumi Suzuki, Kazuo Hara, Masashi Shimbo, Yuji Matsumoto

doi:10.1527/tjsai.e-f43

Bibliographic information

Alternative title

Reducing Hub Translation Candidates Improves the Accuracy of Bilingual Lexicon Extraction from Comparable Corpora

Publication date: 2016

Resource type: journal article

Publisher: The Japanese Society for Artificial Intelligence

Description

Most existing approaches to bilingual lexicon extraction (BLE) first map words in the source and target languages into a single vector space, and then measure the similarity of words across the two languages in this space. We point out that existing BLE methods suffer from the so-called hubness phenomenon; i.e., a small number of translation candidates (hub candidates) are chosen by the systems as likely translations of many source words, which consequently degrades the accuracy of the extracted translations. We show that this phenomenon can be alleviated by centering the data or by using the mutual proximity measure, two known techniques that effectively reduce hubness in standard nearest-neighbor search settings. Our empirical evaluation shows that naive nearest-neighbor search combined with these methods outperforms a recently proposed BLE method based on label propagation.
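
The two hubness-reduction techniques named in the description can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names are hypothetical, the choice to center on the centroid of the target candidates is an assumption, and the mutual proximity function uses the commonly cited empirical form for a single set of items with a symmetric similarity matrix, normalized here by the number of remaining items. How the paper adapts either technique to the cross-lingual setting is not stated in this record.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T


def centered_similarity(source, target):
    """Cosine similarity after centering.

    Assumption: both sets are shifted by the centroid of the target
    (candidate) vectors; the record does not say which centroid is used.
    """
    centroid = target.mean(axis=0)
    return cosine_similarity(source - centroid, target - centroid)


def mutual_proximity(sim):
    """Empirical mutual proximity for one set of n >= 3 items.

    sim is a symmetric n x n similarity matrix. MP(i, j) is the fraction
    of the other items that are less similar to i than j is AND less
    similar to j than i is.
    """
    n = sim.shape[0]
    mp = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            others = np.ones(n, dtype=bool)
            others[[i, j]] = False  # exclude the pair itself
            less_similar_to_i = sim[i, others] < sim[i, j]
            less_similar_to_j = sim[j, others] < sim[j, i]
            mp[i, j] = mp[j, i] = np.mean(less_similar_to_i & less_similar_to_j)
    return mp


def k_occurrence(sim, k):
    """Count how often each candidate (column) appears in a query's top-k list.

    Candidates with very large counts are the hubs.
    """
    top_k = np.argsort(-sim, axis=1)[:, :k]
    return np.bincount(top_k.ravel(), minlength=sim.shape[1])
```

Comparing `k_occurrence(cosine_similarity(source, target), k)` with `k_occurrence(centered_similarity(source, target), k)` is one simple way to check whether a few candidates dominate the nearest-neighbor lists before and after centering.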

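Journal

Transactions of the Japanese Society for Artificial Intelligence

Transactions of the Japanese Society for Artificial Intelligence 31 (2), E-F43_1-12, 2016

The Japanese Society for Artificial Intelligence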

Details

CRID: 1390001205106959872

NII Article ID: 130005126835

DOI: 10.1527/tjsai.e-f43

ISSN: 13468030; 13460714

Web Site: https://www.jstage.jst.go.jp/article/tjsai/31/2/31_E-F43/_pdf

Text language code: ja

Material type: journal article

Data source type

JaLC
Crossref
CiNii Articles
KAKEN
OpenAIRE

Abstract license flag: Disallowed
