Bibliographic information
- Alternative title
- Reducing Hub Translation Candidates Improves the Accuracy of Bilingual Lexicon Extraction from Comparable Corpora
- Publication date
- 2016
- Resource type
- journal article
- DOI
- 10.1527/tjsai.e-f43
- Publisher
The Japanese Society for Artificial Intelligence (一般社団法人 人工知能学会)
Description
Most existing approaches to bilingual lexicon extraction (BLE) first map words in the source and target languages into a single vector space, and then measure the similarity of words across the two languages in this space. We point out that existing BLE methods suffer from the so-called hubness phenomenon; i.e., a small number of translation candidates (hub candidates) are chosen by the systems as likely translations of many source words, which consequently degrades the accuracy of the extracted translations. We show that this phenomenon can be alleviated by centering the data or by using the mutual proximity measure, two known techniques that effectively reduce hubness in standard nearest-neighbor search settings. Our empirical evaluation shows that naive nearest-neighbor search combined with these methods outperforms a recently proposed BLE method based on label propagation.
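The two hubness-reduction techniques named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`center`, `mutual_proximity`, `k_occurrence`) are hypothetical, the mutual proximity shown is the simple empirical variant computed over a single set of points with a symmetric distance matrix, and the bilingual setup of the paper (source queries against target candidates) is not reproduced here.

```python
import numpy as np


def center(X):
    """Shift the data so that its centroid lies at the origin."""
    return X - X.mean(axis=0, keepdims=True)


def mutual_proximity(D):
    """Empirical mutual proximity for a symmetric distance matrix D.

    MP[i, j] is the fraction of the other points k (k != i, k != j) that
    are farther than D[i, j] from both i and j. Higher means i and j are
    close from each other's point of view. Assumption: illustrative
    O(n^3) version, not an optimized one.
    """
    n = D.shape[0]
    if n < 3:
        raise ValueError("mutual proximity needs at least 3 points")
    mp = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = D[i, j]
            others = np.ones(n, dtype=bool)
            others[[i, j]] = False
            farther_from_both = (D[i, others] > d) & (D[j, others] > d)
            mp[i, j] = mp[j, i] = farther_from_both.sum() / (n - 2)
    return mp


def k_occurrence(S, k):
    """Count how often each candidate (column of S) appears in the top-k
    list of a query (row of S). A few very large counts indicate hubs."""
    topk = np.argsort(-S, axis=1)[:, :k]
    return np.bincount(topk.ravel(), minlength=S.shape[1])
```

One possible use, under the same assumptions: compute a query-by-candidate similarity matrix before and after applying `center` to the vectors, and compare the largest values of `k_occurrence` to see whether a handful of candidates stop dominating the top-k lists.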
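Published in
- Transactions of the Japanese Society for Artificial Intelligence (人工知能学会論文誌)
Transactions of the Japanese Society for Artificial Intelligence 31 (2), E-F43_1-12, 2016
The Japanese Society for Artificial Intelligence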
Details
- CRID
- 1390001205106959872
- NII Article ID
- 130005126835
- ISSN
- 13468030
- 13460714
- Text language code
- ja
- Material type
- journal article
- Data source type
- JaLC
- Crossref
- CiNii Articles
- KAKEN
- OpenAIRE
- Abstract license flag
- Not permitted

