Example-Based Word Sense Disambiguation Using Examples Automatically Acquired from a Parallel Corpus

Pulkit Kathuria, 白井 清昭

This paper presents a precision-oriented, example-based approach to word sense disambiguation (WSD), intended for use in a reading assistant system for learners of Japanese. The proposed method measures sentence similarity from two perspectives, collocations and syntactic dependency relations involving the target word, and selects the sense of the most similar example sentence in a dictionary only if that similarity is sufficiently high; otherwise it chooses no sense. To compensate for the resulting loss of recall, the example-based classifier is combined with a Naive Bayes model. WSD performance is further improved by automatically acquiring example sentences for each sense from a parallel corpus and adding them to the example database. In our experiments, the accuracy of the automatically acquired sentences was 85%, and the proposed WSD method achieved 65% precision, 7% higher than the baseline.
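The decision rule described in the abstract (pick the sense of the most similar example only when the similarity clears a threshold, otherwise defer to a second classifier) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `jaccard` token-overlap score is only a stand-in for the paper's collocation and dependency-based similarity measures, the `threshold` value is arbitrary, and `fallback` is a placeholder for the Naive Bayes model, whose details the abstract does not give.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical representation: an example is (tokens of the sentence, sense label).
Example = Tuple[Sequence[str], str]


def jaccard(a: Sequence[str], b: Sequence[str]) -> float:
    """Token-overlap similarity; a stand-in, NOT the paper's measure."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def example_based_sense(
    tokens: Sequence[str],
    examples: Sequence[Example],
    similarity: Callable[[Sequence[str], Sequence[str]], float] = jaccard,
    threshold: float = 0.5,  # assumed value, chosen only for illustration
) -> Optional[str]:
    """Return the sense of the most similar example, or None if the
    best similarity is below the threshold (the classifier abstains)."""
    best_sense: Optional[str] = None
    best_score = 0.0
    for ex_tokens, sense in examples:
        score = similarity(tokens, ex_tokens)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense if best_score >= threshold else None


def disambiguate(
    tokens: Sequence[str],
    examples: Sequence[Example],
    fallback: Callable[[Sequence[str]], str],
    threshold: float = 0.5,
) -> str:
    """Use the example-based classifier first; if it abstains, defer to
    the fallback classifier (a placeholder for the Naive Bayes model)."""
    sense = example_based_sense(tokens, examples, threshold=threshold)
    return sense if sense is not None else fallback(tokens)


# Toy data, invented for this sketch.
examples = [
    (["river", "bank", "water"], "bank/shore"),
    (["bank", "money", "loan"], "bank/finance"),
]
confident = disambiguate(
    ["the", "bank", "approved", "loan", "money"], examples, lambda t: "fallback"
)
abstained = disambiguate(["bank", "holiday"], examples, lambda t: "fallback")
```

In this toy run the first sentence shares enough tokens with the finance example to clear the threshold, while the second matches neither example well and is passed to the fallback. Raising the threshold trades recall of the example-based classifier for precision, which is the trade-off the abstract describes.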
