Sequential Morphological Analysis of Hiragana Strings using Recurrent Neural Network and Logistic Regression

Bibliographic Information

Other Title
  • RNN とロジスティック回帰を用いた平仮名文の逐次的な形態素解析

Description

<p>This paper describes the morphological analysis of unsegmented Hiragana strings. Hiragana strings are known to be more ambiguous than Kanji-Kana mixed strings. Several morphological analysis methods have been developed mainly for Hiragana strings, but most have not achieved sufficient accuracy. One prior method is more accurate than a well-known conventional morphological analysis tool for Kanji-Kana mixed strings, but it requires a considerable amount of analysis time. Aiming for high-accuracy analysis of unsegmented Hiragana strings at practical speed, we propose a sequential morphological analysis method using a recurrent neural network (RNN) and logistic regression. To speed up the analysis, the proposed method sequentially estimates, at each character boundary, whether it is a word boundary, and estimates morpheme information for each word. To improve accuracy, the proposed method integrates two estimates of word boundaries and morpheme information: one based on local information, produced by logistic regression, and one based on global information, produced by the RNN. Experimental results confirmed that the proposed method is more than 100 times faster than the prior method and achieves higher analysis accuracy.</p>
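
The sequential boundary decision described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (`segment`, `toy_local`, `toy_global`), the scorer signatures, the weighted-average integration rule, and the 0.5 threshold are hypothetical and are not taken from the paper. The two toy scorers merely stand in for a trained logistic regression model (local information) and a trained RNN (global information), and the morpheme-information step is omitted.

```python
from typing import Callable, List

# Hypothetical scorer signatures: each returns an estimate of
# P(word boundary between text[i-1] and text[i]).
LocalScorer = Callable[[str, int], float]
GlobalScorer = Callable[[str, int, List[str]], float]


def segment(text: str, local_prob: LocalScorer, global_prob: GlobalScorer,
            weight: float = 0.5, threshold: float = 0.5) -> List[str]:
    """Scan character boundaries left to right, deciding each one exactly once."""
    words: List[str] = []
    start = 0
    for i in range(1, len(text)):
        # Assumed integration rule: weighted average of the two estimates.
        p = (weight * local_prob(text, i)
             + (1.0 - weight) * global_prob(text, i, words))
        if p >= threshold:
            words.append(text[start:i])
            start = i
    if text:
        words.append(text[start:])
    return words


# Toy stand-ins for trained models (illustrative only).
def toy_local(text: str, i: int) -> float:
    # Pretend boundaries are likely next to the particles "は" and "を".
    particles = "はを"
    return 0.9 if text[i - 1] in particles or text[i] in particles else 0.2


def toy_global(text: str, i: int, words: List[str]) -> float:
    # Pretend the context model favors a boundary once the word
    # currently being built is at least three characters long.
    current_len = i - sum(len(w) for w in words)
    return 0.6 if current_len >= 3 else 0.3


result = segment("わたしはほんをよむ", toy_local, toy_global)
print(result)
```

With these toy scorers, the call prints `['わたし', 'は', 'ほん', 'を', 'よむ']`. Each boundary is visited once and never revisited, so the scan is linear in the string length; this is the property the abstract credits for the speed-up, although the actual models and integration scheme may differ from this sketch.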
