Sequential Morphological Analysis of Hiragana Strings using Recurrent Neural Network and Logistic Regression
- Moriyama Shuhei (Idein Inc.)
- Ohno Tomohiro (Tokyo Denki University)
Bibliographic Information
- Other Title: RNN とロジスティック回帰を用いた平仮名文の逐次的な形態素解析
Description
<p>This paper describes the morphological analysis of unsegmented Hiragana strings. Hiragana strings are known to be more ambiguous than Kanji-Kana mixed strings. Several morphological analysis methods have been developed mainly for Hiragana strings, but most have not achieved sufficient accuracy. One prior method is more accurate than a well-known conventional morphological analysis tool for Kanji-Kana mixed strings, but it requires a considerable amount of analysis time. Aiming for high-accuracy, practical-speed analysis of unsegmented Hiragana strings, we propose a sequential morphological analysis method using a recurrent neural network (RNN) and logistic regression. To speed up the analysis, the proposed method sequentially estimates whether each character boundary is a word boundary and estimates morpheme information for each word. To improve accuracy, the proposed method estimates word boundaries and morpheme information by integrating an estimate based on local information, obtained by logistic regression, with an estimate based on global information, obtained by the RNN. The experimental results confirmed that the proposed method is more than 100 times faster than the prior method and achieves higher analysis accuracy. </p>
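The sequential boundary estimation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the two estimates are integrated, so the linear interpolation with weight `alpha` is an assumption, as are the feature templates, the hand-set weights, the `TinyRNN` class (an untrained Elman-style network with random parameters), and all function names. Estimating morpheme information for each word is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def local_boundary_prob(text, i, weights, bias):
    """Logistic regression over hypothetical local features at boundary i,
    i.e. the boundary between text[i-1] and text[i]."""
    feats = [f"L1={text[i-1]}", f"R1={text[i]}", f"BI={text[i-1:i+1]}"]
    return sigmoid(bias + sum(weights.get(f, 0.0) for f in feats))


class TinyRNN:
    """Untrained Elman-style RNN (random parameters, for illustration only).
    Its hidden state summarizes every character read so far."""

    def __init__(self, vocab, hidden=4, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        self.emb = {c: [rng.uniform(-1, 1) for _ in range(hidden)]
                    for c in sorted(vocab)}
        self.w_hh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)]
                     for _ in range(hidden)]
        self.w_out = [rng.uniform(-1, 1) for _ in range(hidden)]

    def step(self, h, ch):
        x = self.emb.get(ch, [0.0] * self.hidden)
        return [math.tanh(x[j] + sum(self.w_hh[j][k] * h[k]
                                     for k in range(self.hidden)))
                for j in range(self.hidden)]

    def boundary_prob(self, h):
        return sigmoid(sum(w * v for w, v in zip(self.w_out, h)))


def segment(text, weights, bias, rnn, alpha=0.5, threshold=0.5):
    """Visit character boundaries left to right and decide each one once."""
    if not text:
        return []
    words, start = [], 0
    h = [0.0] * rnn.hidden
    for i in range(1, len(text)):
        h = rnn.step(h, text[i - 1])          # global context up to boundary i
        p_local = local_boundary_prob(text, i, weights, bias)
        p_global = rnn.boundary_prob(h)
        # Assumed integration rule: linear interpolation of the two estimates.
        p = alpha * p_local + (1.0 - alpha) * p_global
        if p >= threshold:
            words.append(text[start:i])
            start = i
    words.append(text[start:])
    return words


text = "きょうははれ"
weights = {"BI=うは": 5.0, "BI=はは": 5.0}  # hand-set, hypothetical weights
rnn = TinyRNN(vocab=set(text))
local_only = segment(text, weights, bias=-2.0, rnn=rnn, alpha=1.0)
combined = segment(text, weights, bias=-2.0, rnn=rnn, alpha=0.5)
print(local_only)
print(combined)
```

With `alpha=1.0` only the logistic regression estimate is used, so the hand-set weights alone determine the segmentation. With `alpha=0.5` the untrained RNN also contributes, so that output is not meaningful until both models are trained. Each boundary is decided in a single left-to-right pass, which is the property the abstract associates with the speed-up.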
Journal
- Journal of Natural Language Processing
Journal of Natural Language Processing 29 (2), 367-394, 2022
The Association for Natural Language Processing
Details
- CRID: 1390292406081641088
- ISSN: 21858314, 13407619
- Text Lang: ja
- Data Source: JaLC, Crossref, OpenAIRE
- Abstract License Flag: Disallowed