Correcting misuse of Japanese visually similar characters

Description

We present a language-model-based method for correcting the misuse of visually similar Japanese characters (Kanji). Although methods for correcting errors in Japanese learners' writing have been proposed, the misuse of visually similar Kanji has not yet been explored. We collected pairs and groups of visually similar Kanji to build a similar-Kanji set. Candidate sentences are then generated by replacing the misused Kanji with similar Kanji drawn from this set, and the candidate with the highest language model probability is selected. Experimental results suggest that the method performs well in many cases of misuse. In addition, using a morphological analyzer, we developed an unknown-word filter that excludes candidates containing unknown words during candidate generation. We found that this filter is effective in preventing erroneous corrections.
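The candidate generation and selection steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It rests on several assumptions: the similar-Kanji groups are a tiny made-up sample, the language model is a character bigram model with add-one smoothing trained on three toy sentences, the position of the misused character is given, the original sentence is kept as a fallback candidate, and `is_known` is a hypothetical stand-in for the unknown-word check that the paper performs with a morphological analyzer.

```python
import math
from collections import Counter

# Hypothetical sample of a similar-Kanji set (groups of look-alike characters).
SIMILAR_GROUPS = [{"末", "未"}, {"士", "土"}, {"人", "入"}]


def similar_to(ch):
    """Return the characters that share a similarity group with ch."""
    out = set()
    for group in SIMILAR_GROUPS:
        if ch in group:
            out |= group - {ch}
    return out


class BigramLM:
    """Toy character bigram model with add-one smoothing (an assumption;
    the abstract does not say which language model is used)."""

    def __init__(self, corpus):
        self.context = Counter()  # how often each character starts a bigram
        self.bigram = Counter()
        self.vocab = set()
        for sentence in corpus:
            chars = ["<s>"] + list(sentence) + ["</s>"]
            self.vocab.update(chars)
            for a, b in zip(chars, chars[1:]):
                self.context[a] += 1
                self.bigram[(a, b)] += 1

    def logprob(self, sentence):
        chars = ["<s>"] + list(sentence) + ["</s>"]
        v = len(self.vocab) + 1  # one extra slot for unseen characters
        return sum(
            math.log((self.bigram[(a, b)] + 1) / (self.context[a] + v))
            for a, b in zip(chars, chars[1:])
        )


def correct(sentence, pos, lm, is_known=lambda s: True):
    """Replace the character at pos with each similar Kanji, drop candidates
    rejected by is_known, and return the highest-scoring sentence."""
    candidates = [sentence]  # assumption: keep the original as a fallback
    for ch in similar_to(sentence[pos]):
        cand = sentence[:pos] + ch + sentence[pos + 1:]
        if is_known(cand):
            candidates.append(cand)
    return max(candidates, key=lm.logprob)


# Toy training data, for illustration only.
lm = BigramLM(["未来は明るい", "未来を考える", "週末に会う"])
print(correct("末来は明るい", 0, lm))
```

Passing a stricter `is_known` predicate (for example, one backed by a morphological analyzer's unknown-word flag) narrows the candidate list before scoring, which is the role the abstract assigns to the unknown-word filter.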
