Character Spotting of Historical Documents Using Pattern Segmentation Aided by Recognition Processing

Umeda Michio, Hashimoto Tomohiro

doi:10.1541/ieejeiss1987.122.11_1876

This paper proposes a character segmentation and spotting method for historical documents. The segmentation method uses the result of character recognition to cope with the cursive scripts and the mutual encroachment of characters that are peculiar to historical documents. The spotting method extracts only previously designated characters from a character string. As an initial segmentation, labelling is used to divide the character string pattern into connected components. Each connected component is enclosed in a rectangle, and the character patterns are separated from one another using the shape of the rectangles, such as their height and width. Next, individual character recognition is applied to each segmented pattern. Based on the recognition result, rectangles for which segmentation failed are identified, and resegmentation is applied to the string containing them, so that the string is expected to be divided at the best position. Separately, a neural network corresponding to the previously designated character is prepared. The error between the input and output of the network is calculated for each segmented pattern, and patterns that satisfy the condition are extracted as the spotting result. In an extraction experiment on 615 character strings, the correct spotting rate for 5 designated characters was 94.22% with the resegmentation process and 87.58% without it.
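
The initial segmentation and the spotting test described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function names `label_components` and `spot`, the use of 8-connectivity, the mean squared error measure, and the `threshold` parameter are all assumptions made for the example; the abstract does not specify the connectivity, the error measure, or the acceptance condition.

```python
from collections import deque


def label_components(image):
    """Return bounding rectangles (top, left, bottom, right) of connected
    foreground regions in a binary image given as a list of rows.
    8-connectivity is an assumption; the abstract does not state it."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected component.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def spot(patterns, reconstruct, threshold):
    """Return indices of patterns whose input/output error is small.
    `reconstruct` stands in for the trained network (hypothetical);
    the mean squared error and `threshold` are assumed, not from the paper."""
    hits = []
    for i, pattern in enumerate(patterns):
        output = reconstruct(pattern)
        error = sum((a - b) ** 2 for a, b in zip(pattern, output)) / len(pattern)
        if error <= threshold:
            hits.append(i)
    return hits
```

In this sketch the rectangles returned by `label_components` would feed the height and width rules for the first segmentation pass, and `spot` would be run on the segmented patterns with one network per designated character.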
