Page segmentation based on thinning of background

Description

This paper presents a new page segmentation method based on analysis of the background (white areas) of a page image. The method can segment pages with non-rectangular layouts as well as pages skewed at various angles. It has two main characteristics: (1) thinning the background represents white areas of any shape as connected thin lines, or chains, and this representation also makes the method robust to tilted page images; and (2) on this representation, page segmentation is defined as the task of finding the loops that enclose printed areas. This task is achieved by eliminating unnecessary chains using a feature of the white areas together with a feature of the black areas divided by a chain. Based on experimental results and a comparison with previous methods, we discuss the advantages and limitations of the proposed method.
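To illustrate the first characteristic, the following is a minimal sketch of background thinning on a toy page. It is not the paper's implementation: the abstract does not name a thinning algorithm, so the classic Zhang-Suen scheme is assumed here, and the names `PAGE`, `neighbours` and `thin` are hypothetical. The chain-elimination step that selects the final loops is not shown.

```python
# Toy page: '#' is ink (black), '.' is background (white). Illustrative data only.
PAGE = [
    "..............",
    "..####..####..",
    "..####..####..",
    "..####..####..",
    "..............",
    "..##########..",
    "..##########..",
    "..............",
]


def neighbours(img, r, c):
    """Return the 8 neighbours P2..P9 of (r, c), clockwise starting at north.
    Pixels outside the image count as 0."""
    h, w = len(img), len(img[0])
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    return [img[r + dr][c + dc] if 0 <= r + dr < h and 0 <= c + dc < w else 0
            for dr, dc in offsets]


def thin(img):
    """Zhang-Suen thinning of the 1-pixels of a binary image (list of lists).
    Assumption: the paper may use a different thinning algorithm."""
    img = [row[:] for row in img]  # work on a copy
    changed = True
    while changed:
        changed = False
        for step in (0, 1):  # the two sub-iterations of Zhang-Suen
            to_clear = []
            for r in range(len(img)):
                for c in range(len(img[0])):
                    if img[r][c] != 1:
                        continue
                    # p[0]=P2 (N), p[2]=P4 (E), p[4]=P6 (S), p[6]=P8 (W)
                    p = neighbours(img, r, c)
                    b = sum(p)  # number of set neighbours
                    # number of 0 -> 1 transitions around the pixel
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    if step == 0:
                        ok = p[0] * p[2] * p[4] == 0 and p[2] * p[4] * p[6] == 0
                    else:
                        ok = p[0] * p[2] * p[6] == 0 and p[0] * p[4] * p[6] == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            # delete in parallel, after the whole image has been scanned
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return img


# Ink is 1 in `page`; the background image is its inverse.
page = [[1 if ch == "#" else 0 for ch in row] for row in PAGE]
background = [[1 - v for v in row] for row in page]
skeleton = thin(background)

# Show ink as '#', background chains as '+', everything else as '.'.
for ink_row, chain_row in zip(page, skeleton):
    print("".join("#" if i else "+" if s else "." for i, s in zip(ink_row, chain_row)))
```

Because only the white pixels are thinned, the resulting chains follow gaps of any shape or orientation, which is the property the abstract relies on for skewed and non-rectangular layouts. In the method described above, loops formed by such chains would then be the candidates for enclosing printed areas.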
