The destruction of sequence structure in handwritten text has become one of the main bottlenecks in handwritten text recognition. Typical cases include additional specific markers (for example, those indicating a text-swapping modification) and text overlap caused by character modifications such as deletion, replacement, and insertion. In this paper, we propose a two-stage detection algorithm for such text that combines structure knowledge with deep models. In the first stage, different structure prototypes are coarsely located in handwritten text images. In the second stage, we apply different strategies depending on the first-stage detection results. Specifically, we introduce a shape regression network trained with a novel semi-supervised contrast training strategy and fully exploit the positional relationships between characters. Experiments on two handwritten text datasets show that the proposed method greatly improves detection performance. The new dataset is available at https://github.com/Wukong90.