Optical character recognition (OCR) technology is widely used in a variety of scenarios, as shown in Figure 1. Designing a practical OCR system remains a meaningful but challenging task. In previous work, balancing efficiency and accuracy, we proposed a practical ultra-lightweight OCR system (PP-OCR) and an optimized version, PP-OCRv2. To further improve on PP-OCRv2, this paper proposes a more robust OCR system, PP-OCRv3. PP-OCRv3 upgrades the text detection and text recognition models of PP-OCRv2 in nine aspects. For the text detector, we introduce LK-PAN, a PAN module with a large receptive field; RSE-FPN, an FPN module with a residual attention mechanism; and a DML distillation strategy. For the text recognizer, we replace the base model CRNN with SVTR and introduce the lightweight text recognition network SVTR_LCNet, attention-guided training of CTC, the data augmentation strategy TextConAug, a better pre-trained model obtained with the self-supervised TextRotNet, UDML, and UIM, in order to accelerate the model and improve its accuracy. Experiments on real data show that the hmean of PP-OCRv3 is 5% higher than that of PP-OCRv2 at comparable inference speed. All of the above models are open-sourced, and the code is available in the GitHub repository PaddleOCR, which is powered by PaddlePaddle.
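For readers unfamiliar with the metric, hmean is the harmonic mean of precision and recall (the F1 score). The following is a minimal sketch of that formula only; the function name and the sample values are illustrative assumptions and are not taken from the PaddleOCR code base or from the paper's results.

```python
def hmean(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (equivalent to the F1 score)."""
    # Guard against division by zero when both inputs are zero.
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values only, not results reported in the paper.
print(round(hmean(0.8, 0.6), 4))  # prints 0.6857
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a system scores a high hmean only when precision and recall are both high.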