While scene text recognition techniques are widely used in commercial applications, the research community has rarely taken data privacy into account. Most existing algorithms assume a shared or centralized set of training data. In practice, however, data may be distributed across local devices and cannot be centralized or shared because of privacy restrictions. In this paper, we study how to use decentralized datasets to train a robust scene text recognizer while keeping the data on local devices. To the best of our knowledge, this is the first framework to leverage federated learning for scene text recognition; because it is trained collaboratively on decentralized datasets, we name it FedOCR. To make FedOCR suitable for deployment on end devices, we make two improvements: using lightweight models and applying hashing techniques. We argue that both are crucial to the communication efficiency of federated learning in FedOCR. Simulations on decentralized datasets show that FedOCR achieves results competitive with models trained on centralized data, with lower communication costs and stronger privacy preservation.
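The collaborative training described above can be illustrated with a minimal sketch. It assumes FedAvg-style aggregation (a sample-weighted average of client parameters), which the abstract does not specify, and it substitutes a toy two-parameter linear model for a scene text recognizer. All function names are hypothetical, and the lightweight-model and hashing improvements are not modeled here.

```python
from typing import List, Tuple

# (x, y) pairs held privately by one client; they never leave the client.
Dataset = List[Tuple[float, float]]


def local_update(weights: List[float], data: Dataset,
                 lr: float = 0.05, epochs: int = 5) -> List[float]:
    """Run a few epochs of gradient descent on one client's private data."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y          # prediction error of y = w*x + b
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return [w, b]


def federated_average(client_weights: List[List[float]],
                      client_sizes: List[int]) -> List[float]:
    """Average client parameters, weighted by local sample counts."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def train_federated(clients: List[Dataset], rounds: int = 300) -> List[float]:
    """Server loop: broadcast weights, collect local updates, aggregate."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        # Only model parameters are exchanged; raw data stays on each client.
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


if __name__ == "__main__":
    # Two clients with disjoint samples drawn from y = 2x + 1.
    clients = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    ]
    print(train_federated(clients))
```

In this sketch only the parameter list crosses the client boundary each round, so the per-round communication cost grows with model size. This is consistent with the abstract's argument that lightweight models matter for communication efficiency.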