Deep hashing has been widely applied in large-scale data retrieval due to its superior retrieval efficiency and low storage cost. However, data are often scattered across data silos and subject to privacy concerns, so centralized data storage and retrieval is not always possible. Leveraging federated learning (FL) to perform deep hashing is a recent research trend. However, existing frameworks mostly rely on aggregating local deep hashing models, each of which is trained by similarity learning on skewed local data only. As a result, they do not work well for non-IID clients in a real federated environment. To overcome these challenges, we propose a novel federated hashing framework that enables participating clients to jointly train a shared deep hashing model by leveraging prototypical hash codes for each class. Globally, transmitting global prototypes with only one prototypical hash code per class minimizes communication cost and privacy risk. Locally, the use of the global prototypes is maximized by jointly training a discriminator network and the local hashing network. Extensive experiments on benchmark datasets demonstrate that our method can significantly improve the performance of the deep hashing model in federated environments with non-IID data distributions.
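To make the idea of "one prototypical hash code per class" concrete, the following is a minimal sketch of how per-class prototypes might be formed and combined. The abstract does not specify the aggregation rule, so everything here is an assumption for illustration: the function names `local_prototypes` and `aggregate_prototypes` are hypothetical, each client is assumed to average its relaxed (real-valued) hash outputs per class, and the server is assumed to take a count-weighted sum followed by sign binarization. The discriminator-based local training is not shown.

```python
def local_prototypes(codes, labels):
    """Client side (assumed): average relaxed hash outputs per class.

    Returns {class_id: (mean_vector, sample_count)}.
    """
    sums, counts = {}, {}
    for code, label in zip(codes, labels):
        if label not in sums:
            sums[label] = [0.0] * len(code)
            counts[label] = 0
        sums[label] = [s + c for s, c in zip(sums[label], code)]
        counts[label] += 1
    return {k: ([s / counts[k] for s in v], counts[k]) for k, v in sums.items()}


def aggregate_prototypes(client_protos):
    """Server side (assumed): count-weighted sum per class, then sign.

    The sign of the weighted sum equals the sign of the weighted mean,
    so no division is needed before binarization.
    """
    sums = {}
    for protos in client_protos:
        for label, (mean, n) in protos.items():
            if label not in sums:
                sums[label] = [0.0] * len(mean)
            sums[label] = [s + n * m for s, m in zip(sums[label], mean)]
    # One binary code in {-1, +1} per class; a zero entry maps to +1.
    return {k: [1 if s >= 0 else -1 for s in v] for k, v in sums.items()}


# Two clients with skewed (non-IID) class coverage and 2-bit codes.
client_a = local_prototypes([[0.9, -0.8], [0.7, -0.6], [-0.5, 0.4]], [0, 0, 1])
client_b = local_prototypes([[-0.2, 0.1]], [0])
global_protos = aggregate_prototypes([client_a, client_b])
```

In this sketch each client uploads one vector and one count per class it holds, so the prototype payload grows with the number of classes and the code length rather than with the number of local samples.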