In this paper, we show that the process of continually learning new tasks and memorizing previous tasks introduces unknown privacy risks and challenges in bounding the privacy loss. Based upon this, we introduce a formal definition of Lifelong DP, in which the participation of any data tuple in the training set of any task is protected under a consistently bounded DP guarantee, given a growing stream of tasks. Consistently bounded DP means there is a single fixed value of the DP privacy budget, regardless of the number of tasks. To preserve Lifelong DP, we propose a scalable and heterogeneous algorithm, called L2DP-ML, with streaming batch training, to efficiently train and continually release new versions of an L2M model, given heterogeneity in data sizes and in the training order of tasks, without weakening the DP protection of the private training set. An end-to-end theoretical analysis and thorough evaluations show that our mechanism is significantly better than baseline approaches in preserving Lifelong DP. The implementation of L2DP-ML is available at: https://github.com/haiphanNJIT/PrivateDeepLearning.