We study the question of how well machine learning (ML) models trained on a given data set protect the privacy of that training data, or equivalently, whether it is possible to reverse-engineer the training data from a given ML model. While this is easy to answer negatively in the most general case, it is interesting to note that the protection goes beyond mere non-recoverability towards plausible deniability: Given an ML model $f$, we show that one can take a set of purely random training data and from it define a suitable ``learning rule'' that produces an ML model equal to exactly $f$. Thus, any speculation about which data was used to train $f$ can be denied with the claim that any other data could have led to the same result. We corroborate our theoretical finding with practical examples and open-source implementations of how to find such learning rules for a chosen set of training data.
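To make the deniability argument concrete, the following is a minimal, deliberately degenerate sketch (not the paper's actual construction; the names `learning_rule`, `f_weights`, and the choice of a linear model are illustrative assumptions): the "learning rule" is ordinary gradient descent on a squared loss, but its hyperparameters (initialization at the parameters of the given model $f$ and a step count of zero) are chosen so that training on any data set, including purely random data, returns exactly $f$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical given model f: a fixed linear predictor whose training data is in question.
f_weights = rng.normal(size=5)

def f(x):
    """The given ML model f."""
    return x @ f_weights

# Purely random "training data" that was never actually used to obtain f.
X_random = rng.normal(size=(100, 5))
y_random = rng.normal(size=100)

def learning_rule(X, y, init=f_weights, learning_rate=0.0, steps=0):
    """A formally valid but degenerate learning rule: gradient descent on a
    squared loss, with initialization and step count chosen so that its
    output coincides with f regardless of the supplied data set."""
    w = init.copy()
    for _ in range(steps):
        grad = 2 * X.T @ (X @ w - y) / len(X)
        w -= learning_rate * grad
    return w

# Applying the rule to random data reproduces f exactly, so the random data
# is a "plausible" training set for f under this rule.
w_recovered = learning_rule(X_random, y_random)
assert np.allclose(w_recovered, f_weights)
assert np.allclose(X_random @ w_recovered, f(X_random))
print("learning rule applied to random data reproduces f:", np.allclose(w_recovered, f_weights))
```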