We address the challenge of solving machine learning tasks using data from privacy-sensitive sellers. Because the data is private, we design a data market that incentivizes sellers to provide their data in exchange for payments. Our objective is to design a mechanism that optimizes a weighted combination of test loss, seller privacy, and payment, balancing the quality of the privacy-preserving ML model against the payments made to the sellers. To achieve this, we first propose an approach to solve logistic regression with known heterogeneous differential privacy guarantees. Building on these results and leveraging standard mechanism design theory, we develop a two-step optimization framework. We further extend this approach to an online algorithm that handles the sequential arrival of sellers.