We study a two-sided online data ecosystem comprised of an online platform, users on the platform, and downstream learners or data buyers. The learners can buy user data on the platform (to run a statistic or machine learning task). Potential users decide whether to join by looking at the trade-off between i) their benefit from joining the platform and interacting with other users and ii) the privacy costs they incur from sharing their data. First, we introduce a novel modeling element for two-sided data platforms: the privacy costs of the users are endogenous and depend on how much of their data is purchased by the downstream learners. Then, we characterize marketplace equilibria in certain simple settings. In particular, we provide a full characterization in two variants of our model that correspond to different utility functions for the users: i) when each user gets a constant benefit for participating in the platform and ii) when each user's benefit is linearly increasing in the number of other users that participate. In both variants, equilibria in our setting are significantly different from equilibria when privacy costs are exogenous and fixed, highlighting the importance of taking endogeneity in the privacy costs into account. Finally, we provide simulations and semi-synthetic experiments to extend our results to more general assumptions. We experiment with different distributions of users' privacy costs and different functional forms of the users' utilities for joining the platform.
翻译:暂无翻译