Clustering is an essential primitive in unsupervised machine learning. We introduce the problem of sublinear-time differentially private clustering as a natural and well-motivated research direction. We combine the sublinear-time $k$-means and $k$-median results of Mishra et al. (SODA, 2001) and of Czumaj and Sohler (Rand. Struct. and Algorithms, 2007) with recent results on private clustering by Balcan et al. (ICML, 2017), Gupta et al. (SODA, 2010), and Ghazi et al. (NeurIPS, 2020) to obtain sublinear-time private $k$-means and $k$-median algorithms via subsampling. We also investigate the privacy benefits of subsampling for group privacy.
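To make the subsampling idea concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes a caller-supplied differentially private clustering routine (the hypothetical `private_clusterer` argument) and runs it on a uniform random sample drawn without replacement. The helper `amplified_epsilon` encodes the standard privacy-amplification-by-subsampling bound from the differential privacy literature: an $\varepsilon$-DP mechanism applied to a sample with sampling rate $q$ (for example $q = m/n$ when $m$ of $n$ records are drawn without replacement) satisfies $\ln(1 + q(e^{\varepsilon} - 1))$-DP. All function and parameter names here are illustrative assumptions, and the sketch says nothing about the group-privacy bounds the paper studies.

```python
import math
import random
from typing import Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def amplified_epsilon(eps: float, q: float) -> float:
    """Standard amplification-by-subsampling bound: ln(1 + q * (e^eps - 1)).

    eps is the privacy parameter of the base mechanism and q is the
    sampling rate in [0, 1]. This is a textbook bound, not a result
    taken from the paper.
    """
    if eps < 0.0:
        raise ValueError("eps must be non-negative")
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    # log1p/expm1 keep the computation accurate for small eps and q.
    return math.log1p(q * math.expm1(eps))


def cluster_on_subsample(
    points: Sequence[Point],
    sample_size: int,
    private_clusterer: Callable[[List[Point]], List[Point]],
    seed: Optional[int] = None,
) -> List[Point]:
    """Run a (hypothetical) private clusterer on a uniform subsample.

    The clusterer only sees sample_size points, so its running time
    does not depend on len(points) beyond the cost of sampling.
    """
    if not 0 < sample_size <= len(points):
        raise ValueError("sample_size must be in 1..len(points)")
    rng = random.Random(seed)
    # Uniform sampling without replacement.
    sample = rng.sample(list(points), sample_size)
    return private_clusterer(sample)
```

For instance, with a base mechanism at $\varepsilon = 1$ and a sampling rate of $q = 0.1$, `amplified_epsilon(1.0, 0.1)` evaluates to a value well below 1, which illustrates why subsampling can help privacy as well as running time.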