In recent years we have witnessed an increase on the development of methods for submodular optimization, which have been motivated by the wide applicability of submodular functions in real-world data-science problems. In this paper, we contribute to this line of work by considering the problem of robust submodular maximization against unexpected deletions, which may occur due to privacy issues or user preferences. Specifically, we consider the minimum number of items an algorithm has to remember, in order to achieve a non-trivial approximation guarantee against adversarial deletion of up to $d$ items. We refer to the set of items that an algorithm has to keep before adversarial deletions as a deletion-robust coreset. Our theoretical contributions are two-fold. First, we propose a single-pass streaming algorithm that yields a $(1-2\epsilon)/(4p)$-approximation for maximizing a non-decreasing submodular function under a general p-matroid constraint and requires a coreset of size $k + d/\epsilon$, where $k$ is the maximum size of a feasible solution. To the best of our knowledge, this is the first work to achieve an (asymptotically) optimal coreset, as no constant-factor approximation is possible with a coreset of size sublinear in $d$. Second, we devise an effective offline algorithm that guarantees stronger approximation ratios with a coreset of size $O(d \log(k)/\epsilon)$. We also demonstrate the superior empirical performance of the proposed algorithms in real-life applications.
翻译:近几年来,我们目睹了子模块优化方法的开发有所增长,其动力是子模块功能在现实世界数据科学问题中的广泛适用性。在本文件中,我们通过考虑强力子模块优化问题,防止意外删除(可能由于隐私问题或用户偏好而发生),为有助于这一一行工作。具体地说,我们考虑一个算法必须记住的最起码的项目数量,以便在一般的p-matroid限制下实现非降低子模块功能的最起码数量,并需要一套用于对抗性删除高达美元的项目的非三重近似保证。我们指的是在进行对抗性删除之前,一个算法必须保留的一系列项目,作为删除-robust核心。我们理论上的贡献是两重的。首先,我们建议一种单流流算法的算法算法,在一般的p-matal 限制下,实现非下降性子模块功能的最低数量,并且需要一套以美元为We/\ epslon值的堆积, 其中美元是最高实际的O值应用。