Maximizing a monotone submodular function is a fundamental task in machine learning. In this paper, we study the deletion-robust version of the problem under the classic matroid constraint. Here the goal is to extract a small summary of the dataset that contains a high-value independent set even after an adversary deletes some elements. We present constant-factor approximation algorithms whose space complexity depends on the rank $k$ of the matroid and the number $d$ of deleted elements. In the centralized setting we present a $(3.582+O(\varepsilon))$-approximation algorithm with summary size $O(k + \frac{d \log k}{\varepsilon^2})$. In the streaming setting we provide a $(5.582+O(\varepsilon))$-approximation algorithm with summary size and memory $O(k + \frac{d \log k}{\varepsilon^2})$. We complement our theoretical results with an in-depth experimental analysis showing the effectiveness of our algorithms on real-world datasets.