Submodular maximization is a general optimization problem with a wide range of applications in machine learning (e.g., active learning, clustering, and feature selection). In large-scale optimization, the parallel running time of an algorithm is governed by its adaptivity, which measures the number of sequential rounds needed when the algorithm can execute polynomially many independent oracle queries in parallel. While low adaptivity is ideal, it is not sufficient for an algorithm to be efficient in practice: in many applications of distributed submodular optimization, the sheer number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of submodular maximization. In this paper, we give the first algorithm for maximizing a non-monotone submodular function subject to a cardinality constraint $k$ that achieves a constant-factor approximation, runs in $O(\log(n))$ adaptive rounds, and makes $O(n \log(k))$ oracle queries in expectation. In our empirical study, we use three real-world applications to compare our algorithm with several benchmarks for non-monotone submodular maximization. The results demonstrate that our algorithm finds competitive solutions using significantly fewer rounds and queries.
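To make the adaptivity measure concrete, here is a minimal Python sketch; it is not the paper's algorithm, and the coverage objective and helper names are illustrative assumptions. It shows why a batch of mutually independent marginal-gain queries counts as a single adaptive round no matter how many oracle queries the batch contains.

```python
import random

def make_coverage_oracle(sets):
    # Coverage is a classic example of a submodular set function:
    # f(S) = number of ground elements covered by the sets indexed by S.
    def f(S):
        covered = set()
        for i in S:
            covered |= sets[i]
        return len(covered)
    return f

def one_adaptive_round(f, candidates, S):
    # All marginal-gain queries in this batch are mutually independent:
    # none depends on another's answer, so the whole batch could be issued
    # to the oracle in parallel and counts as ONE adaptive round, even
    # though it makes len(candidates) oracle queries.
    base = f(S)
    return {e: f(S | {e}) - base for e in candidates if e not in S}

if __name__ == "__main__":
    random.seed(0)
    # Toy instance: 20 candidate sets over a ground set of 50 elements.
    sets = {i: set(random.sample(range(50), 8)) for i in range(20)}
    f = make_coverage_oracle(sets)
    gains = one_adaptive_round(f, set(sets), set())
    print(sorted(gains.items(), key=lambda kv: -kv[1])[:5])
```

An algorithm's adaptivity is then the number of such dependent batches it issues in sequence; the $O(\log(n))$ bound in the abstract limits exactly this quantity, while the $O(n \log(k))$ bound limits the total number of oracle queries across all batches.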