The goal of a typical adaptive sequential decision-making problem is to design an interactive policy that selects a group of items sequentially, based on partial observations, so as to maximize the expected utility. It has been shown that the utility functions of many real-world applications, including pool-based active learning and adaptive influence maximization, satisfy the property of adaptive submodularity. However, most existing studies on adaptive submodular maximization focus on the fully adaptive setting, i.e., one must wait for the feedback from \emph{all} past selections before making the next selection. Although this approach takes full advantage of past feedback to make informed decisions, it may take longer to complete the selection process than a non-adaptive solution, in which all selections are made in advance, before any observations take place. In this paper, we explore the problem of partial-adaptive submodular maximization, where one is allowed to make multiple selections in a batch simultaneously and observe their realizations together. Our approach enjoys the benefits of adaptivity while reducing the time spent waiting for the observations from past selections. To the best of our knowledge, no results are known for partial-adaptive policies for the non-monotone adaptive submodular maximization problem. We study this problem under both cardinality and knapsack constraints, and develop effective and efficient solutions for both cases. We also analyze the batch query complexity of our policy, i.e., the number of batches a policy takes to complete the selection process, under some additional assumptions.
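To make the partial-adaptive setting concrete, the following is a minimal sketch of a batched greedy selection loop. It is an illustration only, not the policy developed in the paper: the toy coverage utility (which is monotone, unlike the non-monotone case studied here), the function name `partial_adaptive_greedy`, and the parameters `cover`, `prob`, `batch_size`, and `realize` are all assumptions introduced for this example. Each item succeeds with some probability and, if it does, covers a set of elements. Within a batch, items are scored using only the feedback from batches that have already completed, and the realizations of the whole batch are observed together.

```python
def partial_adaptive_greedy(cover, prob, k, batch_size, realize):
    """Toy batched greedy policy under a cardinality constraint k.

    cover      : dict mapping item id -> set of elements it covers on success
    prob       : dict mapping item id -> success probability (hypothetical model)
    k          : maximum number of items to select
    batch_size : number of items committed before any feedback is observed
    realize    : callable item -> bool, reveals whether the item succeeded
    """
    covered = set()          # elements covered by items observed to succeed
    selected = []            # items chosen so far, in selection order
    remaining = set(cover)   # items not yet selected
    batches = 0              # number of feedback rounds used

    while len(selected) < k and remaining:
        size = min(batch_size, k - len(selected), len(remaining))
        # Score every candidate with the feedback available *before* this
        # batch: expected number of newly covered elements. Ties are broken
        # by item id so the result is deterministic.
        batch = sorted(
            remaining,
            key=lambda i: (-prob[i] * len(cover[i] - covered), i),
        )[:size]

        # Commit the whole batch first, then observe its realizations together.
        for i in batch:
            remaining.discard(i)
            selected.append(i)
        for i in batch:
            if realize(i):
                covered |= cover[i]
        batches += 1

    return selected, covered, batches


if __name__ == "__main__":
    cover = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2}}
    prob = {i: 1.0 for i in cover}
    for b in (1, 2, 3):
        print(b, partial_adaptive_greedy(cover, prob, 3, b, lambda i: True))
```

In this toy instance, `batch_size=1` corresponds to the fully adaptive setting and uses three batches to cover all five elements, while `batch_size=3` corresponds to the non-adaptive setting and covers only four elements in a single batch, because item 3 is committed before it is known to be redundant. The intermediate `batch_size=2` recovers full coverage in two batches, which is the kind of trade-off between feedback and waiting time that the batch query complexity measures.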