Efficient Reporting of Top-k Subset Sums

The Subset Sum problem is a well-known NP-complete problem. This work considers a top-k variation of it. The problem has wide application in recommendation systems, where instead of the k best objects, the k best subsets of objects with the lowest (or highest) overall scores are required. Given an input set R of n real numbers and a positive integer k, the goal is to generate the k best subsets of R, that is, the k subsets whose element sums are smallest. The solution is based on constructing a metadata structure G for a given n. Each node of G stores a bit vector of size n from which a subset of R can be retrieved. It is shown that constructing the whole graph G is unnecessary: to answer a query, it suffices to implicitly traverse only the required portion of G on demand. This eliminates the preprocessing step and thereby reduces the overall time and space requirements. A modified algorithm is then proposed that generates each subset incrementally, and it is shown that the bit vector need not be stored explicitly. This improves both the space requirement and the asymptotic time complexity. Finally, a variation of the algorithm that reports only the top-k subset sums is compared with an existing algorithm; the comparison shows that the proposed algorithm performs better in both time and space by a constant factor.
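The abstract does not spell out the structure of G or the traversal order, so the following is only a minimal sketch of a standard heap-based enumeration of the k smallest subset sums, not necessarily the authors' algorithm. The function name `top_k_subset_sums` is hypothetical, and the sketch assumes that the empty subset counts as a subset with sum 0. It illustrates the general idea of exploring a search space on demand while keeping only a running sum and an index per state instead of a full bit vector.

```python
import heapq


def top_k_subset_sums(values, k):
    """Return the k smallest subset sums of `values` in non-decreasing order.

    Assumption (not from the source): the empty subset is included, sum 0.
    """
    if k <= 0:
        return []
    # Taking every negative number gives the smallest possible subset sum.
    base = sum(v for v in values if v < 0)
    # Any other subset differs from that one by toggling some elements;
    # toggling an element v raises the sum by |v|.
    costs = sorted(abs(v) for v in values)
    n = len(costs)

    result = [base]
    if n == 0:
        return result

    # Each heap entry is (extra cost over base, index of last toggled element).
    heap = [(costs[0], 0)]
    while heap and len(result) < k:
        total, i = heapq.heappop(heap)
        result.append(base + total)
        if i + 1 < n:
            # Successor 1: additionally toggle element i + 1.
            heapq.heappush(heap, (total + costs[i + 1], i + 1))
            # Successor 2: toggle element i + 1 instead of element i.
            heapq.heappush(heap, (total - costs[i] + costs[i + 1], i + 1))
    return result
```

For example, `top_k_subset_sums([1, 2, 3], 5)` returns `[0, 1, 2, 3, 3]`. Each pop pushes at most two successors, so the heap never holds more than about k entries, and no part of the search space beyond what the k answers require is ever materialized.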
