Determinantal point processes (DPPs) have attracted significant attention in machine learning for their ability to model subsets drawn from a large item collection. Recent work shows that nonsymmetric DPP (NDPP) kernels have significant advantages over symmetric kernels in terms of modeling power and predictive performance. However, for an item collection of size $M$, existing NDPP learning and inference algorithms require memory quadratic in $M$ and runtime cubic (for learning) or quadratic (for inference) in $M$, making them impractical for many typical subset selection tasks. In this work, we develop a learning algorithm with space and time requirements linear in $M$ by introducing a new NDPP kernel decomposition. We also derive a linear-complexity NDPP maximum a posteriori (MAP) inference algorithm that applies not only to our new kernel but also to that of prior work. Through evaluation on real-world datasets, we show that our algorithms scale significantly better, and can match the predictive performance of prior work.
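To make the complexity claims concrete, the sketch below shows how a low-rank nonsymmetric kernel avoids ever materializing the $M \times M$ matrix $L$. It assumes a decomposition of the form $L = VV^\top + BCB^\top$ with $V, B \in \mathbb{R}^{M \times K}$ and skew-symmetric $C \in \mathbb{R}^{K \times K}$; this is one plausible low-rank instantiation, since the abstract does not spell out the new decomposition, and the helper names (`log_likelihood`, `greedy_map`) are illustrative rather than taken from the paper. The normalizer uses Sylvester's determinant identity, $\det(I_M + UWU^\top) = \det(I_{2K} + WU^\top U)$, so likelihood evaluation costs $O(MK^2)$, and the naive greedy MAP loop touches each item once per step, so it is also linear in $M$.

```python
import numpy as np

# Illustrative low-rank nonsymmetric kernel: L = V V^T + B C B^T, with C
# skew-symmetric so that B C B^T is the nonsymmetric (skew) part of L.
# M is the item collection size, K << M the rank; all names are hypothetical.
rng = np.random.default_rng(0)
M, K = 10_000, 20
V = rng.normal(size=(M, K)) / np.sqrt(K)
B = rng.normal(size=(M, K)) / np.sqrt(K)
A = rng.normal(size=(K, K))
C = A - A.T  # skew-symmetric: C^T = -C

# Stack the two components so that L = U W U^T with U of width 2K.
U = np.hstack([V, B])                                  # M x 2K
W = np.block([[np.eye(K), np.zeros((K, K))],
              [np.zeros((K, K)), C]])                  # 2K x 2K

def log_likelihood(subset):
    """log P(subset) = log det(L_S) - log det(L + I), in O(M K^2) time.

    Sylvester's identity gives det(I_M + U W U^T) = det(I_{2K} + W U^T U),
    so the M x M kernel is never formed.
    """
    U_S = U[subset]                                    # |S| x 2K
    sign, logdet_num = np.linalg.slogdet(U_S @ W @ U_S.T)
    assert sign > 0, "subset determinant must be positive"
    _, logdet_den = np.linalg.slogdet(np.eye(2 * K) + W @ (U.T @ U))
    return logdet_num - logdet_den

def greedy_map(budget):
    """Naive greedy MAP sketch: add the item that most increases det(L_S).

    Each step scans all M items and evaluates a determinant of size at
    most `budget`, so the cost per step is linear in M.
    """
    S = []
    for _ in range(budget):
        scores = np.full(M, -np.inf)
        for i in range(M):
            if i in S:
                continue
            rows = U[S + [i]]
            scores[i] = np.linalg.det(rows @ W @ rows.T)
        S.append(int(np.argmax(scores)))
    return S

print(log_likelihood([3, 17, 256]))
print(greedy_map(5))
```

Gradient-based learning of $V$, $B$, and $C$ can differentiate through the same two log-determinants, which is where a linear-in-$M$ learning cost would come from; the paper's actual learning and MAP algorithms may organize these computations differently.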