快速在高尺寸中快速最大产品内部搜索 (Faster Maximum Inner Product Search in High Dimensions)

Maximum Inner Product Search (MIPS) is a popular problem in the machine learning literature due to its applicability in a wide array of applications, such as recommender systems. In high-dimensional settings, however, MIPS queries can become computationally expensive as most existing solutions do not scale well with data dimensionality. In this work, we present a state-of-the-art algorithm for the MIPS problem in high dimensions, dubbed BanditMIPS. BanditMIPS is a randomized algorithm that borrows techniques from multi-armed bandits to reduce the MIPS problem to a best-arm identification problem. BanditMIPS reduces the complexity of state-of-the-art algorithms from $O(\sqrt{d})$ to $O(\text{log}d)$, where $d$ is the dimension of the problem data vectors. On high-dimensional real-world datasets, BanditMIPS runs approximately 12 times faster than existing approaches and returns the same solution. BanditMIPS requires no preprocessing of the data and includes a hyperparameter that practitioners may use to trade off accuracy and runtime. We also propose a variant of our algorithm, named BanditMIPS-$\alpha$, which employs non-uniform sampling across the data dimensions to provide further speedups.

翻译：在机器学习文献中,最大产品搜索(MIPS)是一个广受欢迎的问题,因为它适用于各种应用,例如建议系统。但是,在高维环境中,MIPS查询在计算上会变得非常昂贵,因为大多数现有解决方案与数据维度不相称。在这项工作中,我们为高维的MIPS问题提出了一个最先进的算法,称为Bandbbed BanditMIPS。BanditMIPS是一种随机化算法,从多武装土匪那里借用技术,将MIPS问题降低到最佳武器识别问题。BanditMIPS将最先进的算法的复杂性从$O(sqrt{d})降低到$O(text{log}d)到$(美元),而美元是问题数据矢量的维度。在高维度真实世界数据集上,BanditMIPS运行速度比现有方法快大约12倍,并返回同一解决办法。BanditMIPS不需要预先处理数据,并且包括一个超常数的仪,从业者可以用来转换为不精确度和运行速度。我们还定式的BMSMIS。