The KeOps library provides fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula, such as kernel and distance matrices. KeOps alleviates the major bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption. It also supports automatic differentiation and outperforms standard GPU baselines, including PyTorch CUDA tensors and the Halide and TVM libraries. KeOps combines optimized C++/CUDA schemes with bindings for high-level languages: Python (NumPy and PyTorch), Matlab and GNU R. As a result, high-level "quadratic" code can now scale up to large data sets, with millions of samples processed in seconds. KeOps brings graphics-like performance to kernel methods and is freely available on standard repositories (PyPI, CRAN). To showcase its versatility, we provide tutorials in a wide range of settings online at \url{www.kernel-operations.io}.
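The memory bottleneck named in the abstract can be illustrated with a small sketch. A Gaussian kernel matrix between M and N points has M x N entries, so storing it in full quickly exhausts memory, whereas each entry is cheap to recompute from its formula. The code below is a plain NumPy illustration of that idea, reducing the matrix block by block so that only a slice of it exists at any time. It is not the KeOps API or its implementation; the function name `gaussian_kernel_matvec` and the parameters `sigma` and `block` are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_kernel_matvec(x, y, b, sigma=1.0, block=1024):
    """Return K @ b with K[i, j] = exp(-|x_i - y_j|^2 / (2 sigma^2)).

    The full (M, N) matrix K is never stored: rows are processed in
    blocks of size `block`, so peak extra memory is O(block * N).
    """
    m = x.shape[0]
    out = np.zeros((m,) + b.shape[1:], dtype=np.result_type(x, b))
    y_sq = (y ** 2).sum(axis=1)  # squared norms of y_j, shape (N,)

    for start in range(0, m, block):
        xi = x[start:start + block]  # current block of rows
        # Squared distances via |x|^2 - 2 x.y + |y|^2, shape (block, N).
        sq = (xi ** 2).sum(axis=1)[:, None] - 2.0 * (xi @ y.T) + y_sq[None, :]
        np.maximum(sq, 0.0, out=sq)  # clip tiny negatives from round-off
        # Evaluate the kernel on this block and reduce it immediately.
        out[start:start + block] = np.exp(-sq / (2.0 * sigma ** 2)) @ b

    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((5000, 3))
    y = rng.standard_normal((4000, 3))
    b = rng.standard_normal((4000, 2))
    a = gaussian_kernel_matvec(x, y, b, block=512)
    print(a.shape)
```

The blockwise loop trades a little recomputation for a memory footprint that grows linearly rather than quadratically with the number of samples, which is the trade-off the abstract attributes to formula-defined tensors.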