The Word Mover's Distance (WMD) measures the semantic dissimilarity between two text documents by computing the cost of optimally moving all words of a source/query document to the most similar words of a target document. Computing WMD between two documents is costly because it requires solving an $O(V^3 \log V)$ optimization problem, where $V$ is the number of unique words in the document. Fortunately, WMD can be framed as an Earth Mover's Distance (EMD), for which the algorithmic complexity can be reduced to $O(V^2)$ by adding an entropy penalty to the optimization problem and solving it with the Sinkhorn-Knopp algorithm. Additionally, the computation can be made highly parallel by adopting a batching approach, i.e., computing the WMD of a single query document against multiple target documents at once. Sinkhorn WMD is a key kernel in many ML/NLP applications and is usually implemented in Python. However, a straightforward Python implementation may leave significant performance on the table, even though it may internally call optimized C++ BLAS routines. We present \emph{PASWD}: a new sparse {P}arallel {A}lgorithm for {S}inkhorn-Knopp {W}ord-movers {D}istance that computes the semantic distance of one document to many other documents by adopting the $O(V^2)$ EMD algorithm. We algorithmically transform the dense, compute-heavy $O(V^2)$ EMD version into an equivalent sparse one using new fused SDDMM-SpMM (sparse selection of dense-dense matrix multiplication and sparse-dense matrix multiplication) kernels. We implemented and optimized this algorithm for two very different architectures: the new Intel Programmable Integrated Unified Memory Architecture (PIUMA) and Intel Xeon CPUs. We show that our implementation reaches close to peak performance on both platforms.
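To make the batched Sinkhorn-Knopp formulation concrete, the following is a minimal dense NumPy sketch of entropy-regularized WMD for one query document against several target documents. It is an illustration under stated assumptions, not the PASWD implementation: the function name `sinkhorn_wmd_dense`, the parameters `lam` and `max_iter`, and the toy embeddings are all hypothetical. The comments mark where a sparse variant could plausibly restrict work to the non-zero entries of the target matrix, which is the role the abstract assigns to the fused SDDMM-SpMM kernels.

```python
import numpy as np


def sinkhorn_wmd_dense(r, C, M, lam=1.0, max_iter=50):
    """Entropy-regularized WMD of one query against N targets (dense sketch).

    r   : (v_r,)   normalized word histogram of the query (non-zero words only)
    C   : (V, N)   normalized word histograms of N targets, one per column
    M   : (v_r, V) distances between query words and all vocabulary words
    lam : weight of the entropy penalty (hypothetical default)
    """
    K = np.exp(-lam * M)            # Gibbs kernel, strictly positive
    K_over_r = K / r[:, None]       # row i scaled by 1 / r[i]
    KM = K * M                      # used once, for the final cost
    x = np.full((r.shape[0], C.shape[1]), 1.0 / r.shape[0])

    for _ in range(max_iter):
        u = 1.0 / x
        # Dense here. Only entries where C is non-zero matter, so a sparse
        # variant could evaluate K.T @ u just at those positions (SDDMM-like).
        v = C / (K.T @ u)
        # v inherits the sparsity pattern of C, so this product could be a
        # sparse-dense multiplication (SpMM-like) in a sparse variant.
        x = K_over_r @ v

    u = 1.0 / x
    v = C / (K.T @ u)
    # Transport cost per target: sum_ij u_i * K_ij * M_ij * v_j
    return (u * (KM @ v)).sum(axis=0)


# Toy example with made-up 2-D "embeddings" for a 4-word vocabulary.
emb = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
query_words = np.array([0, 1])          # the query uses words 0 and 1
r = np.array([0.5, 0.5])
C = np.array([[0.5, 0.0],               # target 0 uses the same words
              [0.5, 0.0],
              [0.0, 0.5],               # target 1 uses distant words
              [0.0, 0.5]])
M = np.linalg.norm(emb[query_words][:, None, :] - emb[None, :, :], axis=2)

distances = sinkhorn_wmd_dense(r, C, M)
print(distances)
```

In this toy setup the first target shares the query's words and the second uses only distant ones, so the first distance comes out smaller than the second. Because of the entropy penalty, the distance from the query to an identical document is small but not exactly zero.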