Max sliced Wasserstein (Max-SW) distance is widely known as a remedy for the redundant projections of the sliced Wasserstein (SW) distance. In applications that involve many independent pairs of probability measures, amortized projection optimization is used to predict the "max" projecting direction for each pair of input measures, instead of running projected gradient ascent separately for every pair. Although efficient, the current framework has two issues. First, it violates the permutation invariance and symmetry properties. To address this, we propose designing amortized models based on the self-attention architecture, and we adopt efficient self-attention architectures so that the computation is linear in the number of supports. Second, Max-SW and its amortized version cannot guarantee metricity because of the sub-optimality of projected gradient ascent and the amortization gap. We therefore propose replacing Max-SW with the distributional sliced Wasserstein distance with a von Mises-Fisher (vMF) projecting distribution (v-DSW). Since v-DSW is a metric for any non-degenerate vMF distribution, its amortized version retains metricity when predicting the most discriminative projecting distribution. Combining the two improvements, we derive self-attention amortized distributional projection optimization and show its appealing performance in point-cloud reconstruction and its downstream applications.
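For readers unfamiliar with the two baseline quantities, the following is a minimal NumPy sketch of SW (Monte Carlo over random directions) and Max-SW (projected gradient ascent on the unit sphere). It is an illustration under stated assumptions, not the authors' implementation: both measures are taken to be uniform empirical measures with the same number of supports, the ground cost is squared Euclidean, and the function names, step size, and iteration count are hypothetical choices. The amortized and self-attention components described in the abstract are not covered here.

```python
import numpy as np


def projected_w2_sq(X, Y, theta):
    """Squared 1D Wasserstein-2 distance between X and Y projected onto theta.

    Assumes X and Y have the same number of rows (uniform weights), so the
    optimal 1D coupling simply matches sorted projections.
    """
    x_proj = np.sort(X @ theta)
    y_proj = np.sort(Y @ theta)
    return float(np.mean((x_proj - y_proj) ** 2))


def sliced_wasserstein(X, Y, n_projections=100, seed=0):
    """Monte Carlo estimate of SW_2 using directions drawn uniformly on the sphere."""
    rng = np.random.default_rng(seed)
    thetas = rng.normal(size=(n_projections, X.shape[1]))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    values = [projected_w2_sq(X, Y, t) for t in thetas]
    return float(np.sqrt(np.mean(values)))


def max_sliced_wasserstein(X, Y, n_steps=50, lr=0.1, seed=0):
    """Approximate Max-SW_2 by projected gradient ascent over the direction theta.

    The result is only a lower bound on the true maximum, which is the
    sub-optimality issue the abstract refers to.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=X.shape[1])
    theta /= np.linalg.norm(theta)
    best_theta, best_val = theta, projected_w2_sq(X, Y, theta)
    for _ in range(n_steps):
        # With the sorting permutations held fixed, the objective is quadratic in theta.
        ix = np.argsort(X @ theta)
        iy = np.argsort(Y @ theta)
        diff_pts = X[ix] - Y[iy]  # matched support differences, shape (n, d)
        grad = 2.0 * np.mean((diff_pts @ theta)[:, None] * diff_pts, axis=0)
        theta = theta + lr * grad          # ascent step
        theta /= np.linalg.norm(theta)     # project back onto the unit sphere
        val = projected_w2_sq(X, Y, theta)
        if val > best_val:
            best_theta, best_val = theta, val
    return float(np.sqrt(best_val)), best_theta
```

Because the projections are sorted before matching, both functions are invariant to the order of the supports and symmetric in their two arguments; those are the properties the abstract says an amortized predictor should also preserve.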