We study the implicit bias of flow matching (FM) samplers via the lens of empirical flow matching. Although population FM may produce gradient-field velocities resembling optimal transport (OT), we show that the empirical FM minimizer is generally not a gradient field, even when each conditional flow is. Consequently, empirical FM is intrinsically not OT-optimal in the Benamou-Brenier sense. In view of this, we analyze the kinetic energy of generated samples. With Gaussian sources, both instantaneous and integrated kinetic energies exhibit exponential concentration, while heavy-tailed sources lead to polynomial tails. These behaviors are governed primarily by the choice of source distribution rather than the data. Overall, these notes provide a concise mathematical account of the structural and energetic biases arising in empirical FM.
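As a rough numerical illustration of the claim that the source distribution governs the energy tails, the sketch below compares kinetic-energy quantiles for a Gaussian and a heavy-tailed source. It rests on simplifying assumptions that are not taken from the notes: straight-line conditional paths x_t = (1 - t) x0 + t x1, a hypothetical four-point toy dataset, and a Student-t source with 3 degrees of freedom as the heavy-tailed example. It measures the energy of the conditional paths only, which is a proxy for, not a reproduction of, the sample trajectories of the empirical FM minimizer analyzed here.

```python
import numpy as np


def conditional_kinetic_energy(x0, x1):
    """Integrated kinetic energy of the straight path x_t = (1-t)*x0 + t*x1.

    The velocity x1 - x0 is constant in t, so integrating 0.5*|v|^2 over
    t in [0, 1] gives 0.5*|x1 - x0|^2.
    """
    return 0.5 * np.sum((x1 - x0) ** 2, axis=-1)


rng = np.random.default_rng(0)
n, d = 200_000, 2

# Hypothetical toy dataset (an assumption for illustration): four fixed
# points, resampled uniformly to play the role of the empirical target.
data = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]])
x1 = data[rng.integers(0, len(data), size=n)]

# Two source choices: standard Gaussian vs. Student-t (df=3) per coordinate.
x0_gauss = rng.standard_normal((n, d))
x0_heavy = rng.standard_t(df=3.0, size=(n, d))

e_gauss = conditional_kinetic_energy(x0_gauss, x1)
e_heavy = conditional_kinetic_energy(x0_heavy, x1)

# Compare the median with upper quantiles to see how fast the tail grows.
for name, e in [("gaussian", e_gauss), ("student-t(3)", e_heavy)]:
    q50, q99, q999 = np.quantile(e, [0.5, 0.99, 0.999])
    print(f"{name:>12}: median={q50:.2f}  q99={q99:.2f}  q99.9={q999:.2f}")
```

Under these assumptions the data points are identical in both runs, so any difference in the upper quantiles comes from the source alone. One would expect the Student-t quantiles to grow much faster than the Gaussian ones, in line with the exponential-versus-polynomial distinction described above.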