Accurately backpropagating gradients through categorical variables is a challenging task that arises in various domains, such as training discrete latent variable models. To this end, we propose CARMS, an unbiased gradient estimator for categorical random variables based on multiple mutually negatively correlated (jointly antithetic) samples. CARMS combines REINFORCE with copula-based sampling to avoid duplicate samples and reduce variance, while importance sampling keeps the estimator unbiased. It generalizes both ARMS, the antithetic estimator for binary variables, which is CARMS with two categories, and LOORF/VarGrad, the leave-one-out REINFORCE estimator, which is CARMS with independent samples. We evaluate CARMS on a generative modeling task and a structured output prediction task across several benchmark datasets, and find that it outperforms competing methods, including a strong self-control baseline. The code is publicly available.
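As a rough illustration of the independent-sample special case mentioned above (LOORF, the leave-one-out REINFORCE estimator), the following is a minimal NumPy sketch for a single categorical variable parameterized by logits. It is not the authors' implementation and does not include the copula-based antithetic sampling or importance weighting that define CARMS. The function names `loorf_gradient` and `exact_gradient`, and the toy objective stored in `f_values`, are assumptions made for illustration only.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    e = np.exp(shifted)
    return e / e.sum()


def loorf_gradient(logits, f_values, n_samples, rng):
    """Leave-one-out REINFORCE estimate of d/d(logits) E_{z ~ Cat(softmax(logits))}[f(z)].

    Illustrative assumption: f is a lookup table, f(z) = f_values[z].
    Samples are drawn independently; n_samples must be at least 2.
    """
    probs = softmax(logits)
    num_categories = probs.shape[0]

    # Independent categorical samples.
    z = rng.choice(num_categories, size=n_samples, p=probs)
    f = f_values[z]

    # Leave-one-out baseline: mean of f over the other n_samples - 1 samples.
    baseline = (f.sum() - f) / (n_samples - 1)

    # Score function of a categorical w.r.t. its logits: onehot(z) - probs.
    score = np.eye(num_categories)[z] - probs

    return ((f - baseline)[:, None] * score).mean(axis=0)


def exact_gradient(logits, f_values):
    """Closed-form gradient of E[f(z)] w.r.t. the logits, for comparison."""
    probs = softmax(logits)
    return probs * (f_values - probs @ f_values)
```

Averaging many calls to `loorf_gradient` should approach `exact_gradient`, since each sample's baseline is independent of that sample and therefore leaves the estimator unbiased. Per the abstract, CARMS replaces the independent draws with jointly antithetic ones to lower the variance further.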