Multinomial logistic regression is a useful tool for classification among multiple categories, but posterior sampling in Bayesian implementations is computationally burdensome when the number of categories is large. In this paper, we show that an appropriate data augmentation technique provides faster posterior sampling than alternatives in the literature. This speedup comes from two sources: simpler posterior conditional distributions on the coefficients and the ability to parallelize parameter draws. In simulation studies, we demonstrate that the effective sampling rate of our posterior sampling approach is double that of competing methods when working with a large number of categories, even without parallelized computations. Furthermore, the computation time increases only linearly as the number of categories increases. Our corresponding R package is available on GitHub.