This paper shows that dropout training in Generalized Linear Models is the minimax solution of a two-player, zero-sum game in which an adversarial nature corrupts a statistician's covariates using a multiplicative nonparametric errors-in-variables model. In this game, nature's least favorable distribution is dropout noise, where nature independently deletes entries of the covariate vector with some fixed probability $\delta$. This result implies that dropout training indeed provides out-of-sample expected loss guarantees for distributions that arise from multiplicative perturbations of in-sample data. In addition to the decision-theoretic analysis, the paper makes two further contributions. First, it gives a concrete recommendation on how to select the tuning parameter $\delta$ to guarantee that, as the sample size grows large, the in-sample loss after dropout training exceeds the true population loss with some pre-specified probability. Second, the paper provides a novel, parallelizable, Unbiased Multi-Level Monte Carlo algorithm to speed up the implementation of dropout training. Our algorithm has a much lower computational cost than the naive implementation of dropout, provided the number of data points is much smaller than the dimension of the covariate vector.
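The dropout noise described above, where each covariate entry is independently deleted with probability $\delta$, can be sketched as a multiplicative perturbation of the design matrix. This is a minimal illustration, not the paper's implementation; the rescaling of surviving entries by $1/(1-\delta)$ (the common "inverted dropout" convention, which makes the multiplicative noise mean one) is an assumption for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_perturb(X, delta):
    """Multiplicative dropout corruption of covariates.

    Each entry of X is independently set to 0 with probability delta;
    surviving entries are rescaled by 1/(1 - delta) so that the
    multiplicative noise has mean one (inverted-dropout convention,
    assumed here for illustration).
    """
    mask = rng.random(X.shape) >= delta  # True = entry survives
    return X * mask / (1.0 - delta)

# Toy example: with delta = 0.5, each entry of a matrix of ones
# becomes either 0 (deleted) or 2 (survived and rescaled).
X = np.ones((4, 3))
Xd = dropout_perturb(X, delta=0.5)
```

Averaging a Generalized Linear Model's loss over many such perturbed copies of the data is what makes naive dropout training expensive, which is the cost the paper's Unbiased Multi-Level Monte Carlo algorithm targets.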