We study the relationship between an algorithm's generalization and the mutual information between its output model and the empirical sample, in the setting of stochastic convex optimization. Despite growing interest in information-theoretic generalization bounds, it remains unclear whether such bounds can explain the strong performance of common learning algorithms. We show that in stochastic convex optimization, any learner that minimizes the true risk must carry mutual information with the sample that grows with the problem dimension. Consequently, existing information-theoretic generalization bounds fall short of capturing the generalization of algorithms such as SGD and regularized ERM, whose sample complexity is dimension-independent.
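For reference, a canonical bound of the kind discussed here is the mutual-information generalization bound of Xu and Raginsky (2017). Stated in notation introduced only for illustration, with $W$ the output model, $S$ a sample of $n$ i.i.d. points, $L_{\mathcal{D}}$ and $L_S$ the population and empirical risks, and the loss $\ell(w,z)$ assumed $\sigma$-subgaussian for every $w$:

\[
\left| \mathbb{E}\!\left[ L_{\mathcal{D}}(W) - L_S(W) \right] \right| \;\le\; \sqrt{\frac{2\sigma^2\, I(W;S)}{n}}.
\]

Under this form of bound, if $I(W;S)$ necessarily scales with the dimension $d$, as claimed above for true risk minimization, the resulting guarantee cannot certify the dimension-independent sample complexity enjoyed by SGD and regularized ERM.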