Statistical machine learning theory often seeks to provide generalization guarantees for machine learning models. Such models are naturally subject to fluctuation, since they are built from a data sample: if we are unlucky and gather a sample that is not representative of the underlying distribution, we cannot expect to construct a reliable machine learning model. Consequently, statements about the performance of machine learning models must take the sampling process into account. The two common approaches are to make statements that hold either with high probability or in expectation over the random sampling process. In this short note we show how one may transform one type of statement into the other. As a technical novelty we address the case of unbounded loss functions, where we use a fairly recent assumption called the witness condition.
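As a brief illustration of one direction of such a transformation (a standard textbook step with illustrative symbols $\Delta(S)$, $\varepsilon$, $\delta$, not a restatement of this note's results): if a nonnegative excess loss $\Delta(S) \ge 0$ satisfies the in-expectation guarantee $\mathbb{E}_S[\Delta(S)] \le \varepsilon$, then Markov's inequality yields, for any $\delta \in (0,1)$,
\[
\Pr_S\big[\Delta(S) \ge \varepsilon/\delta\big] \;\le\; \frac{\mathbb{E}_S[\Delta(S)]}{\varepsilon/\delta} \;\le\; \delta,
\]
that is, $\Delta(S) < \varepsilon/\delta$ with probability at least $1-\delta$ over the sample. The converse direction typically integrates the tail bound via $\mathbb{E}_S[\Delta(S)] = \int_0^\infty \Pr_S[\Delta(S) \ge t]\,dt$, which requires control of the tail at every level; this is where boundedness of the loss, or an assumption such as the witness condition, becomes relevant.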