A compression function is a map that slims an observational set down to a subset of reduced size while preserving its informational content. In many applications, the event that a new observation makes the compressed set change is interpreted as meaning that this observation carries extra information; in learning theory, this corresponds to misclassification or misprediction. In this paper, we lay the foundations of a new theory that allows one to keep control over the probability of change of compression (called the "risk"). We identify conditions under which the cardinality of the compressed set is a consistent estimator of the risk (with no upper limit on the size of the compressed set) and prove unprecedentedly tight bounds for evaluating the risk under a generally applicable condition of preference. All results hold in a fully agnostic setup, without requiring any a priori knowledge of the probability distribution of the observations. Not only do these results offer valid support for developing trust in observation-driven methodologies, they also play a fundamental role in learning techniques as a tool for hyper-parameter tuning.
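As an illustrative sketch only (not taken from the paper), the notions above can be made concrete with a toy compression function on scalar observations: the map that keeps just the minimum and maximum of the set. A new observation changes the compression exactly when it falls outside the current range, and the cardinality of the compressed set divided by the sample size gives a simple estimate of that probability. All names and parameters below are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def compress(points):
    """Toy compression function: keep only the min and max of a set of
    scalar observations (illustrative assumption, not the paper's setup)."""
    return {points.min(), points.max()}

N = 1000        # number of observations per sample (assumed value)
trials = 20000  # Monte Carlo trials to estimate the probability of change

changes = 0
for _ in range(trials):
    sample = rng.uniform(size=N)
    compressed = compress(sample)
    new_obs = rng.uniform()
    # The compression changes iff the new observation falls outside [min, max].
    if compress(np.append(sample, new_obs)) != compressed:
        changes += 1

risk_mc = changes / trials                           # empirical probability of change
risk_hat = len(compress(rng.uniform(size=N))) / N    # cardinality-based estimate k/N

print(f"Monte Carlo risk  ~ {risk_mc:.4f}")   # about 2/(N+1) = 0.0020
print(f"k/N estimate      ~ {risk_hat:.4f}")  # 2/N = 0.0020
```

In this toy case the two numbers agree closely, which mirrors, in the simplest possible setting, the kind of consistency between the cardinality of the compressed set and the risk that the paper studies in full generality.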