Information-theoretic measures have been widely adopted in the design of features for learning and decision problems. Inspired by this, we look at the relationship between i) a weak form of information loss in the Shannon sense and ii) the operation loss in the minimum probability of error (MPE) sense when considering a family of lossy continuous representations (features) of a continuous observation. We present several results that shed light on this interplay. Our first result offers a lower bound on a weak form of information loss as a function of its respective operation loss when adopting a discrete lossy representation (quantization) instead of the original raw observation. From this, our main result shows that a specific form of vanishing information loss (a weak notion of asymptotic informational sufficiency) implies a vanishing MPE loss (or asymptotic operational sufficiency) when considering a general family of lossy continuous representations. Our theoretical findings support the observation that the selection of feature representations that attempt to capture informational sufficiency is appropriate for learning, but this selection is a rather conservative design principle if the intended goal is achieving MPE in classification. Supporting this last point, and under some structural conditions, we show that it is possible to adopt an alternative notion of informational sufficiency (strictly weaker than pure sufficiency in the mutual information sense) to achieve operational sufficiency in learning.