Compact data representations are one approach for improving generalization of learned functions. We explicitly illustrate the relationship between entropy and cardinality, both measures of compactness, including how gradient descent on the former reduces the latter. Whereas entropy is distribution-sensitive, cardinality is not. We propose a third compactness measure that is a compromise between the two: expected cardinality, the expected number of unique states observed in a finite number of draws. It is more meaningful than standard cardinality because it discounts states with negligible probability mass. We show that minimizing entropy also minimizes expected cardinality.
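The three compactness measures can be compared on a small example. The sketch below is illustrative only and rests on assumptions not stated in the abstract: draws are taken to be i.i.d. from a discrete distribution p, so the expected number of unique states in n draws is the sum over states of 1 - (1 - p_i)^n. The function names and the two example distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in nats; zero-probability states contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


def cardinality(p):
    """Number of states with nonzero probability, regardless of how small."""
    return sum(1 for x in p if x > 0)


def expected_cardinality(p, n):
    """Expected number of distinct states seen in n i.i.d. draws (assumed setup).

    A state with probability x is missed by all n draws with probability
    (1 - x) ** n, so it is observed at least once with probability
    1 - (1 - x) ** n. Linearity of expectation gives the sum below.
    """
    return sum(1 - (1 - x) ** n for x in p)


# Hypothetical distributions over four states.
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.97, 0.01, 0.01, 0.01]

for name, p in [("uniform", uniform), ("skewed", skewed)]:
    print(name, cardinality(p), entropy(p), expected_cardinality(p, n=10))
```

Both distributions have the same cardinality, but the skewed one has lower entropy and a lower expected cardinality at n = 10, because its three rare states are unlikely to appear in so few draws. As n grows, expected cardinality approaches standard cardinality.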