We consider the problem of evaluating representations of data for use in solving a downstream task. We propose to measure the quality of a representation by the complexity of learning a predictor on top of the representation that achieves low loss on a task of interest, and introduce two methods, surplus description length (SDL) and $\varepsilon$ sample complexity ($\varepsilon$SC). In contrast to prior methods, which measure the amount of information about the optimal predictor that is present in a specific amount of data, our methods measure the amount of information needed from the data to recover an approximation of the optimal predictor up to a specified tolerance. We present a framework to compare these methods based on plotting the validation loss versus evaluation dataset size (the "loss-data" curve). Existing measures, such as mutual information and minimum description length probes, correspond to slices and integrals along the data axis of the loss-data curve, while ours correspond to slices and integrals along the loss axis. We provide experiments on real data to compare the behavior of each of these methods over datasets of varying size, along with a high-performance open-source library for representation evaluation at https://github.com/willwhitney/reprieve.
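To make the loss-axis view concrete, the following is a minimal, hypothetical sketch (plain NumPy, not the reprieve library's API) of how $\varepsilon$SC and SDL could be read off an empirical loss-data curve: $\varepsilon$SC as the smallest dataset size whose loss falls below the tolerance $\varepsilon$, and SDL as the area of the curve above the horizontal line at $\varepsilon$. The function names, the trapezoidal approximation of the surplus-loss sum over dataset sizes, and the example numbers are illustrative assumptions.

```python
# Sketch: estimating epsilon sample complexity (eSC) and surplus description
# length (SDL) from an empirical loss-data curve, i.e. pairs of
# (evaluation dataset size n, validation loss L(n)). Not the reprieve API.
import numpy as np


def epsilon_sample_complexity(ns, losses, eps):
    """Smallest dataset size whose validation loss is at most eps
    (a slice of the loss-data curve along the loss axis)."""
    ns, losses = np.asarray(ns), np.asarray(losses, dtype=float)
    hits = np.nonzero(losses <= eps)[0]
    return int(ns[hits[0]]) if hits.size else float("inf")  # inf: tolerance never reached


def surplus_description_length(ns, losses, eps):
    """Area between the loss-data curve and the line at eps
    (an integral of the curve along the loss axis), approximated here
    by a trapezoid rule over the measured dataset sizes."""
    ns, losses = np.asarray(ns, dtype=float), np.asarray(losses, dtype=float)
    surplus = np.maximum(losses - eps, 0.0)  # surplus loss beyond the tolerance
    return float(np.sum(0.5 * (surplus[1:] + surplus[:-1]) * np.diff(ns)))


# Hypothetical usage with a made-up loss-data curve:
ns = [10, 100, 1000, 10000]
losses = [2.0, 1.1, 0.6, 0.45]
print(epsilon_sample_complexity(ns, losses, eps=0.5))   # -> 10000
print(surplus_description_length(ns, losses, eps=0.5))  # area above the eps line
```

Under this reading, a mutual-information or MDL-style probe would instead fix a dataset size (or integrate over sizes) and report the loss there, i.e. operate along the data axis of the same curve.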