Many learning tasks in machine learning can be viewed as taking a gradient step to minimize the average loss of a batch of examples at each training iteration. When the data are noisy, this uniform treatment of examples can lead to overfitting to noisy examples with larger loss values and thus to poor generalization. Inspired by the expert setting in online learning, we present a flexible approach to learning from noisy examples. Specifically, we treat each training example as an expert and maintain a distribution over all examples. We alternate between updating the parameters of the model using gradient descent and updating the example weights using the exponentiated gradient update. Unlike other related methods, our approach handles a general class of loss functions and can be applied to a wide range of noise types and applications. We demonstrate the efficacy of our approach in multiple learning settings, namely noisy principal component analysis and a variety of noisy classification problems.
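The alternating scheme described above can be sketched as follows. This is a minimal illustration, not the paper's exact algorithm: it uses weighted least squares as a stand-in task, and the step sizes `lr` and `eta` are illustrative values. The model parameters take a gradient step on the weighted loss, and the example-weight distribution takes an exponentiated gradient step, which multiplicatively downweights high-loss (likely noisy) examples.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 5
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = X @ theta_true
y[:10] += rng.normal(scale=10.0, size=10)  # corrupt 10 examples with label noise

theta = np.zeros(d)
w = np.full(n, 1.0 / n)    # distribution over examples (one "expert" each)
lr, eta = 0.05, 0.1        # illustrative step sizes, not tuned values

for _ in range(500):
    residual = X @ theta - y
    losses = 0.5 * residual ** 2              # per-example squared loss
    # Gradient descent step on theta for the weighted loss sum_i w_i * loss_i.
    theta -= lr * (X.T @ (w * residual))
    # Exponentiated gradient step on w: the gradient of the weighted loss
    # w.r.t. w_i is loss_i, so high-loss examples are downweighted.
    w = w * np.exp(-eta * losses)
    w /= w.sum()                              # project back onto the simplex

noisy_weight = w[:10].sum()  # total weight remaining on the noisy examples
```

After training, `noisy_weight` is close to zero: the corrupted examples incur persistently large losses, so the exponentiated gradient update drives their weight toward zero while the model is fit predominantly on the clean examples.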