Gradient Importance Learning for Incomplete Observations

Though recent works have developed methods that can generate estimates (or imputations) of the missing entries in a dataset to facilitate downstream analysis, most depend on assumptions that may not align with real-world applications and could suffer from poor performance in subsequent tasks such as classification. This is particularly true if the data have large missingness rates or a small sample size. More importantly, the imputation error could be propagated into the prediction step that follows, which may constrain the capabilities of the prediction model. In this work, we introduce the gradient importance learning (GIL) method to train multilayer perceptrons (MLPs) and long short-term memories (LSTMs) to directly perform inference from inputs containing missing values without imputation. Specifically, we employ reinforcement learning (RL) to adjust the gradients used to train these models via back-propagation. This allows the model to exploit the underlying information behind missingness patterns. We test the approach on real-world time-series (i.e., MIMIC-III), tabular data obtained from an eye clinic, and a standard dataset (i.e., MNIST), where our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
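The abstract does not spell out how the gradients are adjusted, so the following is only a minimal sketch under stated assumptions: missing inputs are zero-filled rather than imputed, and a hypothetical `importance` matrix rescales the first-layer weight gradient elementwise before the update. The abstract says RL is used to adjust the gradients; here the adjustment is simply passed in as an argument. The names `sgd_step_with_importance` and `importance` are illustrative and are not taken from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def sgd_step_with_importance(x, y, params, importance, lr=0.1):
    """One SGD step on a one-hidden-layer MLP with cross-entropy loss.

    `importance` (an assumption, same shape as W1) rescales the first-layer
    weight gradient elementwise; all other gradients are used unchanged.
    """
    W1, b1, W2, b2 = params
    x = np.nan_to_num(x, nan=0.0)  # zero-fill missing entries, no imputation model

    # Forward pass
    h = np.tanh(x @ W1 + b1)
    p = softmax(h @ W2 + b2)

    # Backward pass for mean cross-entropy
    n = x.shape[0]
    d_logits = p.copy()
    d_logits[np.arange(n), y] -= 1.0
    d_logits /= n

    dW2 = h.T @ d_logits
    db2 = d_logits.sum(axis=0)
    dz = (d_logits @ W2.T) * (1.0 - h ** 2)  # derivative of tanh
    dW1 = x.T @ dz
    db1 = dz.sum(axis=0)

    # Only the first-layer weight gradient is reweighted
    return (W1 - lr * importance * dW1,
            b1 - lr * db1,
            W2 - lr * dW2,
            b2 - lr * db2)
```

With `importance` set to all ones this reduces to plain SGD, and with all zeros the first-layer weights stay frozen. A learned policy would sit in between, choosing a different weight for each entry.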


Related content

Deep feedforward networks


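A deep feedforward network, also called a feedforward neural network or multilayer perceptron (MLP), is the quintessential deep learning model. The goal of a feedforward network is to approximate some function f^*. For example, for a classifier, y = f^*(x) maps an input x to a category y. A feedforward network defines a mapping y = f(x; θ) and learns the value of the parameters θ that gives the best function approximation.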
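As a minimal illustration of the mapping y = f(x; θ), the sketch below builds a one-hidden-layer classifier in NumPy. The layer sizes, the ReLU activation, and the names `init_theta` and `f` are assumptions chosen for illustration; the parameters here are random rather than learned.

```python
import numpy as np

def init_theta(n_in, n_hidden, n_classes, seed=0):
    """Randomly initialise the parameters theta of a one-hidden-layer MLP."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_classes)),
        "b2": np.zeros(n_classes),
    }

def f(x, theta):
    """Map each input row x to a predicted category y = f(x; theta)."""
    h = np.maximum(0.0, x @ theta["W1"] + theta["b1"])  # hidden layer (ReLU)
    logits = h @ theta["W2"] + theta["b2"]              # one score per class
    return logits.argmax(axis=1)                        # highest-scoring class

theta = init_theta(n_in=4, n_hidden=8, n_classes=3)
x = np.random.default_rng(1).normal(size=(5, 4))
y = f(x, theta)  # array of 5 class indices, each in {0, 1, 2}
```

Training would then adjust `theta` so that `f(x, theta)` agrees with the target function on the data as closely as possible.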
