A Missing Value Filling Model Based on Feature Fusion Enhanced Autoencoder

With the advent of the big data era, data quality is becoming increasingly crucial. Missing values are one of the primary data quality issues, so developing effective imputation models is a key topic in the research community. A major recent research direction is to fill missing values with neural network models such as self-organizing maps or autoencoders. However, these classical methods can hardly discover correlation features and common features among data attributes simultaneously. In particular, classical autoencoders often learn invalid constant mappings, which dramatically hurts filling performance. To solve these problems, we propose and develop a missing-value-filling model based on a feature-fusion-enhanced autoencoder. We first design a hidden layer consisting of de-tracking neurons and radial basis function neurons and incorporate it into an autoencoder, which enhances the ability to learn correlated features and common features. In addition, we develop a missing value filling strategy based on dynamic clustering (MVDC) that is incorporated into an iterative optimization process. This design enhances the multi-dimensional feature fusion ability and thus improves dynamic collaborative missing-value-filling performance. The effectiveness of our model is validated by experimental comparisons with many missing-value-filling methods on seven datasets with different missing rates.
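
To make the idea of iterative, clustering-based filling concrete, here is a minimal Python sketch. It is not the authors' MVDC strategy and contains no autoencoder: it substitutes plain k-means style clustering for the model. The function names (`column_means`, `nearest`, `cluster_fill`), the parameters `k` and `n_iter`, and the toy data are all assumptions made for illustration. Missing entries start at the column mean and are then repeatedly overwritten with their cluster centroid's value, while the clusters are recomputed from the current fill on every pass.

```python
def column_means(rows, mask):
    """Mean of the observed entries in each column (mask[i][j] is True if observed)."""
    means = []
    for j in range(len(rows[0])):
        observed = [rows[i][j] for i in range(len(rows)) if mask[i][j]]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def nearest(row, centroids):
    """Index of the centroid closest to row (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, c)) for c in centroids]
    return dists.index(min(dists))


def cluster_fill(rows, k=2, n_iter=10):
    """Fill None entries in rows by alternating clustering and refilling.

    Hypothetical stand-in for a dynamic-clustering fill; requires k <= len(rows).
    """
    mask = [[v is not None for v in r] for r in rows]
    means = column_means(rows, mask)
    # Initial guess: every missing entry takes its column mean.
    filled = [[v if m else means[j] for j, (v, m) in enumerate(zip(r, mr))]
              for r, mr in zip(rows, mask)]
    # Deterministic initialization: the first k rows act as centroids.
    centroids = [list(filled[i]) for i in range(k)]
    for _ in range(n_iter):
        # Re-cluster using the current fill (this is the "dynamic" part).
        labels = [nearest(r, centroids) for r in filled]
        for c in range(k):
            members = [filled[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        # Overwrite only the missing entries; observed values never change.
        for i, r in enumerate(filled):
            for j in range(len(r)):
                if not mask[i][j]:
                    r[j] = centroids[labels[i]][j]
    return filled


toy = [[1.0, 1.0], [9.0, 9.0], [1.2, None], [8.8, None], [1.1, 0.9], [9.1, 9.2]]
for row in cluster_fill(toy):
    print([round(v, 3) for v in row])
```

On this toy data the two missing entries drift away from the global column mean toward the values typical of their own cluster, which is the behavior a cluster-aware fill is meant to provide.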

Related content

Autoencoder

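An autoencoder is a type of artificial neural network used to learn efficient data encodings in an unsupervised manner. Its aim is to learn a representation (encoding) of a set of data, typically for dimensionality reduction, by training the network to ignore signal "noise". Along with the reduction side, a reconstruction side is learned, in which the autoencoder tries to generate, from the reduced encoding, a representation as close as possible to its original input, hence the name. Several variants of the basic model exist, which aim to force the learned representation of the input to have useful properties. Autoencoders are used effectively to solve many applied problems, from face recognition to capturing the semantic meaning of words.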
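
The encode-then-reconstruct idea can be illustrated with a very small sketch. The code below is an assumption-laden toy, not a model from the paper: a linear autoencoder with two inputs, a single code unit, no bias terms and no nonlinearity, trained by plain gradient descent on squared reconstruction error. The names `reconstruct`, `mean_loss` and `train`, the learning rate, and the data (points on the line y = 2x) are all hypothetical.

```python
def reconstruct(x, w, v):
    """Encode x to a single code z = w . x, then decode to x_hat = v * z."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return z, [vi * z for vi in v]


def mean_loss(data, w, v):
    """Mean squared reconstruction error over the dataset."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x, w, v)
        total += sum((h - xi) ** 2 for h, xi in zip(x_hat, x))
    return total / len(data)


def train(data, w, v, lr=0.05, steps=200):
    """Full-batch gradient descent on encoder weights w and decoder weights v."""
    w, v = list(w), list(v)
    n = len(data)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_v = [0.0] * len(v)
        for x in data:
            z, x_hat = reconstruct(x, w, v)
            err = [h - xi for h, xi in zip(x_hat, x)]
            # Error propagated back through the decoder to the code unit.
            back = sum(e * vi for e, vi in zip(err, v))
            for k in range(len(v)):
                grad_v[k] += 2 * err[k] * z / n
            for k in range(len(w)):
                grad_w[k] += 2 * back * x[k] / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        v = [vi - lr * g for vi, g in zip(v, grad_v)]
    return w, v


# Two-dimensional points that lie on a line, so one code unit can describe them.
data = [[t, 2.0 * t] for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
w0, v0 = [0.1, 0.1], [0.1, 0.1]
w, v = train(data, w0, v0)
print("loss before:", mean_loss(data, w0, v0))
print("loss after: ", mean_loss(data, w, v))
```

Because the data are effectively one-dimensional, a single code value is enough to reconstruct both coordinates, which is the dimensionality-reduction behavior described above.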
