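Feature selection, also called feature subset selection (FSS) or attribute selection, is the process of choosing N features from an existing set of M features so that a specific metric of the system is optimized. It picks the most effective of the original features in order to reduce the dimensionality of the dataset. It is an important means of improving the performance of learning algorithms and a key data preprocessing step in pattern recognition. For a learning algorithm, good training samples are key to training a model.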
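As a minimal sketch of "choose N of M features to optimize a criterion", the example below ranks each feature by the absolute value of its Pearson correlation with the target and keeps the top N. The criterion, the function names `pearson` and `select_top_n`, and the toy data are assumptions made for illustration; this is one simple filter-style approach among many, not a method prescribed by the text above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_top_n(rows, target, n):
    """Return the sorted indices of the n features with the largest |correlation|."""
    m = len(rows[0])  # M: number of available features
    scores = []
    for j in range(m):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    # Highest score first; ties broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:n])


# Toy data (hypothetical): 4 samples, M = 3 features.
# Feature 0 tracks the target, feature 1 is constant, feature 2 is noisy.
rows = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.7],
]
target = [2.0, 4.0, 6.0, 8.0]

selected = select_top_n(rows, target, 2)  # keep N = 2 features
print(selected)
```

Scoring each feature independently keeps the cost linear in M, but it ignores redundancy between features; wrapper or embedded methods trade more computation for a criterion tied to the downstream model.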

VIP content

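Summary: Transfer learning, a major branch of machine learning, has made great progress. This handbook concisely introduces the concepts and basic methods of transfer learning and describes several representative methods for the domain adaptation problem. It closes with a brief discussion of possible future directions for transfer learning. The handbook is written to help beginners in transfer learning get started quickly and master the basic methods, laying a solid foundation for their own research and applications. Its structure is simple: what (an introduction to transfer learning); why (why transfer learning should be used and why it works); and how (how to perform transfer, that is, transfer learning methods). The "what" and "why" parts settle the conceptual questions on which everything else depends; the "how" part is our focus and takes up most of the text. To make things as easy as possible for beginners, we also wrote a hands-on chapter that shares implementation code and practical lessons directly.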

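About the author: Wang Jindong (王晋东) is currently pursuing a PhD at the Institute of Computing Technology, Chinese Academy of Sciences. His research interests include transfer learning and machine learning. He has published several papers at leading international conferences such as ICDM and UbiComp. He is also a machine learning contributor on knowledge-sharing communities such as Zhihu (Zhihu username: 王晋东不在家). He has started several machine-learning resource repositories on GitHub, founded a machine learning group with participants from more than 120 universities and research institutes, and is keen on sharing knowledge. Homepage: http://jd92.wang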

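Contents:

  • Basic concepts of transfer learning
  • Research areas of transfer learning
  • Applications of transfer learning
  • Background knowledge
  • Basic methods of transfer learning
  • First class of methods: data distribution adaptation
  • Second class of methods: feature selection
  • Third class of methods: subspace learning
  • Deep transfer learning
  • Hands-on practice
  • Frontiers of transfer learning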

Latest content

Autonomous systems generate huge amounts of multimodal data that are collected and processed at the edge to enable AI-based services. The collected datasets are pre-processed to extract informative attributes, called features, which are fed to AI algorithms. Because some cyber-physical systems (CPS), such as autonomous vehicles, have limited computational and communication resources, selecting the subset of relevant features from a dataset is of the utmost importance, both to improve the results achieved by learning methods and to reduce computation and communication costs. Feature selection is the candidate approach: it assumes that the data contain a certain number of redundant or irrelevant attributes that can be eliminated. In this work, we propose, for the first time, a federated feature selection method suitable for execution in a distributed manner. The quality of our method is confirmed by promising results on two different datasets. In particular, our results show that a fleet of autonomous vehicles reaches a consensus on the optimal set of features, which they exploit to reduce data transmission by up to 99% with negligible information loss.
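The abstract does not say how the vehicles reach consensus, so the following is only an illustrative sketch under a stated assumption: each client selects features locally and a coordinator keeps the features chosen by a majority of clients. The function `consensus_features`, the `min_share` parameter, and the sample selections are hypothetical and should not be read as the paper's algorithm.

```python
from collections import Counter


def consensus_features(local_selections, min_share=0.5):
    """Keep the features selected by more than `min_share` of the clients.

    Assumption for illustration only: consensus is a simple majority vote
    over locally selected feature indices. The paper's actual aggregation
    rule is not described in the abstract.
    """
    votes = Counter()
    for selection in local_selections:
        votes.update(set(selection))  # each client votes at most once per feature
    threshold = min_share * len(local_selections)
    return sorted(f for f, count in votes.items() if count > threshold)


# Hypothetical feature indices selected locally by three vehicles.
local_selections = [[0, 2, 5], [0, 2, 7], [0, 3, 5]]

shared = consensus_features(local_selections)
print(shared)
```

Only the selected indices are exchanged in this sketch, not the raw data, which is the general motivation for a federated design; any transmission savings depend on the real method and data.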

