Theory and Methods for Sparse Representation of "User Behavior Data" - 专知基金


December 31, 2012


National Natural Science Fund of China

National Natural Science Foundation of China (NSFC)

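Project title: Theory and Methods for Sparse Representation of "User Behavior Data"

Project number: No. 61273294

Project type: General Program

Approval year: 2013

Discipline: Automation technology, computer technology

Project author: Han Suqing (韩素青)

Institution: Taiyuan Normal University (太原师范学院)

Funding: CNY 460,000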

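Abstract (translated from the Chinese): Sparse representation is an important topic in machine learning research, and the analysis and processing of "user behavior data" that carries user needs or preferences has been one of the main tasks put forward by network service providers in recent years. In statistical machine learning, L1 regularization is the main route to a sparse representation of data. For "user behavior data", however, applying L1 regularization inevitably means treating symbolic data, unreasonably, as continuous data. In fact, for a specific problem, if a suitable discernibility relation over samples can be defined on the symbolic data set, a sparse representation at the feature level can be obtained from the intrinsic structure of the data, along with a sparse representation at the sample level; this problem, however, is no longer a task for L1 regularization. Probabilistic graphical model theory has considerable strengths in sparse data representation and in learning from sparse data. This project therefore attempts to draw on that theory and, building on symbolic machine learning methods, to develop a representation theory and algorithms for sparsifying user behavior data. The aim is, on the one hand, to avoid the unreasonable "conversion of symbolic data to real numbers" and, on the other, to bypass relatively time-consuming computations such as least squares, so that both the process and the result of sparsification become interpretable.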

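Keywords (translated from the Chinese): user behavior data; user-demand-oriented attribute reduction; multi-attribute decision making; cluster analysis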

English abstract: Sparse representation is a significant research topic in machine learning. In recent years, network service providers have made the analysis and processing of user behavior data, which reflects users' demands and preferences, an important task. In statistical machine learning, L1 regularization is a crucial method for obtaining sparse representations of data. However, applying L1 regularization to user behavior data requires treating symbolic data as continuous, which is unreasonable and unnecessary. In fact, for a specific problem, as long as a corresponding discernibility relation over samples is defined on the symbolic data set, a sparse representation at the feature level can be obtained from the internal structure of the data. A sparse representation at the sample level can also be obtained, although this falls outside the scope of L1 regularization. This project develops a representation theory and algorithms for sparsifying user behavior data, making both the sparsification process and its results interpretable. The research results are also expected to promote research and development in this area.

English keywords: the data of user behavior; user-oriented attribute reduct; multi-attribute decision making; clustering analysis
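The abstract's central claim, that a discernibility relation over samples can yield a feature-level sparse representation of symbolic data without converting symbols to real numbers, can be illustrated with a small sketch. The code below is not the project's algorithm, which the abstract does not specify. It is a common greedy heuristic in the style of rough-set attribute reduction, and the attribute names, labels, and function names are all hypothetical.

```python
from itertools import combinations


def discernibility_pairs(rows, labels):
    """For each pair of samples with different labels, collect the set of
    attributes on which the two samples take different symbolic values."""
    pairs = []
    for i, j in combinations(range(len(rows)), 2):
        if labels[i] != labels[j]:
            diff = {a for a in rows[i] if rows[i][a] != rows[j][a]}
            if diff:  # identical samples with different labels cannot be told apart
                pairs.append(diff)
    return pairs


def greedy_reduct(rows, labels):
    """Greedily pick attributes until every discernible pair is covered.
    This is a heuristic: it returns a small attribute subset, not
    necessarily a minimum one."""
    remaining = discernibility_pairs(rows, labels)
    selected = []
    while remaining:
        counts = {}
        for diff in remaining:
            for a in diff:
                counts[a] = counts.get(a, 0) + 1
        # Sorting first makes ties resolve alphabetically, so the result is deterministic.
        best = max(sorted(counts), key=counts.get)
        selected.append(best)
        remaining = [d for d in remaining if best not in d]
    return selected


# Hypothetical symbolic user-behavior records (assumed data, for illustration only).
rows = [
    {"device": "mobile", "channel": "search", "time": "night"},
    {"device": "mobile", "channel": "ad", "time": "day"},
    {"device": "desktop", "channel": "search", "time": "day"},
    {"device": "desktop", "channel": "ad", "time": "night"},
]
labels = ["buy", "skip", "buy", "skip"]

reduct = greedy_reduct(rows, labels)
print(reduct)  # ['channel']
```

In this toy data a single symbolic attribute separates every pair of samples with different labels, so the other two attributes can be dropped. The selection compares symbols only for equality, so no numeric encoding or least-squares fit is involved, and the result can be read directly as "which attributes are needed to tell the outcomes apart".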




