End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding

Fine-grained action segmentation and recognition is an important yet challenging task. Given a long, untrimmed sequence of kinematic data, the task is to classify the action at each time frame and segment the time series into the correct sequence of actions. In this paper, we propose a novel framework that combines a temporal Conditional Random Field (CRF) model with a powerful frame-level representation based on discriminative sparse coding. We introduce an end-to-end algorithm for jointly learning the weights of the CRF model, which include action classification and action transition costs, as well as an overcomplete dictionary of mid-level action primitives. This results in a CRF model that is driven by sparse coding features obtained using a discriminative dictionary that is shared among different actions and adapted to the task of structured output learning. We evaluate our method on three surgical tasks using kinematic data from the JIGSAWS dataset, as well as on a food preparation task using accelerometer data from the 50 Salads dataset. Our results show that the proposed method performs on par with or better than state-of-the-art methods.
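The abstract describes the model only in words. One plausible way to write it down is sketched below. The notation is an assumption for illustration and is not taken from the abstract: x_t is the kinematic frame at time t, D the overcomplete dictionary, z_t the sparse code, λ a sparsity weight, w_y the classification weights for action y, and T the matrix of action transition scores.

```latex
\begin{aligned}
z_t(D) &= \arg\min_{z}\ \tfrac{1}{2}\lVert x_t - D z\rVert_2^2 + \lambda \lVert z\rVert_1, \\
S(x, y; W, T, D) &= \sum_{t=1}^{N} w_{y_t}^{\top} z_t(D) + \sum_{t=2}^{N} T_{y_{t-1},\,y_t}, \\
\hat{y} &= \arg\max_{y}\ S(x, y; W, T, D).
\end{aligned}
```

Under these assumptions, "end-to-end" learning would mean fitting W, T, and D together against the structured prediction objective, rather than learning D first by pure reconstruction and training the CRF on fixed features afterwards.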


Related content

Conditional Random Fields


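Conditional random fields (CRFs) are a discriminative probabilistic model and a type of random field, commonly used to label or parse sequential data such as natural-language text or biological sequences. Like a Markov random field, a CRF is an undirected graphical model: the vertices represent random variables, and the edges represent dependencies between them. In a CRF, the distribution of the random variable Y is conditional, given observations of the random variable X. In principle the graph layout can be arbitrary, but the most common layout is a linear chain, for which efficient algorithms exist for training, inference, and decoding.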
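Decoding on a linear chain can be illustrated with the Viterbi algorithm. The following is a minimal sketch, not the implementation from the paper above: it assumes the model has already produced a per-frame score for each label (`unary`) and a score for each label-to-label transition (`transition`). The function names `viterbi_decode` and `to_segments` are hypothetical, chosen for this example.

```python
def viterbi_decode(unary, transition):
    """Return the highest-scoring label sequence of a linear-chain model.

    unary[t][k]      -- score of assigning label k to frame t (assumed given)
    transition[j][k] -- score of moving from label j to label k (assumed given)
    """
    if not unary:
        return [], 0.0
    num_labels = len(unary[0])
    best = list(unary[0])   # best[k]: best score of any path ending in label k
    backpointers = []       # one list of predecessor labels per frame after the first
    for t in range(1, len(unary)):
        new_best, pointers = [], []
        for k in range(num_labels):
            # Best predecessor label for arriving at label k in frame t.
            prev = max(range(num_labels), key=lambda j: best[j] + transition[j][k])
            pointers.append(prev)
            new_best.append(best[prev] + transition[prev][k] + unary[t][k])
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    last = max(range(num_labels), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]


def to_segments(labels):
    """Collapse per-frame labels into (label, first_frame, last_frame) segments."""
    segments = []
    for t, label in enumerate(labels):
        if segments and segments[-1][0] == label:
            segments[-1][2] = t
        else:
            segments.append([label, t, t])
    return [tuple(s) for s in segments]


# Toy example: frame 1 weakly prefers label 1, but switching labels costs 1.0.
unary = [[2.0, 0.0], [0.0, 0.5], [2.0, 0.0], [0.0, 2.0], [0.0, 2.0]]
transition = [[0.0, -1.0], [-1.0, 0.0]]
labels, score = viterbi_decode(unary, transition)
print(labels, score, to_segments(labels))
```

In this toy input, choosing the best label independently per frame would give 0, 1, 0, 1, 1, whereas the chain model smooths over the noisy second frame because two extra label switches cost more than the 0.5 gained. The running time is linear in the number of frames and quadratic in the number of labels, which is why the chain layout is efficient to decode.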
