This notebook paper describes our system for the untrimmed classification task in the ActivityNet challenge 2016. We investigate multiple state-of-the-art approaches for action recognition in long, untrimmed videos. We exploit hand-crafted motion boundary histogram (MBH) features as well as feature activations from deep networks such as VGG16, GoogLeNet, and C3D. These features are fed separately to linear, one-versus-rest support vector machine classifiers to produce confidence scores for each action class. These predictions are then fused, together with the softmax scores of the recent ultra-deep ResNet-101, by weighted averaging.
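The sketch below illustrates the late-fusion pipeline summarized above: each feature type is scored by a linear one-versus-rest SVM, and the resulting per-class confidence scores are combined with ResNet-101 softmax scores by weighted averaging. This is a minimal, hypothetical example; the feature dimensions, random placeholder data, and fusion weights are illustrative assumptions, not the values used in our submission.

```python
# Minimal sketch of per-stream SVM scoring followed by weighted-average fusion.
# All data, dimensions, and weights below are placeholders for illustration.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_train, n_test, n_classes = 200, 50, 10                # toy sizes
labels = rng.integers(0, n_classes, n_train)

# Stand-ins for pre-extracted features (MBH, VGG16, GoogLeNet, C3D activations).
streams = {name: (rng.normal(size=(n_train, dim)), rng.normal(size=(n_test, dim)))
           for name, dim in [("mbh", 256), ("vgg16", 4096),
                             ("googlenet", 1024), ("c3d", 4096)]}

def svm_scores(train_x, train_y, test_x, C=1.0):
    """Linear one-versus-rest SVM confidence scores for each action class."""
    clf = LinearSVC(C=C)                 # one-vs-rest is LinearSVC's default scheme
    clf.fit(train_x, train_y)
    return clf.decision_function(test_x)                 # shape: (n_test, n_classes)

scores = {name: svm_scores(tr, labels, te) for name, (tr, te) in streams.items()}

# ResNet-101 softmax scores for the same test videos (assumed precomputed;
# random placeholder here).
scores["resnet101"] = rng.dirichlet(np.ones(n_classes), size=n_test)

# Weighted averaging of the per-stream scores; the weights are illustrative only.
weights = {"mbh": 1.0, "vgg16": 1.0, "googlenet": 1.0, "c3d": 1.0, "resnet101": 2.0}
fused = sum(w * scores[k] for k, w in weights.items()) / sum(weights.values())
predictions = fused.argmax(axis=1)                        # predicted class per video
```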