We introduce a novel approach for temporal activity segmentation with timestamp supervision. Our main contribution is a graph convolutional network, which is learned in an end-to-end manner to exploit both frame features and connections between neighboring frames to generate dense framewise labels from sparse timestamp labels. The generated dense framewise labels can then be used to train the segmentation model. In addition, we propose a framework for alternating learning of both the segmentation model and the graph convolutional model, which first initializes and then iteratively refines the learned models. Detailed experiments on four public datasets, including 50 Salads, GTEA, Breakfast, and Desktop Assembly, show that our method is superior to the multi-layer perceptron baseline, while performing on par with or better than the state of the art in temporal activity segmentation with timestamp supervision.
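To make the idea concrete, the following is a minimal sketch of a graph convolutional network over a chain graph of video frames that propagates sparse timestamp labels (one labeled frame per action segment) into dense framewise predictions. This is an illustrative example only, assuming PyTorch; the class and function names (`FrameGCN`, `chain_adjacency`, `timestamp_loss`) are hypothetical and the architecture is not the authors' exact model.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrameGCN(nn.Module):
    """Graph convolution over a chain graph of frames (hypothetical sketch)."""

    def __init__(self, in_dim, hid_dim, num_classes, num_layers=2):
        super().__init__()
        dims = [in_dim] + [hid_dim] * (num_layers - 1)
        self.layers = nn.ModuleList([nn.Linear(d, hid_dim) for d in dims])
        self.classifier = nn.Linear(hid_dim, num_classes)

    def forward(self, feats, adj):
        # feats: (T, in_dim) frame features; adj: (T, T) normalized adjacency.
        h = feats
        for layer in self.layers:
            h = F.relu(layer(adj @ h))   # aggregate features of neighboring frames
        return self.classifier(h)        # framewise class logits


def chain_adjacency(num_frames):
    """Row-normalized adjacency connecting each frame to its temporal neighbors."""
    a = torch.eye(num_frames)
    idx = torch.arange(num_frames - 1)
    a[idx, idx + 1] = 1.0
    a[idx + 1, idx] = 1.0
    return a / a.sum(dim=1, keepdim=True)


def timestamp_loss(logits, timestamps):
    """Cross-entropy on the sparsely labeled frames only.

    timestamps: list of (frame_index, class_id) pairs, one per action segment.
    """
    idx = torch.tensor([t for t, _ in timestamps])
    lbl = torch.tensor([c for _, c in timestamps])
    return F.cross_entropy(logits[idx], lbl)


# Example usage: predict dense framewise labels for one video.
feats = torch.randn(200, 2048)                    # T=200 frames of pretrained features
model = FrameGCN(in_dim=2048, hid_dim=64, num_classes=19)
adj = chain_adjacency(feats.shape[0])
logits = model(feats, adj)
dense_labels = logits.argmax(dim=1)               # pseudo-labels for the segmentation model
```

In the alternating scheme described above, such dense pseudo-labels produced by the graph model would then supervise the segmentation model, and the two models would be refined iteratively; the details of that procedure are given in the paper itself, not in this sketch.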