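To address spatio-temporal variation across video domains, Baidu proposed Self-Supervised Temporal Domain Adaptation (SSTDA). It uses two self-supervised auxiliary tasks, domain prediction and sequential domain prediction, to jointly align cross-domain feature spaces that embed both local and global temporal dynamics, which improves video action segmentation. The experimental results show that SSTDA outperforms other domain adaptation (DA) methods because it aligns temporal dynamics more effectively. Experiments on three public datasets demonstrate the effectiveness of the proposed method.

Main references:
[1] Yazan Abu Farha and Jurgen Gall. MS-TCN: Multi-stage temporal convolutional network for action segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[2] Dejing Xu, Jun Xiao, Zhou Zhao, Jian Shao, Di Xie, and Yueting Zhuang. Self-supervised spatiotemporal learning via video clip order prediction. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
[3] Mingsheng Long, Han Zhu, Jianmin Wang, and Michael I. Jordan. Deep transfer learning with joint adaptation networks. In International Conference on Machine Learning (ICML), 2017.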