Tracking humans who interact with other subjects or with the environment remains an unsolved problem in visual tracking, because the visibility of the humans of interest in a video is unknown and may vary over time. In particular, state-of-the-art human trackers still struggle to recover complete human trajectories in crowded scenes with frequent human interactions. In this work, we treat the visibility status of a subject as a fluent variable whose changes are mostly attributable to the subject's interactions with its surroundings, e.g., crossing behind another object, entering a building, or getting into a vehicle. We introduce a Causal And-Or Graph (C-AOG) to represent the cause-effect relations between an object's visibility fluent and its activities, and develop a probabilistic graphical model to jointly reason about visibility fluent changes (e.g., from visible to invisible) and track humans in videos. We formulate this joint task as an iterative search for a feasible causal graph structure, which admits fast search algorithms such as dynamic programming. We apply the proposed method to challenging video sequences to evaluate its ability to estimate visibility fluent changes of subjects and to track subjects of interest over time. Comparative results demonstrate that our method outperforms alternative trackers and can recover complete trajectories of humans in complicated scenarios with frequent human interactions.
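To make the dynamic programming remark concrete, the following is a minimal sketch of how a visibility fluent could be decoded over time with a generic Viterbi recursion. It is not the C-AOG inference procedure of this work. The three state labels, the function names `viterbi` and `sticky_transitions`, and all probabilities are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical visibility fluent values (illustrative labels, not from the paper).
STATES = ("visible", "occluded", "contained")


def sticky_transitions(stay=0.8):
    """Log transition table in which a fluent tends to keep its value.

    `stay` is an assumed self-transition probability; the remaining mass is
    split evenly among the other states.
    """
    move = (1.0 - stay) / (len(STATES) - 1)
    return {p: {s: math.log(stay if p == s else move) for s in STATES}
            for p in STATES}


def viterbi(log_emissions, log_trans, log_prior):
    """Return the highest-scoring fluent sequence for one subject.

    log_emissions: one dict per frame mapping state -> log-likelihood.
    log_trans:     dict of dicts, log_trans[prev][cur] = log transition prob.
    log_prior:     dict mapping state -> log prior for the first frame.
    """
    # best[s]: best log-score of any path that ends in state s at this frame.
    best = {s: log_prior[s] + log_emissions[0][s] for s in STATES}
    back = []  # back[t][s]: best predecessor of state s at frame t + 1
    for frame in log_emissions[1:]:
        new_best, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] + log_trans[p][s])
            new_best[s] = best[prev] + log_trans[prev][s] + frame[s]
            pointers[s] = prev
        best = new_best
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With per-frame likelihoods taken, for example, from a detector score, `viterbi` returns one fluent label per frame. The sticky transition table discourages the label from flipping on a single ambiguous frame, which is the kind of temporal consistency a joint tracking and fluent-reasoning model needs.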