Temporal action proposal generation is a challenging and promising task that aims to locate temporal regions in real-world videos where actions or events may occur. Current bottom-up proposal generation methods can generate proposals with precise boundaries, but cannot efficiently generate sufficiently reliable confidence scores for retrieving proposals. To address these difficulties, we introduce the Boundary-Matching (BM) mechanism to evaluate confidence scores of densely distributed proposals. The BM mechanism denotes a proposal as a matching pair of starting and ending boundaries and combines all densely distributed BM pairs into the BM confidence map. Based on the BM mechanism, we propose an effective, efficient, and end-to-end proposal generation method, named Boundary-Matching Network (BMN), which simultaneously generates proposals with precise temporal boundaries and reliable confidence scores. The two branches of BMN are jointly trained in a unified framework. We conduct experiments on two challenging datasets, THUMOS-14 and ActivityNet-1.3, where BMN shows significant performance improvement with remarkable efficiency and generalizability. Further, when combined with an existing action classifier, BMN can achieve state-of-the-art temporal action detection performance.
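To make the idea of a BM confidence map concrete, the following minimal Python sketch enumerates every (start, end) boundary pair on a short snippet sequence and stores one score per pair in a two-dimensional grid. Several things here are assumptions made for illustration and are not stated in the abstract: the grid layout (rows index duration, columns index start position), the function names `bm_confidence_map` and `top_proposal`, and above all the scoring rule. BMN evaluates the confidence of each pair with a trained network; the product of boundary probabilities used below is only a stand-in so that the example runs.

```python
def bm_confidence_map(start_prob, end_prob, max_duration):
    """Build a toy BM-style map over all dense (start, end) pairs.

    Hypothetical layout: entry [d - 1][s] describes the proposal that
    starts at snippet s and covers d snippets, i.e. the span [s, s + d).
    """
    num_snippets = len(start_prob)
    conf = [[0.0] * num_snippets for _ in range(max_duration)]
    valid = [[False] * num_snippets for _ in range(max_duration)]
    for d in range(1, max_duration + 1):
        for s in range(num_snippets):
            e = s + d  # exclusive end index of the proposal
            if e > num_snippets:
                continue  # the pair would run past the end of the video
            valid[d - 1][s] = True
            # Placeholder score (an assumption, not BMN's learned confidence):
            # start probability at s times end probability at the last snippet.
            conf[d - 1][s] = start_prob[s] * end_prob[e - 1]
    return conf, valid


def top_proposal(conf, valid):
    """Return (start, end, score) of the highest-scoring valid pair."""
    candidates = [
        (conf[i][s], s, s + i + 1)
        for i in range(len(conf))
        for s in range(len(conf[i]))
        if valid[i][s]
    ]
    score, start, end = max(candidates)
    return start, end, score


if __name__ == "__main__":
    # Made-up boundary probabilities for a 4-snippet video.
    starts = [0.9, 0.1, 0.1, 0.1]
    ends = [0.1, 0.1, 0.8, 0.1]
    conf_map, valid_mask = bm_confidence_map(starts, ends, max_duration=3)
    print(top_proposal(conf_map, valid_mask))
```

The point of the grid is that every candidate proposal has a fixed cell, so all confidence scores can be produced in one pass and then ranked for retrieval, rather than scoring proposals one at a time.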