Social group detection is a crucial component of many robotic applications, including robot navigation and human-robot interaction. To date, a range of model-based techniques has been employed to address this challenge, such as F-formation and trajectory-similarity frameworks. However, these approaches often fail to provide reliable results in crowded and dynamic scenes. Recent advances in this area have focused mainly on learning-based methods, such as deep neural networks that use visual content or human pose. Although visual content-based methods have demonstrated promising performance on large-scale datasets, their computational cost remains a significant barrier to practical use in real-time applications. To address these issues, we propose a simple and efficient framework for social group detection. Our approach examines the impact of motion trajectories on social grouping and employs a novel, reliable, and fast data-driven method. We formulate the individuals in a scene as a graph, where nodes are represented by LSTM-encoded trajectories and edges are defined by the distances between each pair of tracks. Our framework uses a modified graph transformer module together with graph clustering losses to detect social groups. Experiments on the popular JRDB-Act dataset show consistent gains, with relative improvements ranging from 2% to 11%. Furthermore, our framework is significantly faster, with inference up to 12x quicker than state-of-the-art methods under the same computational resources. These results demonstrate that the proposed method is well suited to real-time robotic applications.
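The graph formulation above can be illustrated with a minimal sketch. Note the assumptions: the paper encodes each track with an LSTM and clusters with a learned graph transformer, neither of which is reproduced here; this sketch substitutes a mean-position placeholder encoder, an illustrative distance threshold (`dist_threshold`), and connected components of the thresholded graph as a crude stand-in for the learned clustering.

```python
import math

def encode_trajectory(track):
    # Placeholder encoder: the mean (x, y) position of the track.
    # (The paper uses an LSTM encoding; this is only illustrative.)
    n = len(track)
    return (sum(p[0] for p in track) / n, sum(p[1] for p in track) / n)

def build_graph(tracks, dist_threshold=1.5):
    # Nodes: one per track. Edges: pairs of tracks whose encoded
    # positions lie within dist_threshold of each other (the paper
    # defines edges by pairwise track distances; the threshold value
    # here is an arbitrary illustrative choice).
    nodes = [encode_trajectory(t) for t in tracks]
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) <= dist_threshold:
                edges.append((i, j))
    return nodes, edges

def group_by_components(n, edges):
    # Crude stand-in for the learned clustering step: social groups
    # as connected components of the thresholded graph (union-find).
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy scene: three tracked people, two walking close together.
tracks = [
    [(0.0, 0.0), (0.2, 0.1)],   # person 0
    [(0.5, 0.0), (0.6, 0.2)],   # person 1, near person 0
    [(5.0, 5.0), (5.1, 5.2)],   # person 2, far away
]
nodes, edges = build_graph(tracks)
print(group_by_components(len(tracks), edges))  # -> [[0, 1], [2]]
```

The learned model replaces both the hand-set threshold and the component-based grouping, but the data flow (tracks → node embeddings → pairwise-distance edges → group assignments) is the same.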