This paper studies the problem of monitoring traffic in a road network with a team of aerial robots. The problem is challenging for two main reasons. First, traffic events are stochastic in both time and space. Second, the problem is non-homogeneous: traffic events arrive at different locations of the road network at different rates, so some locations require more frequent visits by the robots than others. To address these issues, we define an uncertainty metric for each location of the road network and formulate a path planning problem in which the aerial robots minimize the network's average uncertainty. We express this problem as a partially observable Markov decision process (POMDP) and propose a distributed and scalable algorithm based on deep reinforcement learning to solve it. We consider two scenarios that differ in the communication mode between the agents (aerial robots) and the traffic management center (TMC). In the first scenario, the agents communicate continuously with the TMC to send and receive real-time information about traffic events, and therefore have global, real-time knowledge of the environment. In the second, more challenging scenario, each aerial robot's observation is partial and limited to its sensing range, and information exchange between the aerial robots and the TMC is restricted to specific time instances. We evaluate the performance of the proposed algorithm in both scenarios on a real road network topology and demonstrate its functionality in a traffic monitoring system.
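The abstract does not define the uncertainty metric, so the following is only a minimal sketch of the idea under stated assumptions, not the paper's formulation. It assumes Poisson event arrivals at each location and takes a location's uncertainty to be the probability that at least one unobserved event has arrived since the last visit. The function names, rates, and visit times are hypothetical.

```python
import math


def location_uncertainty(rate, elapsed):
    """Probability that at least one event arrived within `elapsed` time units.

    Assumes Poisson arrivals with the given rate (an illustrative choice,
    not the metric defined in the paper).
    """
    return 1.0 - math.exp(-rate * elapsed)


def average_uncertainty(rates, last_visit, now):
    """Mean uncertainty over all locations at time `now`."""
    values = [location_uncertainty(r, now - t) for r, t in zip(rates, last_visit)]
    return sum(values) / len(values)


# Hypothetical network with three locations and unequal arrival rates.
rates = [0.1, 0.5, 2.0]
last_visit = [0.0, 0.0, 0.0]
now = 1.0

before = average_uncertainty(rates, last_visit, now)

# A robot visiting the highest-rate location resets its uncertainty to zero.
last_visit[2] = now
after = average_uncertainty(rates, last_visit, now)
```

Under this assumed metric, a location with a higher arrival rate accumulates uncertainty faster, which matches the abstract's observation that some locations need more visits than others.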