Visual Exploration and Energy-aware Path Planning via Reinforcement Learning

Visual exploration and smart data collection via autonomous vehicles are attractive topics in various disciplines. Disturbances such as wind significantly influence both the power consumption of flying robots and the performance of the camera. We propose a reinforcement learning approach that combines the effects of the power consumption and object detection modules to develop a policy for object detection in large areas with limited battery life. The learning model dynamically learns the negative reward of each action based on the drag forces that result from the motion of the flying robot relative to the wind field. The algorithm is implemented in a near-real-world simulation environment for both planar motion and flight at different altitudes. The trained agent often traded off detecting objects with high accuracy against increasing the area covered within its battery life. The developed exploration policy outperformed the complete coverage algorithm by minimizing the traveled path while finding the target objects. The performance of the algorithms under various wind fields was evaluated in planar and 3D motion. During an exploration task with sparsely distributed goals and within a UAV's battery life, the proposed architecture detected more than twice the number of goal objects compared to the coverage path planning algorithm in a moderate wind field. At high wind intensities, the energy-aware algorithm detected four times the number of goal objects compared to its complete coverage counterpart.
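The idea of a drag-dependent negative reward can be illustrated with a minimal sketch. It is not the implementation from this work: the quadratic drag model, every constant, and the function names `drag_force` and `step_reward` are assumptions chosen only for illustration. The sketch penalizes each action by the approximate work done against drag, computed from the robot's velocity relative to the wind, and adds a bonus for each newly detected goal object.

```python
import math

# All constants are illustrative assumptions, not values from the paper.
AIR_DENSITY = 1.225      # kg/m^3, air at sea level
DRAG_COEFF = 1.0         # dimensionless, hypothetical
FRONTAL_AREA = 0.1       # m^2, hypothetical
ENERGY_WEIGHT = 0.01     # reward units per joule, hypothetical
DETECTION_REWARD = 10.0  # reward per newly detected goal, hypothetical


def relative_speed(ground_velocity, wind_velocity):
    """Speed of the robot relative to the surrounding air (m/s)."""
    return math.sqrt(
        sum((v - w) ** 2 for v, w in zip(ground_velocity, wind_velocity))
    )


def drag_force(ground_velocity, wind_velocity):
    """Magnitude of an assumed quadratic drag force (N)."""
    speed = relative_speed(ground_velocity, wind_velocity)
    return 0.5 * AIR_DENSITY * DRAG_COEFF * FRONTAL_AREA * speed ** 2


def step_reward(ground_velocity, wind_velocity, duration, new_detections):
    """Detection bonus minus a penalty proportional to work against drag."""
    speed = relative_speed(ground_velocity, wind_velocity)
    # Approximate energy cost: drag force times airspeed times action duration.
    work = drag_force(ground_velocity, wind_velocity) * speed * duration
    return DETECTION_REWARD * new_detections - ENERGY_WEIGHT * work
```

Under these assumptions, flying at 5 m/s into a 5 m/s headwind is penalized while drifting with a 5 m/s tailwind costs nothing, so an agent trained on such a reward would be pushed toward wind-favorable paths unless a detection justifies the extra energy.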
