Learning to Fly -- a Gym Environment with PyBullet Physics for Reinforcement Learning of Multi-agent Quadcopter Control

Robotic simulators are crucial for academic research and education as well as for the development of safety-critical applications. Reinforcement learning environments -- simple simulations coupled with a problem specification in the form of a reward function -- are also important for standardizing the development (and benchmarking) of learning algorithms. Yet full-scale simulators typically lack portability and parallelizability. Conversely, many reinforcement learning environments trade off realism for high sample throughput in toy-like problems. While public data sets have greatly benefited deep learning and computer vision, we still lack the software tools to simultaneously develop -- and fairly compare -- control theory and reinforcement learning approaches. In this paper, we propose an open-source OpenAI Gym-like environment for multiple quadcopters based on the Bullet physics engine. Its multi-agent and vision-based reinforcement learning interfaces, as well as its support for realistic collisions and aerodynamic effects, make it, to the best of our knowledge, the first of its kind. We demonstrate its use through several examples, either for control (trajectory tracking with PID control, multi-robot flight with downwash, etc.) or for reinforcement learning (single and multi-agent stabilization tasks), hoping to inspire future research that combines control theory and machine learning.
