Autonomous driving has received a lot of attention in the automotive industry and is often seen as the future of transportation. Passenger vehicles equipped with a wide array of sensors (e.g., cameras, front-facing radars, LiDARs, and IMUs) capable of continuous perception of the environment are becoming increasingly prevalent. These sensors provide a stream of high-dimensional, temporally correlated data that is essential for reliable autonomous driving. An autonomous driving system should effectively use the information collected from the various sensors in order to form an abstract description of the world and maintain situational awareness. Deep learning models, such as autoencoders, can be used for that purpose, as they can learn compact latent representations from a stream of incoming data. However, most autoencoder models process the data independently, without assuming any temporal interdependencies. Thus, there is a need for deep learning models that explicitly consider the temporal dependence of the data in their architecture. This work proposes CARNet, a Combined dynAmic autoencodeR NETwork architecture that utilizes an autoencoder combined with a recurrent neural network to learn the current latent representation and, in addition, also predict future latent representations in the context of autonomous driving. We demonstrate the efficacy of the proposed model in both imitation and reinforcement learning settings using both simulated and real datasets. Our results show that the proposed model outperforms the baseline state-of-the-art model, while having significantly fewer trainable parameters.
翻译:汽车业对自主驾驶给予了很多关注,并往往将其视为运输的未来。汽车业对自主驾驶给予了很多关注。汽车业对自主驾驶给予了很多关注,并常常被视为交通的未来。装有一系列广泛的感应器(例如照相机、前视雷达、激光雷达、激光雷达和IMUs)的客运车辆,其对环境的连续感知越来越普遍。这些感应器提供了一系列对可靠自主驾驶至关重要的高维、时间相关数据流。自主驾驶系统应有效利用从各种传感器收集的信息,以便形成对世界的抽象描述,并保持对形势的认识。可以为此目的使用诸如自动编码器等深层学习模型,因为它们能够从不断收到的数据流中学习紧凑的潜伏表示。然而,大多数自动编码模型模型正在独立地处理数据,而没有假定任何时间的相互依存关系。因此,有必要建立深层次的学习模型,明确考虑数据在其结构中的时间依赖性。这项工作提议建立CARNet模型,即一个利用自动编码和经常的神经网络相结合的混合结构结构,以便学习当前潜值的模型,同时利用我们目前的基本结构,并用现在的模型来显示我们的基本结构,同时预测我们的基本结构。