Evaluating the performance of software for automated vehicles is predominantly driven by data collected from the real world. Professional test drivers are supported by technical means to annotate driving maneuvers semi-automatically, which allows better event identification. Simple data loggers in large vehicle fleets, in contrast, typically lack automatic and detailed event classification, so extra effort is needed when post-processing their data. The data from professional test drivers is therefore evidently of higher quality than the unlabeled data from large fleets, yet the non-annotated fleet data is much more representative of the typical, realistic driving scenarios that automated vehicles must handle. At the same time, while growing the data from large fleets is relatively simple, adding valuable annotations during post-processing has become increasingly expensive. In this paper, we leverage Z-order space-filling curves to systematically reduce data dimensionality while preserving domain-specific data properties. This allows us to explore even large-scale field data sets and spot interesting events orders of magnitude faster than by processing the time-series data directly. Furthermore, the proposed concept is based on an analytical approach, which preserves explainability for the identified events.
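To illustrate the general idea of reducing dimensionality with a Z-order curve, the following is a minimal sketch and not the implementation used in this paper. It assumes two signals, longitudinal and lateral acceleration, clipped to a hypothetical range of ±10 m/s² and quantized to 8 bits each. All function names, the value range, and the resolution are illustrative assumptions. Each two-dimensional sample is mapped to a single integer (a Morton code) by interleaving the bits of its quantized coordinates, so that samples that share a coarse grid cell also share the leading bits of their code.

```python
from typing import List, Tuple


def interleave_bits(x: int, y: int, bits: int = 8) -> int:
    """Interleave the lowest `bits` bits of x and y into one Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # x occupies the even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # y occupies the odd bit positions
    return code


def deinterleave_bits(code: int, bits: int = 8) -> Tuple[int, int]:
    """Invert interleave_bits: recover the (x, y) cell indices from a Morton code."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


def quantize(value: float, lo: float, hi: float, bits: int = 8) -> int:
    """Clamp a value to [lo, hi] and map it to an integer in [0, 2**bits - 1]."""
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * ((1 << bits) - 1))


def morton_from_accel(ax: float, ay: float,
                      lo: float = -10.0, hi: float = 10.0, bits: int = 8) -> int:
    """Map one (longitudinal, lateral) acceleration sample to a single integer.

    The range [lo, hi] in m/s^2 and the resolution are assumptions for this sketch.
    """
    return interleave_bits(quantize(ax, lo, hi, bits), quantize(ay, lo, hi, bits), bits)


def find_in_cell(codes: List[int], target: int, drop_bits: int) -> List[int]:
    """Return indices of samples that share a coarse Z-order cell with `target`.

    Dropping the lowest `drop_bits` bits of a code leaves the prefix that
    identifies the coarse cell containing the sample.
    """
    prefix = target >> drop_bits
    return [i for i, c in enumerate(codes) if (c >> drop_bits) == prefix]


if __name__ == "__main__":
    # Hypothetical samples: two near-zero accelerations and two hard-braking samples.
    samples = [(0.1, 0.0), (-6.5, 0.2), (0.3, -0.1), (-6.8, 0.1)]
    codes = [morton_from_accel(ax, ay) for ax, ay in samples]
    reference = morton_from_accel(-6.5, 0.2)
    print(find_in_cell(codes, reference, drop_bits=8))
```

In this sketch, comparing integer prefixes replaces a comparison on two floating-point signals, and the two hard-braking samples end up in the same coarse cell. The linear scan in `find_in_cell` is kept for brevity; with the codes stored in sorted order, the same lookup could be expressed as a range query on a single integer column.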