This paper studies the evaluation of learning-based object detection models in conjunction with model checking of formal specifications defined on an abstract model of an autonomous system and its environment. We define two metrics for evaluating object detection, \emph{proposition-labeled} and \emph{class-labeled} confusion matrices, and we use these metrics to compute the satisfaction probability of system-level safety requirements. While confusion matrices have been effective for comparative evaluation of classification and object detection models, our framework fills two key gaps. First, we relate the performance of object detection to formal requirements defined over downstream high-level planning tasks. Specifically, we provide empirical results showing that the choice of a good object detection algorithm, with respect to formal requirements on the overall system, depends significantly on the downstream planning and control design. Second, unlike the traditional confusion matrix, our metrics account for variations in performance with respect to the distance between the ego and the object being detected. We demonstrate this framework on a car-pedestrian example by computing the satisfaction probabilities for safety requirements formalized in Linear Temporal Logic (LTL).
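As a rough illustration of the distance-dependent idea, the sketch below builds a class-labeled confusion matrix per distance bin and uses it to estimate how likely a pedestrian is to be detected at least once as the car approaches. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the class labels (`ped`, `obj`, `empty`), the function names, the bin edges, and the sample data are all hypothetical, detections in different bins are assumed independent, and the final product of miss probabilities is a hand-computed stand-in for what a probabilistic model checker would compute on the abstract system model.

```python
from collections import defaultdict

# Hypothetical class labels, chosen only for illustration.
CLASSES = ("ped", "obj", "empty")


def distance_binned_confusion(samples, bin_edges):
    """Count (predicted, true) label pairs separately for each distance bin.

    samples: iterable of (distance, true_label, predicted_label).
    bin_edges: increasing list; bin k covers [bin_edges[k], bin_edges[k + 1]).
    Returns {bin_index: {(predicted, true): count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for distance, true_label, predicted in samples:
        for k in range(len(bin_edges) - 1):
            if bin_edges[k] <= distance < bin_edges[k + 1]:
                counts[k][(predicted, true_label)] += 1
                break
    return counts


def detection_probability(counts, bin_index, predicted, true_label):
    """Estimate P(predicted | true_label) within one distance bin."""
    total = sum(counts[bin_index][(p, true_label)] for p in CLASSES)
    if total == 0:
        return 0.0  # no data for this true label in this bin
    return counts[bin_index][(predicted, true_label)] / total


def prob_detected_at_least_once(counts, bins_far_to_near):
    """Toy estimate: pedestrian is detected in at least one bin on approach.

    Assumes one independent detection attempt per bin (an illustrative
    simplification, not a claim about the paper's system model).
    """
    p_miss_all = 1.0
    for k in bins_far_to_near:
        p_miss_all *= 1.0 - detection_probability(counts, k, "ped", "ped")
    return 1.0 - p_miss_all


# Made-up samples: (distance in meters, true label, predicted label).
samples = [
    (15, "ped", "ped"), (15, "ped", "empty"),
    (12, "ped", "empty"), (18, "ped", "empty"),
    (5, "ped", "ped"), (3, "ped", "ped"),
    (7, "ped", "ped"), (9, "ped", "empty"),
]
counts = distance_binned_confusion(samples, bin_edges=[0, 10, 20])
p_far = detection_probability(counts, 1, "ped", "ped")    # 0.25
p_near = detection_probability(counts, 0, "ped", "ped")   # 0.75
p_safe = prob_detected_at_least_once(counts, [1, 0])      # 0.8125
```

With this made-up data, a single distance-agnostic confusion matrix would report a pedestrian detection rate of 0.5. The binned version shows that detection is much more reliable at close range, which matters if the downstream controller can still stop safely from the near bin.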