The existence of representative datasets is a prerequisite for many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the training data. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a major challenge. Leveraging additional, already existing sources of knowledge is key to overcoming the limitations of purely data-driven approaches and, ultimately, to increasing the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories of integration, extraction, and conformity. Special attention is given to applications in the field of autonomous driving.