Machine learning applications have become ubiquitous, which has increased efforts to make machine learning trustworthy. Explainable and fair AI have already matured, addressing knowledgeable users and application engineers. For those who do not want to invest time in understanding the method or the learned model, we offer care labels: easy to understand at a glance, allowing for method or model comparisons, and, at the same time, scientifically well-founded. On the one hand, this transforms descriptions as given by, e.g., Fact Sheets or Model Cards into a form that is well suited for end users. On the other hand, care labels are the result of a certification suite that tests whether the stated guarantees hold. In this paper, we present two experiments with our certification suite. One shows the care labels for configurations of Markov random fields (MRFs). Based on the underlying theory of MRFs, each configuration choice leads to its specific rating of static properties such as expressivity and reliability. In addition, the implementation is tested and its resource consumption is measured, yielding dynamic properties. This two-level procedure is followed by another experiment certifying deep neural network (DNN) models. There, we draw the static properties from the literature on a particular model and data set. At the second level, experiments are generated that deliver measurements of robustness against certain attacks. We illustrate this with ResNet-18 and MobileNetV3 applied to ImageNet.