Practical applications that employ deep learning must guarantee inference quality. However, we found that the inference quality of state-of-the-art and state-of-the-practice models in practical applications follows a long-tail distribution. In the real world, many tasks, such as safety-critical and mission-critical ones, impose strict requirements on the quality of deep learning inference. Fluctuations in inference quality seriously hinder practical deployment, and the quality at the tail may lead to severe consequences. State-of-the-art and state-of-the-practice models that achieve outstanding inference quality when designed and trained under loose constraints still show poor inference quality under constraints of practical significance. On the one hand, neural network models must be deployed on complex systems with limited resources. On the other hand, safety-critical and mission-critical tasks must satisfy additional metric constraints while maintaining high inference quality. We coin a new term, ``tail quality,'' to characterize this essential requirement and challenge. We also propose a new metric, ``X-Critical-Quality,'' to measure inference quality under given constraints. This article reveals factors that cause state-of-the-art and state-of-the-practice algorithms and systems to fail in real scenarios. We therefore call for innovative methodologies and tools to tackle this enormous challenge.