Improving Prediction Performance and Model Interpretability through Attention Mechanisms from Basic and Applied Research Perspectives

Source: arXiv; The Bulletin of Graduate School of Science and Engineering, Hosei University, Vol. 64 (03/2023). This article draws heavily from arXiv:2009.12064, arXiv:2104.08763, arXiv:1905.07289, and arXiv:2204.11588.

With the dramatic advances in deep learning, machine learning research in both basic and applied settings now focuses on improving the interpretability of model predictions as well as prediction performance. Deep learning models achieve much higher prediction performance than traditional machine learning models, but their prediction process remains difficult to interpret or explain. This is known as the black-box problem of machine learning models, and it is recognized as particularly important across a wide range of fields: manufacturing, commerce, robotics, and other industries where such technology has become commonplace, as well as medicine, where mistakes are not tolerated. This bulletin is based on a summary of the author's dissertation. The research summarized there focuses on the attention mechanism, which has attracted considerable interest in recent years, and discusses its potential from two perspectives: basic research, in terms of improving prediction performance and interpretability, and applied research, in terms of evaluating it for real-world applications using large datasets beyond the laboratory environment. The dissertation concludes with a summary of the implications of these findings for subsequent research and of future prospects in the field.


