Convolutional Neural Nets: Foundations, Computations, and New Applications

We review the mathematical foundations of convolutional neural nets (CNNs) with the goals of: i) highlighting connections with techniques from statistics, signal processing, linear algebra, differential equations, and optimization, ii) demystifying the underlying computations, and iii) identifying new types of applications. CNNs are powerful machine learning models that highlight features from grid data to make predictions (regression and classification). The grid data object can be represented as a vector (in 1D), a matrix (in 2D), or a tensor (in 3D or higher dimensions) and can incorporate multiple channels, which provides high flexibility in the input data representation. For example, an image can be represented as a 2D grid data object that contains red, green, and blue (RGB) channels (each channel is a 2D matrix). Similarly, a video can be represented as a 3D grid data object (two spatial dimensions plus time) with RGB channels (each channel is a 3D tensor). CNNs highlight features from the grid data by performing convolution operations with different types of operators. The operators highlight different types of features (e.g., patterns, gradients, geometrical features) and are learned by using optimization techniques. In other words, CNNs seek to identify optimal operators that best map the input data to the output data. A common misconception is that CNNs can only process image or video data; their application scope is much wider, because datasets encountered in diverse applications can be expressed as grid data. Here, we show how to apply CNNs to new types of applications such as optimal control, flow cytometry, multivariate process monitoring, and molecular simulations.
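To make the idea of an operator highlighting a feature of multi-channel grid data concrete, the following is a minimal, dependency-free sketch and not the authors' implementation. It assumes a hypothetical 6x6 RGB image with a vertical edge and a fixed Sobel-type gradient operator; the names `correlate2d_valid`, `edge_channel`, and `sobel_x` are illustrative. In a CNN the operator entries would be learned by optimization rather than fixed by hand, and, as in most CNN libraries, the operator is applied without flipping (cross-correlation).

```python
def correlate2d_valid(channel, operator):
    """Slide `operator` over a 2D `channel` (no padding, stride 1).

    Each output entry is the sum of elementwise products between the
    operator and the patch of the channel it currently covers.
    """
    kh, kw = len(operator), len(operator[0])
    out_h = len(channel) - kh + 1
    out_w = len(channel[0]) - kw + 1
    return [
        [
            sum(channel[i + a][j + b] * operator[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 6x6 image: dark left half (0), bright right half (1).
# Each of the three channels is a 2D matrix, as described in the text.
edge_channel = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
image = {"R": edge_channel, "G": edge_channel, "B": edge_channel}

# Fixed Sobel-type operator that responds to horizontal intensity gradients.
# (Assumption for illustration; a CNN would learn these nine numbers.)
sobel_x = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]

# One 4x4 feature map per channel.
feature_maps = {name: correlate2d_valid(ch, sobel_x) for name, ch in image.items()}

for row in feature_maps["R"]:
    print(row)
```

Every row of each feature map is `[0, 4, 4, 0]`: the response is zero where the image is flat and nonzero only where the operator straddles the edge, which is the sense in which an operator "highlights" a gradient feature.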
