Recent approaches to the theoretical analysis of model-based deep learning architectures have studied the convergence of gradient descent in shallow ReLU networks arising from generative models with sparse hidden layers. Motivated by the success of architectures that impose structured forms of sparsity, we introduce and study a group-sparse autoencoder that accommodates a variety of generative models and uses a group-sparse ReLU activation function to force the non-zero units at a given layer to occur in blocks. For clustering models, inputs that activate the same group of units belong to the same cluster. We then analyze the gradient dynamics of a shallow instance of the proposed autoencoder, trained on data adhering to a group-sparse generative model. In this setting, we prove that the network parameters converge to a neighborhood of the generating matrix. We validate our model through numerical analysis and show that networks with a group-sparse ReLU outperform networks with traditional ReLUs on both sparse coding and parameter recovery tasks. We also provide experiments on real data to corroborate the simulated results and to emphasize the clustering capabilities of structured sparsity models.
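To make the block-activation idea concrete, here is a minimal sketch of a group-sparse ReLU. The abstract does not fix the gating rule, so this version assumes a group is kept active when the l2 norm of its pre-activations exceeds a per-group bias, with an ordinary ReLU applied inside kept groups; the function name `group_sparse_relu`, the contiguous block structure, and the threshold values are illustrative assumptions, not the paper's exact definition.

```python
import numpy as np

def group_sparse_relu(z, groups, bias):
    """Group-sparse ReLU sketch: units can only be non-zero in blocks.

    Assumed gating rule (not necessarily the paper's): a group is kept
    when the l2 norm of its pre-activations exceeds its bias; within a
    kept group the usual ReLU is applied, all other units are zeroed.

    z      : (n,) pre-activation vector
    groups : list of index arrays partitioning range(n)
    bias   : (len(groups),) non-negative per-group thresholds
    """
    out = np.zeros_like(z)
    for idx, b in zip(groups, bias):
        if np.linalg.norm(z[idx]) > b:          # group passes the gate
            out[idx] = np.maximum(z[idx], 0.0)  # ordinary ReLU inside
    return out

# Toy usage: 6 units in 3 contiguous blocks of 2.
z = np.array([0.9, 0.8, -0.1, 0.2, 0.05, -0.3])
groups = [np.arange(0, 2), np.arange(2, 4), np.arange(4, 6)]
print(group_sparse_relu(z, groups, bias=np.array([0.5, 0.5, 0.5])))
# -> [0.9 0.8 0.  0.  0.  0. ]  (non-zeros occur in a single block)
```

Under the clustering interpretation in the abstract, two inputs whose outputs share the same surviving block would be assigned to the same cluster.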