Orthogonality constraints, both hard and soft, have been used to normalize the weight matrices of deep neural network (DNN) models, especially convolutional neural networks (CNNs) and Vision Transformers (ViTs), in order to reduce parameter redundancy and improve training stability. However, models trained with these constraints are not always sufficiently robust to noisy data. In this work, we propose a novel two-stage approximately orthogonal training framework (TAOTF) that addresses this problem in noisy-data scenarios by finding a trade-off between the orthogonal solution space and the main-task solution space. In the first stage, we propose a novel algorithm, polar decomposition-based orthogonal initialization (PDOI), to find a good initialization for the orthogonal optimization. In the second stage, unlike existing methods, we apply soft orthogonal constraints to all layers of the DNN model. We evaluate the proposed model-agnostic framework on both natural image and medical image datasets; the results show that our method achieves stable performance that is superior to that of existing methods.
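The abstract does not spell out the details of PDOI or the exact form of the soft constraint, so the following is only a minimal sketch of the two generic ingredients it names, not the authors' implementation. It assumes that the initialization replaces a weight matrix W by the orthogonal factor of its polar decomposition, and that the soft constraint is the commonly used penalty ||W^T W - I||_F^2. The function names are hypothetical, and the polar factor is approximated with a Newton-Schulz iteration (a swapped-in, dependency-free technique; an SVD-based routine would be the usual choice in practice).

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def soft_orthogonality_penalty(w):
    """Return ||W^T W - I||_F^2 (assumed form of the soft constraint).

    Intended for matrices with at least as many rows as columns.
    """
    gram = matmul(transpose(w), w)
    size = len(gram)
    return sum(
        (gram[i][j] - (1.0 if i == j else 0.0)) ** 2
        for i in range(size)
        for j in range(size)
    )


def polar_orthogonal_factor(w, steps=30):
    """Approximate the orthogonal polar factor of a full-column-rank matrix.

    Uses the Newton-Schulz iteration X <- 0.5 * X (3I - X^T X), started from
    W scaled by its Frobenius norm so that all singular values lie in (0, 1].
    """
    fro = sum(v * v for row in w for v in row) ** 0.5
    x = [[v / fro for v in row] for row in w]
    n = len(x[0])
    for _ in range(steps):
        gram = matmul(transpose(x), x)
        # correction = 3I - X^T X
        correction = [
            [(3.0 if i == j else 0.0) - gram[i][j] for j in range(n)]
            for i in range(n)
        ]
        x = [[0.5 * v for v in row] for row in matmul(x, correction)]
    return x


# Hypothetical 3x2 weight matrix used only for illustration.
w = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_init = polar_orthogonal_factor(w)

penalty_before = soft_orthogonality_penalty(w)
penalty_after = soft_orthogonality_penalty(w_init)
```

Under these assumptions, `w_init` has (numerically) orthonormal columns, so the penalty starts near zero. In a second stage, a term such as `lam * soft_orthogonality_penalty(w)` summed over layers would be added to the task loss, where the weight `lam` is likewise a hypothetical hyperparameter.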