$L_{p}$-norm regularization schemes such as $L_{0}$, $L_{1}$, and $L_{2}$-norm regularization, and $L_{p}$-norm-based techniques such as weight decay and group LASSO, compute a quantity that depends on model weights considered in isolation from one another. This paper describes a novel regularizer that is not based on an $L_{p}$-norm. In contrast with $L_{p}$-norm-based regularization, this regularizer is concerned with the spatial arrangement of weights within a weight matrix. The regularizer is an additive term in the loss function. It is differentiable, simple and fast to compute, and scale-invariant; it requires a trivial amount of additional memory and is easily parallelized. Empirically, this method yields roughly an order-of-magnitude reduction in the number of nonzero model parameters at a given level of accuracy.
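The claim that elementwise $L_{p}$ penalties treat weights in isolation can be checked directly: because such a penalty is a sum of per-weight terms, rearranging the entries of a weight matrix leaves it unchanged. The following is a minimal sketch of that property only, not of the regularizer proposed here; the helper names `lp_penalty` and `shuffle_entries` and the example matrix are assumptions made for illustration.

```python
import math
import random

def lp_penalty(weights, p):
    """Sum of |w|^p over every entry of a weight matrix (a list of rows)."""
    return sum(abs(w) ** p for row in weights for w in row)

def shuffle_entries(weights, seed=0):
    """Return a matrix of the same shape with its entries randomly rearranged."""
    flat = [w for row in weights for w in row]
    random.Random(seed).shuffle(flat)
    n_cols = len(weights[0])
    return [flat[i:i + n_cols] for i in range(0, len(flat), n_cols)]

# Hypothetical 2x3 weight matrix, chosen only for illustration.
W = [[0.5, 0.0, -1.5],
     [0.0, 2.0, 0.0]]
W_shuffled = shuffle_entries(W)

# The penalties match even though the spatial arrangement differs.
l1_same = math.isclose(lp_penalty(W, 1), lp_penalty(W_shuffled, 1))
l2_same = math.isclose(lp_penalty(W, 2), lp_penalty(W_shuffled, 2))
print(l1_same, l2_same)
```

A penalty that is sensitive to where the nonzero weights sit in the matrix would, by contrast, generally assign different values to `W` and `W_shuffled`.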