$L_{p}$-norm regularization schemes such as $L_{0}$-, $L_{1}$-, and $L_{2}$-norm regularization, and $L_{p}$-norm-based regularization techniques such as weight decay and group LASSO, compute a quantity that depends on model weights considered in isolation from one another. This paper describes a novel regularizer that is not based on an $L_{p}$-norm. In contrast with $L_{p}$-norm-based regularization, this regularizer is concerned with the spatial arrangement of weights within a weight matrix. The regularizer is an additive term in the loss function; it is differentiable, simple and fast to compute, and scale-invariant, requires only a trivial amount of additional memory, and is easily parallelized. Empirically, the method yields approximately an order-of-magnitude reduction in the number of nonzero model parameters at a given level of accuracy.
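For context, the baseline being contrasted here is an element-wise penalty of the form $\lambda \sum_i |w_i|^p$, which is added to the task loss and depends only on each weight's value, not on its position within the weight matrix. Below is a minimal PyTorch sketch of such a standard $L_1$ penalty; the names (`l1_penalty`, `lam`) and the toy model are illustrative placeholders, and this is the conventional $L_p$-style baseline, not the regularizer proposed in this paper.

```python
import torch
import torch.nn as nn

def l1_penalty(model: nn.Module, lam: float = 1e-4) -> torch.Tensor:
    """Element-wise L1 penalty: each weight contributes |w| regardless of
    where it sits in its weight matrix (weights treated in isolation)."""
    return lam * sum(p.abs().sum() for p in model.parameters())

# Usage sketch: the penalty is an additive term in the training loss.
model = nn.Linear(10, 2)
x, y = torch.randn(8, 10), torch.randint(0, 2, (8,))
task_loss = nn.functional.cross_entropy(model(x), y)
loss = task_loss + l1_penalty(model)  # additive regularization term
loss.backward()
```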