Regularization is a popular technique for addressing overfitting in machine learning algorithms. Most regularization techniques rely on selecting a value for the regularization coefficient. The plug-in method and cross-validation are the two most common parameter selection approaches for regression methods such as Ridge Regression, Lasso Regression, and Kernel Regression. Matrix-factorization-based recommender systems also rely heavily on regularization. Most practitioners select a single scalar value to regularize the user feature vectors and item feature vectors, either independently or collectively. In this paper, we prove that this approach to selecting the regularization coefficient is invalid, and we provide a theoretically accurate method that outperforms the most widely used approach on both accuracy and fairness metrics.
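For concreteness, the scalar-regularized objective that the abstract refers to is commonly written as below. This is the standard textbook form, not necessarily the exact formulation used in the paper; the symbols $r_{ui}$ (observed rating), $p_u$ and $q_i$ (user and item feature vectors), $\Omega$ (set of observed user-item pairs), and $\lambda_P$, $\lambda_Q$ (regularization coefficients) are notation assumed here for illustration.

```latex
\min_{P,\,Q} \;
  \sum_{(u,i) \in \Omega} \bigl( r_{ui} - p_u^{\top} q_i \bigr)^2
  \;+\; \lambda_P \sum_{u} \lVert p_u \rVert_2^2
  \;+\; \lambda_Q \sum_{i} \lVert q_i \rVert_2^2
```

Under this notation, choosing the coefficients "independently" corresponds to tuning the two scalars $\lambda_P$ and $\lambda_Q$ separately, while choosing them "collectively" corresponds to setting $\lambda_P = \lambda_Q = \lambda$ and tuning the single scalar $\lambda$.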