For an example $x = (x_1, x_2, \ldots, x_d)$ described by $d$ attributes, a linear model makes its prediction through a linear combination of the attributes, generally written as $f(x) = w^\top x + b$. Because the parameter $w$ directly reflects how important each attribute is in the regression, linear models have good interpretability (understandability, comprehensibility).
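
As a concrete illustration, here is a minimal Python sketch that fits such a model by ordinary least squares and reads off the learned weights $w$ and intercept $b$; the data, sizes, and variable names are synthetic and invented for this example, not taken from the text.

```python
# Minimal sketch of a linear model f(x) = w^T x + b fit by ordinary least
# squares. The data here is synthetic and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 3                                   # n examples, d attributes
X = rng.normal(size=(n, d))
true_w, true_b = np.array([2.0, -1.0, 0.5]), 0.3
y = X @ true_w + true_b + 0.1 * rng.normal(size=n)

# Append a column of ones so the intercept b is estimated together with w.
X1 = np.hstack([X, np.ones((n, 1))])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
w_hat, b_hat = coef[:-1], coef[-1]

print("w_hat:", w_hat)  # each entry indicates the importance of one attribute
print("b_hat:", b_hat)
```

Reading the fitted entries of `w_hat` attribute by attribute is exactly the interpretability property described above.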

Linear models are the cornerstone of statistical methodology. Advanced students of statistics, biostatistics, machine learning, data science, econometrics, and related fields should probably spend more time learning the details of this subject than of any other tool.

In this book, we give a brief but rigorous treatment of advanced linear models. It is advanced in the sense that it is at the level a beginning PhD student in statistics or biostatistics would see. The material in this book is standard knowledge for any PhD in statistics or biostatistics.

Students will need a fair number of mathematical prerequisites before attempting this course. First, multivariate calculus and linear algebra; linear algebra especially, because much of the early development of linear models is a direct application of linear algebra results in a statistical context. In addition, some basic proof-based mathematics is needed to follow the proofs, along with some background in regression modeling and mathematical statistics.

https://leanpub.com/lm

Latest papers

Overparameterization in deep learning is powerful: Very large models fit the training data perfectly and yet often generalize well. This realization brought back the study of linear models for regression, including ordinary least squares (OLS), which, like deep learning, shows a "double-descent" behavior: (1) The risk (expected out-of-sample prediction error) can grow arbitrarily when the number of parameters $p$ approaches the number of samples $n$, and (2) the risk decreases with $p$ for $p>n$, sometimes achieving a lower value than the lowest risk for $p<n$. The divergence of the risk for OLS can be avoided with regularization. In this work, we show that for some data models it can also be avoided with a PCA-based dimensionality reduction (PCA-OLS, also known as principal component regression). We provide non-asymptotic bounds for the risk of PCA-OLS by considering the alignments of the population and empirical principal components. We show that dimensionality reduction improves robustness while OLS is arbitrarily susceptible to adversarial attacks, particularly in the overparameterized regime. We compare PCA-OLS theoretically and empirically with a wide range of projection-based methods, including random projections, partial least squares (PLS), and certain classes of linear two-layer neural networks. These comparisons are made for different data generation models to assess the sensitivity to signal-to-noise and the alignment of regression coefficients with the features. We find that methods in which the projection depends on the training data can outperform methods where the projections are chosen independently of the training data, even those with oracle knowledge of population quantities, another seemingly paradoxical phenomenon that has been identified previously. This suggests that overparameterization may not be necessary for good generalization.
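
As a rough sketch of the PCA-OLS (principal component regression) procedure the abstract describes, the snippet below projects the features onto the top-$k$ empirical principal components of the training data and runs OLS in that reduced space. The data, the choice of $k$, and all variable names are invented for illustration and do not come from the paper.

```python
# Illustrative PCA-OLS (principal component regression) in the
# overparameterized regime (p > n). Synthetic data only.
import numpy as np

rng = np.random.default_rng(0)
n, p, k = 100, 200, 10                          # samples, features, kept components
X = rng.normal(size=(n, p))
beta = rng.normal(size=p) / np.sqrt(p)
y = X @ beta + 0.1 * rng.normal(size=n)

# Top-k empirical principal components of the centered training features.
x_mean, y_mean = X.mean(axis=0), y.mean()
Xc = X - x_mean
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
V_k = Vt[:k].T                                  # p x k projection matrix

# OLS in the k-dimensional projected space, then map back to p dimensions.
Z = Xc @ V_k
gamma, *_ = np.linalg.lstsq(Z, y - y_mean, rcond=None)
beta_hat = V_k @ gamma

y_pred = (X - x_mean) @ beta_hat + y_mean
print("train MSE:", np.mean((y_pred - y) ** 2))
```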
