To fit sparse linear associations, the LASSO sparsity-inducing penalty, driven by a single hyperparameter, provably recovers the important features (the needles) with high probability in certain regimes, even when the sample size is smaller than the dimension of the input vector (the haystack). More recently, learners known as artificial neural networks (ANNs) have shown great success in many machine learning tasks, in particular in fitting nonlinear associations. A small learning rate, the stochastic gradient descent algorithm and a large training set help to cope with the explosion in the number of parameters of deep neural networks. Yet few ANN learners have been developed and studied to find needles in nonlinear haystacks. Driven by a single hyperparameter, our ANN learner, as in the sparse linear case, exhibits a phase transition in the probability of retrieving the needles, which we do not observe with other ANN learners. To select the penalty parameter, we generalize the universal threshold of Donoho and Johnstone (1994), a better rule than cross-validation, which is conservative (too many false detections) and expensive. In the spirit of simulated annealing, we propose a warm-start sparsity-inducing algorithm to solve the high-dimensional, non-convex and non-differentiable optimization problem. We perform precise Monte Carlo simulations to show the effectiveness of our approach.
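For context, a minimal sketch (our illustration, not the paper's ANN formulation) of the linear-case quantities the abstract alludes to: the LASSO objective with its single penalty hyperparameter $\lambda$, and the universal threshold of Donoho and Johnstone (1994) for $n$ observations with Gaussian noise of standard deviation $\sigma$,
\[
\hat{\beta}_{\lambda} \in \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p} \tfrac{1}{2}\, \| y - X \beta \|_2^2 + \lambda \| \beta \|_1,
\qquad
\lambda_{\mathrm{univ}} = \sigma \sqrt{2 \log n}.
\]
Because the maximum of $n$ i.i.d. $\mathcal{N}(0,\sigma^2)$ variables exceeds $\sigma\sqrt{2\log n}$ with probability tending to zero, $\lambda_{\mathrm{univ}}$ keeps all null coefficients at zero with high probability; the generalization announced in the abstract carries this zero-thresholding calibration over to the ANN penalty.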