Statistical learning methods have grown in popularity in recent years. Many of these methods have parameters that must be tuned for the resulting models to perform well. Tuning has been studied extensively for neural networks, but far less for many other learning methods. We examined the behavior of tuning parameters for support vector machines, gradient boosting machines, and AdaBoost in both classification and regression settings. We used grid search to identify ranges of tuning parameters in which good models can be found across many different datasets. We then explored different optimization algorithms for selecting a model from the tuning parameter space. Models selected by each optimization algorithm were compared with the best models obtained through grid search to identify the algorithms that perform well. This information was used to create an R package, EZtune, that automatically tunes support vector machines and boosted trees.
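
The comparison described above, a search procedure judged against the best point found by an exhaustive grid, can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: `validation_error` is a hypothetical smooth surface standing in for a cross-validated error, the ranges are arbitrary, and the stochastic hill climber is only a stand-in for the optimization algorithms explored in this work. None of it reflects the EZtune API.

```python
import random


def validation_error(log_cost, log_gamma):
    # Hypothetical stand-in for a cross-validated error surface; by
    # construction its minimum (0.10) is at log_cost = 2, log_gamma = -3.
    return 0.10 + 0.02 * (log_cost - 2.0) ** 2 + 0.03 * (log_gamma + 3.0) ** 2


def grid_search(cost_range, gamma_range, steps):
    # Evaluate every point of a steps x steps grid and keep the best one.
    best_err, best_point = float("inf"), None
    for i in range(steps):
        for j in range(steps):
            c = cost_range[0] + i * (cost_range[1] - cost_range[0]) / (steps - 1)
            g = gamma_range[0] + j * (gamma_range[1] - gamma_range[0]) / (steps - 1)
            err = validation_error(c, g)
            if err < best_err:
                best_err, best_point = err, (c, g)
    return best_err, best_point, steps * steps


def clamp(value, low, high):
    # Keep a proposed parameter value inside its allowed range.
    return max(low, min(high, value))


def hill_climb(cost_range, gamma_range, budget, seed=0):
    # Stochastic hill climbing: perturb the current point with Gaussian
    # noise and accept the candidate only if it lowers the error.
    rng = random.Random(seed)
    point = (rng.uniform(*cost_range), rng.uniform(*gamma_range))
    err = validation_error(*point)
    for _ in range(budget - 1):
        cand = (
            clamp(point[0] + rng.gauss(0.0, 1.0), *cost_range),
            clamp(point[1] + rng.gauss(0.0, 1.0), *gamma_range),
        )
        cand_err = validation_error(*cand)
        if cand_err < err:
            point, err = cand, cand_err
    return err, point, budget


# Assumed search ranges on a log scale (illustrative only).
COST_RANGE = (-5.0, 5.0)
GAMMA_RANGE = (-8.0, 2.0)

if __name__ == "__main__":
    grid_err, grid_point, grid_evals = grid_search(COST_RANGE, GAMMA_RANGE, 21)
    hc_err, hc_point, hc_evals = hill_climb(COST_RANGE, GAMMA_RANGE, 60)
    print("grid search:", grid_evals, "evaluations, error", round(grid_err, 4))
    print("hill climb: ", hc_evals, "evaluations, error", round(hc_err, 4))
    # The gap to the grid optimum is the quantity used to judge the optimizer.
    print("gap to grid optimum:", round(hc_err - grid_err, 4))
```

The point of the sketch is the evaluation budget: the grid here costs 441 model fits, whereas the optimizer is limited to 60, so the gap between the two errors indicates how much accuracy is traded for speed under these assumed conditions.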