The yaglm package aims to make the broader ecosystem of modern generalized linear models accessible to data analysts and researchers. This ecosystem encompasses a range of loss functions (e.g., linear, logistic, and quantile regression), constraints (e.g., positive, isotonic), and penalties. Beyond the basic lasso and ridge, the package supports structured penalties such as the nuclear norm as well as the group, exclusive, fused, and generalized lasso. It also supports more accurate adaptive and non-convex (e.g., SCAD) versions of these penalties, which often come with strong statistical guarantees at limited additional computational expense. yaglm comes with a variety of tuning parameter selection methods, including cross-validation, information criteria that have favorable model selection properties, and degrees of freedom estimators. While several solvers are built in (e.g., FISTA), a key design choice allows users to employ their favorite state-of-the-art optimization algorithms. Designed to be user friendly, the package automatically creates tuning parameter grids, supports tuning with fast path algorithms along with parallelization, and follows a unified scikit-learn compatible API.
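To make the FISTA solver mentioned above concrete, the following is a minimal NumPy sketch of FISTA applied to the lasso objective $\frac{1}{2n}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1$. It is an illustration under stated assumptions, not yaglm's implementation or API: the function names `soft_threshold`, `fista_lasso`, and `lasso_objective` are hypothetical, the step size is fixed at the reciprocal of the Lipschitz constant, and no intercept or convergence check is included.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_objective(X, y, beta, lam):
    """Value of (1 / (2n)) * ||y - X beta||^2 + lam * ||beta||_1."""
    n = X.shape[0]
    return 0.5 * np.sum((y - X @ beta) ** 2) / n + lam * np.sum(np.abs(beta))


def fista_lasso(X, y, lam, n_iter=500):
    """Illustrative FISTA loop for the lasso (hypothetical helper, not yaglm code)."""
    n, p = X.shape
    # Lipschitz constant of the gradient of the smooth least-squares term.
    lipschitz = np.linalg.norm(X, 2) ** 2 / n
    step = 1.0 / lipschitz

    beta = np.zeros(p)   # current iterate
    z = beta.copy()      # extrapolated (momentum) point
    t = 1.0              # momentum parameter

    for _ in range(n_iter):
        grad = X.T @ (X @ z - y) / n
        beta_new = soft_threshold(z - step * grad, step * lam)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t ** 2)) / 2.0
        z = beta_new + ((t - 1.0) / t_new) * (beta_new - beta)
        beta, t = beta_new, t_new

    return beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 10))
    true_beta = np.zeros(10)
    true_beta[:3] = [2.0, -1.5, 1.0]
    y = X @ true_beta + 0.1 * rng.standard_normal(100)

    beta_hat = fista_lasso(X, y, lam=0.1)
    print(np.round(beta_hat, 3))
```

Swapping `soft_threshold` for the proximal operator of another penalty (for example, group soft-thresholding for the group lasso) leaves the rest of the loop unchanged, which is one reason proximal-gradient solvers pair naturally with a family of structured penalties.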