We present two related Stata modules, r_ml_stata and c_ml_stata, for fitting popular Machine Learning (ML) methods in both regression and classification settings. Using the recent Stata/Python integration platform (sfi) of Stata 16, these commands provide optimal hyper-parameter tuning via K-fold cross-validation using grid search. More specifically, they use the Python Scikit-learn API to carry out both cross-validation and outcome/label prediction.
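To make the tuning step concrete, the following is a minimal sketch of grid search with K-fold cross-validation using Scikit-learn. It is not the code of r_ml_stata or c_ml_stata: the synthetic data set, the choice of a regression tree as learner, the parameter grid, and K = 5 are all assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data, used only for illustration (assumption).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Hypothetical grid of hyper-parameter values to search over.
param_grid = {"max_depth": [2, 4, 8], "min_samples_leaf": [1, 5, 10]}

# K-fold splitter with K = 5 (assumed value).
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Evaluate every grid point by cross-validated mean squared error,
# then refit the best configuration on the full data set.
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid,
    cv=cv,
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

best_params = search.best_params_  # selected hyper-parameters
predictions = search.predict(X)    # predictions from the refitted best model
print(best_params)
```

For a classification setting, the same pattern would apply with a classifier (for example `DecisionTreeClassifier`) and a classification metric such as `"accuracy"` in place of the regression learner and score.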