LeanML: A Design Pattern To Slash Avoidable Wastes in Machine Learning Projects

We introduce the first application of the lean methodology to machine learning projects. As with lean startups and lean manufacturing, we argue that lean machine learning (LeanML) can drastically slash avoidable wastes in commercial machine learning projects, reduce the business risk of investing in machine learning capabilities and, in so doing, further democratize access to machine learning. The lean design pattern we propose in this paper is based on two realizations. First, it is possible to estimate the best performance one may achieve when predicting an outcome $y \in \mathcal{Y}$ using a given set of explanatory variables $x \in \mathcal{X}$, for a wide range of performance metrics, and without training any predictive model. Second, doing so is considerably easier, faster, and cheaper than learning the best predictive model. We derive formulae expressing the best $R^2$, MSE, classification accuracy, and log-likelihood per observation achievable when using $x$ to predict $y$ as a function of the mutual information $I\left(y; x\right)$, and possibly a measure of the variability of $y$ (e.g. its Shannon entropy in the case of classification accuracy, and its variance in the case of regression MSE). We illustrate the efficacy of the LeanML design pattern on a wide range of regression and classification problems, both synthetic and real-life.
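The abstract does not state the formulae themselves, so here is a minimal sketch of how mutual information can bound regression performance. It assumes $x$ and $y$ are jointly Gaussian scalars with correlation $\rho$. In that case $I(y; x) = -\tfrac{1}{2}\ln(1 - \rho^2)$ nats, so the best achievable $R^2$ is $1 - e^{-2 I(y; x)}$ and the best MSE is $\mathrm{Var}(y)\, e^{-2 I(y; x)}$. The function names are hypothetical, and the paper's general formulae may differ from this special case.

```python
import math


def gaussian_mutual_information(rho: float) -> float:
    """I(y; x) in nats for jointly Gaussian scalars with correlation rho.

    Assumption: the Gaussian closed form stands in for a mutual
    information estimate that would be computed from data in practice.
    """
    return -0.5 * math.log(1.0 - rho ** 2)


def best_r2(mi_nats: float) -> float:
    """Highest R^2 achievable in the Gaussian case, given I(y; x) in nats."""
    return 1.0 - math.exp(-2.0 * mi_nats)


def best_mse(mi_nats: float, var_y: float) -> float:
    """Lowest MSE achievable in the Gaussian case: Var(y) * exp(-2 I)."""
    return var_y * math.exp(-2.0 * mi_nats)


# Illustrative values only: correlation 0.8 and Var(y) = 4.
rho = 0.8
mi = gaussian_mutual_information(rho)
print(f"I(y; x) = {mi:.4f} nats")
print(f"best R^2 = {best_r2(mi):.4f}")
print(f"best MSE = {best_mse(mi, var_y=4.0):.4f}")
```

No predictive model is trained here: the achievable $R^2$ and MSE follow from the mutual information and the variance of $y$ alone, which is the kind of model-free estimate the abstract describes.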

