Constructing Effective Machine Learning Models for the Sciences: A Multidisciplinary Perspective

Learning from data has led to substantial advances in a multitude of disciplines, including text and multimedia search, speech recognition, and autonomous-vehicle navigation. Can machine learning enable similar leaps in the natural and social sciences? This is certainly the expectation in many scientific fields, and recent years have seen non-linear models applied to a wide range of datasets. However, flexible non-linear models do not always improve on linear regression models to which transforms and interactions between variables have been added by hand. We discuss how to recognize this before constructing a data-driven model, and how such analysis can help us move to intrinsically interpretable regression models. Furthermore, for a variety of applications in the natural and social sciences, we demonstrate why more complex regression models may or may not yield improvements.
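To make the point about hand-added transforms and interactions concrete, here is a minimal sketch on synthetic data. Nothing in it comes from the paper: the data-generating formula, the helper names (`r_squared`, `fit_predict`, `add_features`), and the choice of one product term and one log transform are all assumptions made for illustration. It fits ordinary least squares twice, once on the raw variables and once after adding the two extra features, and compares held-out R².

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Coefficient of determination on held-out data."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_predict(X_train, y_train, X_test):
    """Ordinary least squares with an intercept column, then predict."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return A_test @ coef

def add_features(X):
    """Hypothetical hand-crafted features: one interaction, one log transform."""
    x1, x2, x3 = X[:, 0], X[:, 1], X[:, 2]
    return np.column_stack([X, x1 * x2, np.log(x3)])

# Synthetic data (an assumption, not a dataset from the paper):
# the response depends on an interaction and on a log-transformed variable.
rng = np.random.default_rng(0)
X = rng.uniform(0.5, 3.0, size=(400, 3))
y = 2.0 * X[:, 0] * X[:, 1] + 3.0 * np.log(X[:, 2]) + rng.normal(0.0, 0.1, size=400)

X_train, X_test = X[:300], X[300:]
y_train, y_test = y[:300], y[300:]

r2_plain = r_squared(y_test, fit_predict(X_train, y_train, X_test))
r2_augmented = r_squared(
    y_test, fit_predict(add_features(X_train), y_train, add_features(X_test))
)

print(f"raw variables only:         R^2 = {r2_plain:.3f}")
print(f"with interaction and log:   R^2 = {r2_augmented:.3f}")
```

In this constructed case the augmented linear model recovers nearly all of the variance while remaining readable coefficient by coefficient, which is the situation where a flexible non-linear model has little left to gain. When the right transforms are not known in advance, that advantage may not hold.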

