Solving Probability and Statistics Problems by Program Synthesis

We solve university-level probability and statistics questions by program synthesis using OpenAI's Codex, a Transformer trained on text and fine-tuned on code. We transform course problems from MIT's 18.05 Introduction to Probability and Statistics and Harvard's STAT110 Probability into programming tasks, then execute the generated code to obtain a solution. Because these course questions are grounded in probability, we often aim to have Codex generate probabilistic programs that simulate a large number of probabilistic dependencies to compute the solution. Our approach requires prompt engineering to transform each question from its original form into an explicit, tractable form that results in a correct program and solution. To estimate the amount of work needed to translate an original question into its tractable form, we measure the similarity between the original and transformed questions. Our work is the first to introduce a new dataset of university-level probability and statistics problems and to solve these problems in a scalable fashion using the program synthesis capabilities of large language models.
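
As a rough illustration of what a simulation-based program for a probability question can look like, the sketch below estimates the chance that at least two of 23 people share a birthday and compares it with the closed-form value. The question, the function names, and the parameters (365 equally likely birthdays, the trial count, the seed) are assumptions chosen for this example; they are not taken from the paper's dataset, and the code is not Codex output.

```python
import random


def estimate_shared_birthday(n_people=23, n_trials=100_000, seed=0):
    """Monte Carlo estimate of P(at least two people share a birthday).

    Assumes 365 equally likely birthdays (a simplification).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        birthdays = [rng.randrange(365) for _ in range(n_people)]
        # A repeated birthday makes the set smaller than the list.
        if len(set(birthdays)) < n_people:
            hits += 1
    return hits / n_trials


def exact_shared_birthday(n_people=23):
    """Closed-form value: 1 - P(all birthdays are distinct)."""
    p_distinct = 1.0
    for k in range(n_people):
        p_distinct *= (365 - k) / 365
    return 1.0 - p_distinct


if __name__ == "__main__":
    print("simulated:", estimate_shared_birthday())
    print("exact:    ", exact_shared_birthday())
```

With enough trials the simulated estimate should land close to the exact value, which is the kind of check one would want before trusting a generated program's answer.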


