We solve university-level probability and statistics questions by program synthesis using OpenAI's Codex, a Transformer trained on text and fine-tuned on code. We transform course problems from MIT's 18.05 Introduction to Probability and Statistics and Harvard's STAT110 Probability into programming tasks, then execute the generated code to obtain a solution. Because these questions are grounded in probability, we often have Codex generate probabilistic programs that simulate a large number of probabilistic dependencies to compute the solution. Our approach requires prompt engineering to transform each question from its original form into an explicit, tractable form that yields a correct program and solution. To estimate how much work this translation takes, we measure the similarity between the original and transformed questions. Our work is the first to introduce a dataset of university-level probability and statistics problems and to solve these problems in a scalable fashion using the program synthesis capabilities of large language models.
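As a rough illustration of what "a probabilistic program that simulates" an answer can look like, the sketch below estimates a probability by repeated random sampling. The question (the chance that two fair dice sum to 7), the function name, and the parameters are hypothetical choices made for this example; they are not taken from the 18.05 or STAT110 problem sets, and the code is not output from Codex.

```python
import random


def estimate_probability_sum_is_seven(num_trials: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(sum of two fair six-sided dice == 7).

    Hypothetical example question; the exact answer is 1/6.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same estimate
    hits = 0
    for _ in range(num_trials):
        # Simulate one roll of two independent fair dice.
        if rng.randint(1, 6) + rng.randint(1, 6) == 7:
            hits += 1
    # Fraction of simulated trials in which the event occurred.
    return hits / num_trials


if __name__ == "__main__":
    print(estimate_probability_sum_is_seven())
```

With enough trials the estimate should land close to the exact value of 1/6, which is the sense in which executing a generated simulation can stand in for a closed-form derivation.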