Can a machine learn machine learning? We propose to answer this question using the same criteria we use to answer a similar question: can a human learn machine learning? We automatically answer MIT final exams in Introduction to Machine Learning at a human level. The course is a large undergraduate class with around five hundred students each semester. Recently, program synthesis and few-shot learning solved university-level problem set questions in mathematics and STEM courses at a human level. In this work, we solve questions from final exams, which differ from problem sets in several ways: the questions are longer, have multiple parts, are more complicated, and span a broader set of topics. We provide a new dataset and benchmark of questions from eight MIT Introduction to Machine Learning final exams between Fall 2017 and Spring 2022, along with code for automatically answering these questions and generating new ones. We perform ablation studies across a range of machine learning topics, comparing zero-shot learning, few-shot learning, and chain-of-thought prompting, using GPT-3 pre-trained on text and Codex fine-tuned on code, and find that few-shot learning methods perform best. We make our data and code publicly available for the machine learning community.