In this work, we announce a comprehensive, well-curated, open-source dataset with millions of samples of pre-college and college-level problems in mathematics and science. We show a preliminary set of results using a transformer architecture with character-to-character encoding. The dataset identifies some challenging problems and invites research on better architecture search.
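
To make the character-to-character encoding concrete, the following is a minimal sketch of how a problem and its answer could be turned into integer sequences for a sequence-to-sequence transformer. The abstract does not specify the vocabulary, the special tokens, or any sample problem, so the names `build_vocab`, `encode`, `decode`, the `<pad>`/`<sos>`/`<eos>` tokens, and the example question are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical character-level encoding; not taken from the paper.
SPECIALS = ["<pad>", "<sos>", "<eos>"]  # assumed special tokens, ids 0, 1, 2


def build_vocab(texts):
    """Map each special token and each distinct character to an integer id."""
    chars = sorted({ch for text in texts for ch in text})
    return {tok: i for i, tok in enumerate(SPECIALS + chars)}


def encode(text, vocab):
    """Encode a string as <sos>, one id per character, then <eos>."""
    return [vocab["<sos>"]] + [vocab[ch] for ch in text] + [vocab["<eos>"]]


def decode(ids, vocab):
    """Invert encode(), dropping the special tokens."""
    inverse = {i: tok for tok, i in vocab.items()}
    return "".join(inverse[i] for i in ids if i >= len(SPECIALS))


# Illustrative question/answer pair (made up for this sketch).
question = "Solve 2*x + 3 = 7 for x."
answer = "2"

vocab = build_vocab([question, answer])
src_ids = encode(question, vocab)  # model input, one id per character
tgt_ids = encode(answer, vocab)    # model target, one id per character
```

With this scheme the model sees every digit, operator, and letter as its own token, so the vocabulary stays small at the cost of longer sequences than a word-level or subword encoding would produce.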