We study the effects of data size and quality on the performance of Automated Essay Scoring (AES) engines designed according to three different paradigms: a frequency- and hand-crafted-feature-based model, a recurrent neural network model, and a pretrained transformer-based language model fine-tuned for classification. We expect each type of model to benefit from the size and the quality of the training data in very different ways. Standard practices for developing training data for AES engines were established with feature-based methods in mind; however, since neural networks are increasingly being considered for production use, this work seeks to inform how to construct better training data for neural networks that will be deployed in production.