Context: Tables are ubiquitous formats for data. Therefore, techniques for writing correct programs over tables, and debugging incorrect ones, are vital. Our specific focus in this paper is on rich types that articulate the properties of tabular operations. We wish to study both their expressive power and _diagnostic quality_. Inquiry: There is no "standard library" of table operations. As a result, every paper (and project) is free to use its own (sub)set of operations. This makes artifacts very difficult to compare, and it can be hard to tell whether omitted operations were left out by oversight or because they cannot actually be expressed. Furthermore, virtually no papers discuss the quality of type error feedback. Approach: We combed through several existing languages and libraries to create a "standard library" of table operations. Each entry is accompanied by a detailed specification of its "type," expressed independent of (and hence not constrained by) any type language. We also studied and categorized a corpus of (student) program edits that resulted in table-related errors. We used this to generate a suite of erroneous programs. Finally, we adapted the concept of a datasheet to facilitate comparisons of different implementations. Knowledge: Our benchmark creates a common ground to frame work in this area. Language designers who claim to support typed programming over tables have a clear suite against which to demonstrate their system's expressive power. Our family of errors also gives them a chance to demonstrate the quality of feedback. Researchers who improve one aspect -- especially error reporting -- without changing the other can demonstrate their improvement, as can those who engage in trade-offs between the two. The net result should be much better science in both expressiveness and diagnostics. We also introduce a datasheet format for presenting this knowledge in a methodical way. 
Grounding: We have generated our benchmark from real languages, libraries, and programs, as well as personal experience conducting and teaching data science. We have drawn on experience in engineering and, more recently, in data science to generate the datasheet. Importance: Claims about type support for tabular programming are hard to evaluate. However, tabular programming is ubiquitous, and the expressive power of type systems keeps growing. Our benchmark and datasheet can help lead to more orderly science. They also benefit programmers trying to choose a language.