Spreadsheets are widely used for table manipulation and presentation. Stylistic formatting of these tables is an important property for both presentation and analysis. As a result, popular spreadsheet software, such as Excel, supports automatically formatting tables based on data-dependent rules. Unfortunately, writing these formatting rules can be challenging for users as that requires knowledge of the underlying rule language and data logic. In this paper, we present CORNET, a neuro-symbolic system that tackles the novel problem of automatically learning such formatting rules from user examples of formatted cells. CORNET takes inspiration from inductive program synthesis and combines symbolic rule enumeration, based on semi-supervised clustering and iterative decision tree learning, with a neural ranker to produce conditional formatting rules. To motivate and evaluate our approach, we extracted tables with formatting rules from a corpus of over 40K real spreadsheets. Using this data, we compared CORNET to a wide range of symbolic and neural baselines. Our results show that CORNET can learn rules more accurately, across varying conditions, compared to these baselines. Beyond learning rules from user examples, we present two case studies to motivate additional uses for CORNET: simplifying user conditional formatting rules and recovering rules even when the user may have manually formatted their data.
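To make the learning problem concrete, the following is a minimal sketch of learning a formatting rule from examples of formatted cells. It is not CORNET's implementation: the `Rule` class, the `enumerate_rules` and `learn_rules` functions, and the sample data are all hypothetical. The sketch enumerates only numeric threshold rules, and a simple "fewest formatted cells first" ordering stands in for the neural ranker described above.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Set


@dataclass(frozen=True)
class Rule:
    """Hypothetical candidate rule: a readable description plus a predicate."""
    description: str
    predicate: Callable[[float], bool]


def enumerate_rules(values: Sequence[float]) -> List[Rule]:
    """Enumerate simple threshold rules, one pair per distinct cell value."""
    rules = []
    for t in sorted(set(values)):
        # Bind t as a default argument so each lambda keeps its own threshold.
        rules.append(Rule(f"value >= {t}", lambda v, t=t: v >= t))
        rules.append(Rule(f"value <= {t}", lambda v, t=t: v <= t))
    return rules


def learn_rules(
    values: Sequence[float],
    formatted: Set[int],
    unformatted: Set[int] = frozenset(),
) -> List[Rule]:
    """Return rules consistent with the examples, tightest rule first.

    A rule is consistent if it covers every formatted example and none of
    the explicitly unformatted ones. Sorting by the number of covered cells
    is an assumed heuristic, standing in for a learned ranker.
    """
    consistent = []
    for rule in enumerate_rules(values):
        hits = {i for i, v in enumerate(values) if rule.predicate(v)}
        if formatted <= hits and not (unformatted & hits):
            consistent.append((len(hits), rule))
    consistent.sort(key=lambda pair: pair[0])
    return [rule for _, rule in consistent]


scores = [35, 82, 47, 91, 60, 78]
# The user formatted the cells at indices 1 and 3 (values 82 and 91).
candidates = learn_rules(scores, formatted={1, 3})
print(candidates[0].description)  # prints "value >= 82"
```

With only two positive examples, several rules remain consistent (for instance `value >= 78`), which is why some form of ranking is needed to choose among them. Marking index 5 as explicitly unformatted would leave `value >= 82` as the only consistent candidate in this sketch.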