Aggregating data in a database could also be called "integrating along fibers": given functions $\pi\colon E\to D$ and $s\colon E\to R$, where $(R,\circledast)$ is a commutative monoid, we want a new function $(\circledast s)_\pi$ that sends each $d\in D$ to the "sum" of all $s(e)$ for which $\pi(e)=d$. The operation lives alongside querying -- or more generally data migration -- in typical database usage: one wants to know how much Canadians spent on cell phones in 2021, for example, and such requests typically require both aggregation and querying. But whereas querying has an elegant category-theoretic treatment in terms of parametric right adjoints between copresheaf categories, a categorical formulation of aggregation -- especially one that lives alongside that for querying -- appears to be completely absent from the literature. In this paper we show how both querying and aggregation fit into the "polynomial ecosystem". Starting with the category $\mathbf{Poly}$ of polynomial functors in one variable, we review the relatively recent results of Ahman-Uustalu and Garner, which showed that the framed bicategory $\mathbb{C}\mathbf{at}^\sharp$ of comonads in $\mathbf{Poly}$ is precisely the right setting for data migration: its objects are categories and its bicomodules are parametric right adjoints between their copresheaf categories. We then develop a great deal of theory, compressed for space reasons, including local monoidal closed structures, a coclosure to bicomodule composition, and an understanding of adjoints in $\mathbb{C}\mathbf{at}^\sharp$. Doing so allows us to derive interesting mathematical results, e.g.\ that the ordinary operation of transposing a span can be decomposed into the composite of two more primitive operations, and then finally to explain how aggregation arises, alongside querying, in $\mathbb{C}\mathbf{at}^\sharp$.
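As a concrete illustration of "integrating along fibers", the following is a minimal sketch of $(\circledast s)_\pi$ for finite sets, not the categorical construction developed in the paper. The function name `aggregate_along_fibers`, its parameters, and the toy purchase data are all hypothetical, chosen only to mirror the symbols $E$, $D$, $\pi$, $s$, and $(R,\circledast)$ above.

```python
def aggregate_along_fibers(E, D, pi, s, op, unit):
    """Send each d in D to the op-combination of s(e) over all e in E with pi(e) == d.

    (op, unit) is assumed to be a commutative monoid structure on the codomain of s,
    so the order in which a fiber is traversed does not affect the result.
    """
    result = {d: unit for d in D}  # an empty fiber is sent to the monoid unit
    for e in E:
        d = pi(e)
        result[d] = op(result[d], s(e))
    return result


# Hypothetical toy data: E = purchases, D = countries, R = integers under addition.
purchases = [("CA", 300), ("US", 500), ("CA", 450)]
countries = ["CA", "US", "MX"]

totals = aggregate_along_fibers(
    E=purchases,
    D=countries,
    pi=lambda e: e[0],          # pi: purchase -> country
    s=lambda e: e[1],           # s: purchase -> amount spent
    op=lambda a, b: a + b,      # the monoid operation
    unit=0,                     # the monoid unit
)
```

Here `totals["CA"]` is the sum over the fiber of `pi` above `"CA"`, and `"MX"`, whose fiber is empty, receives the unit `0`. Replacing `op` and `unit` by another commutative monoid (for instance `max` with a suitable least element) gives a different aggregate over the same fibers.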