Aggregation queries are a series of computationally-demanding analytics operations on grouped and time series data. They include tasks such as summation or finding the median among the items of a group sharing a group ID, and within a specified number of the last observed tuples for sliding window aggregation (SWAG). They have a wide range of applications including in database analytics, operating systems, bank security and medical sensors. Existing challenges include the hardware complexity that comes with efficiently handling per-group states using hash-based approaches. This paper presents a pipelined and adaptable approach for calculating a wide range of aggregation queries with high throughput. It is then adapted for SWAG to achieve up to 476x speedup over the CPU of the same platform. It outperforms the state-of-the-art such as by being able to process 7.14x more tuples per second, and support 4x the window sizes with a fraction of the resources and no DRAM.
翻译:暂无翻译