This paper introduces a new paradigm of chip design for the semiconductor industry called Data-Rich Analytics Based Computer Architecture (BRYT). The goal is to monitor chip hardware behavior in the field, in real time, with no slowdown and minimal power overhead, and to obtain insights into chip behavior and workloads. The paradigm is motivated by the end of Moore's Law and Dennard scaling, which makes architectural efficiency the primary means of improving capability for the next decade or two. This paper implements the first version of the paradigm with a system architecture and the concept of an analYtics Processing Unit (YPU). We perform four case studies and implement an RTL-level prototype. Across the case studies, we show that a YPU with an area overhead of <3% at 7nm and an overall power consumption of <25 mW is able to produce previously inconceivable data: PICS stacks of arbitrary programs, evaluations of instruction prefetchers in the wild before deployment, fine-grained cycle-by-cycle utilization of hardware modules, and histograms of tensor-value distributions of DL models.