We present CFU Playground, a full-stack open-source framework that enables rapid, iterative design of machine learning (ML) accelerators for embedded ML systems. Our toolchain tightly integrates open-source software, RTL generators, and FPGA tools for synthesis, place, and route. This full-stack development framework lets engineers explore bespoke architectures that are customized and co-optimized for embedded ML. The rapid deploy-profile-optimize feedback loop lets ML hardware and software developers achieve significant returns from a relatively small investment in customization. Using CFU Playground's design loop, we demonstrate substantial speedups (55x-75x) and explore the design space spanning the CPU and the accelerator.