Tensor Processing Units (TPUs) are specialized hardware accelerators for deep learning developed by Google. This paper explores the performance of TPUs, with a focus on AI workloads and their implementation in edge computing. It first provides an overview of TPUs, covering their design in relation to neural networks, their general architecture, compilation techniques, and supporting frameworks. We then provide a comparative analysis of Cloud and Edge TPU performance against competing chip architectures, and discuss how TPUs can be used to accelerate AI workloads. The results show that TPUs can provide significant performance improvements in both cloud and edge computing. Additionally, we address the need for further research on deploying a broader range of model architectures on the Edge TPU, as well as the need for more robust performance comparisons in edge computing.