A Scalable Processor Architecture Unifying Data Parallelism and Thread Parallelism

Project title: A Scalable Processor Architecture Unifying Data Parallelism and Thread Parallelism

Project number: No. 61332009

Project type: Key Program

Year of approval: 2014

Discipline: Automation technology; computer technology

Project author: 唐志敏 (Tang Zhimin)

Author affiliation: Institute of Computing Technology, Chinese Academy of Sciences (中国科学院计算技术研究所)

Funding: 3.1 million yuan (310万元)

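Chinese abstract (translated): Instruction-level, data-level, and thread-level parallelism have long been the three mechanisms that modern processors use to improve performance. However, in pursuit of both high performance and generality, the structures for instruction-level, data-level, and thread-level parallelism in a processor are usually provisioned separately, so the chip as a whole consumes more resources and its cost rises. Over-provisioning of resources and low resource utilization are also main causes of the excessive power consumption of modern high-performance general-purpose processors. A simple analysis of application patterns shows that capabilities such as thread parallelism and data parallelism do not need to be provided simultaneously; their use is often interleaved. It is therefore unnecessary to provide separate functional units for each kind of parallelism. This project focuses on a new processor architecture that supports both multithreaded parallel execution and vector data-parallel computation on a single set of control and arithmetic units. The core idea is a set of reconfigurable, deeply pipelined datapaths that supports the execution of both thread-parallel and data-parallel code and that can switch dynamically between the two execution modes according to application demand. This novel architecture can greatly improve hardware resource utilization, and thereby strike a balance among high performance, low cost, low energy consumption, and generality.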

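Chinese keywords (translated): multithreaded architecture; data parallelism; vector processing; dynamic reconfiguration; performance per watt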

English abstract: Instruction-level parallelism, data parallelism, and thread-level parallelism have long been the three main mechanisms used by modern processors to improve performance. However, to achieve both high performance and general-purpose usability, a single processor usually implements these mechanisms as separate structures. This increases the die area and transistor count of the chip, and with them its cost. Excessive resource provisioning and low resource utilization are also the main reasons for the high TDP of modern general-purpose high-performance processors. A simple analysis of application patterns shows that capabilities such as thread-level parallelism and data parallelism do not need to be provided at the same time; instead, they are usually used in turn. It is therefore unnecessary to provide different functional units for different kinds of parallelism. This project focuses on building new processor architectures around a single set of control and computing logic that supports both thread-level parallel execution and vector data processing. The key idea is a reconfigurable, deeply pipelined data path that supports the execution of both thread-level parallel code and data-parallel code, and that can switch between the two modes to adapt to the requirements of different applications. The new architecture can dramatically improve the utilization of hardware resources and achieve a better balance among high performance, low cost, low power, and general-purpose usability.

English keywords: multi-threaded architecture; data parallelism; vector processing; dynamic reconfiguration; performance per watt
