MCAD: Beyond Basic-Block Throughput Estimation Through Differential, Instruction-Level Tracing

Estimating instruction-level throughput is critical for many applications: multimedia, low-latency networking, medical, automotive, avionic, and industrial control systems all rely on tight, calculable, and accurate timing bounds for their software. Unfortunately, how long a program may run, or whether it halts at all, cannot be answered in the general case. This is why state-of-the-art throughput estimation tools usually focus on a subset of operations and make several simplifying assumptions. Correctly identifying these constraints and the regions of interest in a program typically requires source code, specialized tools, and dedicated expert knowledge. Whenever a single instruction is modified, this process must be repeated, which incurs high costs when timing-sensitive code is developed iteratively in practice. In this paper, we present MCAD, a novel and lightweight timing analysis framework that can identify the effects of code changes at the microarchitectural level for binary programs. MCAD provides accurate differential throughput estimates by emulating whole-program execution using QEMU and forwarding traces to LLVM for instruction-level analysis. This allows developers to iterate quickly, with low overhead, using common tools: identifying execution paths that are less sensitive to changes, as opposed to timing-critical paths, takes only minutes with MCAD. To the best of our knowledge, this represents an entirely new capability that reduces turnaround times for differential throughput estimation by several orders of magnitude compared to state-of-the-art tools. Our detailed evaluation shows that MCAD scales to real-world applications like FFmpeg and Clang with millions of instructions, achieving less than 3% geometric mean error compared to ground-truth timings from hardware performance counters on x86 and ARM machines.
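
As a rough illustration of the accuracy metric quoted above, the following minimal sketch shows one common way a geometric mean error could be computed from estimated and measured cycle counts. The abstract does not give the exact error definition used in the evaluation, so the relative-error formula, the function names, and the sample numbers here are all assumptions made for illustration, not data from the paper.

```python
import math


def relative_error(estimated_cycles: float, measured_cycles: float) -> float:
    """Absolute relative error of an estimate against a measured ground truth."""
    if measured_cycles <= 0:
        raise ValueError("measured_cycles must be positive")
    return abs(estimated_cycles - measured_cycles) / measured_cycles


def geometric_mean(values) -> float:
    """Geometric mean of strictly positive values, computed in log space."""
    values = list(values)
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean requires strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical (estimated, measured) cycle counts; not results from the paper.
samples = [(1020.0, 1000.0), (1960.0, 2000.0), (4040.0, 4000.0)]
errors = [relative_error(est, meas) for est, meas in samples]
print(f"geometric mean error: {geometric_mean(errors):.2%}")
```

Because the geometric mean is undefined when any term is zero, this sketch rejects an exact match rather than silently dropping it; an evaluation might equally choose to clamp such values to a small floor.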
