Modern compilers leverage block coverage profile data to carry out downstream profile-guided optimizations that improve the runtime performance and reduce the size of a binary. Given a control-flow graph $G=(V, E)$ of a function in the binary, where nodes in $V$ correspond to basic blocks (sequences of instructions that are always executed sequentially) and edges in $E$ represent jumps in the control flow, the goal is to determine for each block $u \in V$ whether $u$ was executed during a session. To this end, extra instrumentation code that records when a block is executed needs to be added to the binary. This extra code incurs time and space overhead, which one would like to minimize. Motivated by this application, we study the Minimum Coverage Instrumentation problem, where the goal is to find a minimum-size subset of blocks to instrument such that the coverage of the remaining blocks in the graph can be inferred from the coverage status of the instrumented subset. Our main result is an algorithm that finds an optimal instrumentation strategy and carries out the inference in $O(|E|)$ time. We also study variants of this basic problem in which we are interested in learning the coverage of the edges instead of the nodes, or in which we are only allowed to instrument edges instead of nodes.
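The intuition behind inferring coverage from a subset of blocks can be illustrated with a classical dominator-based grouping (in the spirit of Agrawal-style "super blocks"): two blocks always have identical coverage status when one dominates the other and is in turn post-dominated by it, so instrumenting a single block per such equivalence group reveals the coverage of every block in the group. The sketch below is a simple illustration of this related heuristic, not the paper's optimal $O(|E|)$ algorithm; the function names and the example CFG are hypothetical.

```python
def dominators(succ, entry):
    """Iterative dataflow computation of dominator sets.
    succ maps each node to its list of successors."""
    nodes = set(succ)
    preds = {n: [] for n in nodes}
    for u, vs in succ.items():
        for v in vs:
            preds[v].append(u)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == entry:
                continue
            new = set(nodes)
            for p in preds[n]:
                new &= dom[p]
            new |= {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom

def coverage_groups(succ, entry, exit):
    """Group blocks with provably identical coverage status:
    u ~ v when u dominates v and v post-dominates u (or vice versa).
    One instrumented block per group covers the whole group."""
    dom = dominators(succ, entry)
    # Post-dominators are dominators of the reversed graph from the exit.
    rev = {n: [] for n in succ}
    for u, vs in succ.items():
        for v in vs:
            rev[v].append(u)
    pdom = dominators(rev, exit)
    # Union-find over the pairwise equivalence relation.
    parent = {n: n for n in succ}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    nodes = list(succ)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if (u in dom[v] and v in pdom[u]) or (v in dom[u] and u in pdom[v]):
                parent[find(u)] = find(v)
    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())

# Diamond CFG: A branches to B and C, which merge at D.
diamond = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
# {A, D} always execute together; B and C must each be instrumented,
# so the groups are {'A','D'}, {'B'}, {'C'} (in some order).
print(coverage_groups(diamond, 'A', 'D'))
```

On the diamond CFG above, three probes suffice instead of four: the entry and the merge block form one group because executing either implies executing the other, while neither branch's coverage can be deduced from the rest.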