Java is a go-to language for developing scalable enterprise cloud applications. In such systems, saving even a few percent of CPU time can yield a significant competitive advantage and cost reduction. Although Java has many performance tools, few focus on data locality in the memory hierarchy. In this paper, we present DJXPerf, a lightweight, object-centric memory profiler for Java that associates memory-hierarchy performance metrics (e.g., cache and TLB misses) with Java objects. DJXPerf statistically samples hardware performance monitoring counters and attributes the resulting metrics not only to source code locations but also to Java objects. It presents object allocation contexts together with their usage contexts, ordered by the severity of their poor locality. DJXPerf's performance measurement, object attribution, and presentation techniques guide the optimization of object allocation, layout, and access patterns. DJXPerf incurs only ~8% runtime overhead and ~5% memory overhead on average and requires no modifications to hardware, OS, Java virtual machine, or application source code, which makes it attractive for use in production. Guided by DJXPerf, we study and optimize a number of Java and Scala programs, including well-known benchmarks and real-world applications, and demonstrate significant speedups.
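To make the idea of object-centric attribution concrete, the following is a minimal sketch of one plausible way to map sampled memory addresses back to allocation contexts. It is not DJXPerf's implementation, which this abstract does not describe. The class `ObjectAttributionSketch`, its methods, and the numeric addresses are all hypothetical, and the samples are simulated, since plain Java code cannot read hardware counters or raw object addresses.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ObjectAttributionSketch {
    // Hypothetical record of one live object: its address range and allocation site.
    static final class Allocation {
        final long start;
        final long size;
        final String context;

        Allocation(long start, long size, String context) {
            this.start = start;
            this.size = size;
            this.context = context;
        }
    }

    // Live objects keyed by start address, so a sampled address can be
    // resolved with a floor lookup (assumes ranges do not overlap).
    private final TreeMap<Long, Allocation> liveObjects = new TreeMap<>();
    // Number of sampled events (e.g., cache misses) per allocation context.
    private final Map<String, Long> samplesByContext = new HashMap<>();

    public void recordAllocation(long start, long size, String context) {
        liveObjects.put(start, new Allocation(start, size, context));
    }

    // Attributes one sampled address to the object containing it.
    // Returns false when the address falls outside every tracked object.
    public boolean recordSample(long address) {
        Map.Entry<Long, Allocation> entry = liveObjects.floorEntry(address);
        if (entry == null) {
            return false;
        }
        Allocation a = entry.getValue();
        if (address >= a.start + a.size) {
            return false;
        }
        samplesByContext.merge(a.context, 1L, Long::sum);
        return true;
    }

    public long samplesFor(String context) {
        return samplesByContext.getOrDefault(context, 0L);
    }

    // Allocation contexts ordered from most to fewest attributed samples.
    public List<String> rankedContexts() {
        List<String> contexts = new ArrayList<>(samplesByContext.keySet());
        contexts.sort((x, y) -> Long.compare(samplesFor(y), samplesFor(x)));
        return contexts;
    }

    public static void main(String[] args) {
        ObjectAttributionSketch profiler = new ObjectAttributionSketch();
        // Simulated allocations; contexts are made-up source locations.
        profiler.recordAllocation(0x1000L, 64, "Matrix.java:42");
        profiler.recordAllocation(0x2000L, 128, "Cache.java:17");
        // Simulated sampled addresses standing in for hardware counter samples.
        long[] samples = {0x1008L, 0x2010L, 0x2020L, 0x9000L};
        for (long address : samples) {
            profiler.recordSample(address);
        }
        for (String context : profiler.rankedContexts()) {
            System.out.println(context + " samples=" + profiler.samplesFor(context));
        }
    }
}
```

The sorted map keeps each lookup logarithmic in the number of tracked objects, which matters when samples arrive frequently. A real profiler would also need to handle object moves and reclamation by the garbage collector, which this sketch ignores.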