Fault localization is a fundamental aspect of debugging, aiming to identify code regions likely responsible for failures. Traditional techniques primarily correlate statement execution with failures, yet program behavior is influenced by diverse execution features-such as variable values, branch conditions, and definition-use pairs-that can provide richer diagnostic insights. In an empirical study of 310 bugs across 20 projects, we analyzed 17 execution features and assessed their correlation with failure outcomes. Our findings suggest that fault localization benefits from a broader range of execution features: (1) Scalar pairs exhibit the strongest correlation with failures; (2) Beyond line executions, def-use pairs and functions executed are key indicators for fault localization; and (3) Combining multiple features enhances effectiveness compared to relying solely on individual features. Building on these insights, we introduce a debugging approach to diagnose failure circumstances. The approach extracts fine-grained execution features and trains a decision tree to differentiate passing and failing runs. From this model, we derive a diagnosis that pinpoints faulty locations and explains the underlying causes of the failure. Our evaluation demonstrates that the generated diagnoses achieve high predictive accuracy, reinforcing their reliability. These interpretable diagnoses empower developers to efficiently debug software by providing deeper insights into failure causes.
翻译:暂无翻译