Multi-source logs provide a comprehensive overview of ongoing system activities, allowing for in-depth analysis to detect potential threats. A practical approach for threat detection involves explicit extraction of entity triples (subject, action, object) towards building provenance graphs to facilitate the analysis of system behavior. However, current log parsing methods mainly focus on retrieving parameters and events from raw logs while approaches based on entity extraction are limited to processing a single type of log. To address these gaps, we contribute with a novel unified framework, coined UTLParser. UTLParser adopts semantic analysis to construct causal graphs by merging multiple sub-graphs from individual log sources in labeled log dataset. It leverages domain knowledge in threat hunting such as Points of Interest. We further explore log generation delays and provide interfaces for optimized temporal graph querying. Our experiments showcase that UTLParser overcomes drawbacks of other log parsing methods. Furthermore, UTLParser precisely extracts explicit causal threat information while being compatible with enormous downstream tasks.
翻译:暂无翻译