Detection of malicious behavior in a large network is a challenging problem for machine learning in computer security, since it requires a model with high expressive power and scalable inference. Existing solutions struggle to achieve this feat -- current cybersec-tailored approaches are still limited in expressivity, and methods successful in other domains do not scale well for large volumes of data, rendering frequent retraining impossible. This work proposes a new perspective for learning from graph data that is modeling network entity interactions as a large heterogeneous graph. High expressivity of the method is achieved with neural network architecture HMILnet that naturally models this type of data and provides theoretical guarantees. The scalability is achieved by pursuing local graph inference, i.e., classifying individual vertices and their neighborhood as independent samples. Our experiments exhibit improvement over the state-of-the-art Probabilistic Threat Propagation (PTP) algorithm, show a further threefold accuracy improvement when additional data is used, which is not possible with the PTP algorithm, and demonstrate the generalization capabilities of the method to new, previously unseen entities.
翻译:暂无翻译