Accurate and timely detection of cyber threats is critical to keeping our online economy and data safe. A key technique in early detection is the classification of unusual patterns of network behaviour, often hidden as low-frequency events within complex time-series packet flows. One way to detect such anomalies is to analyse the information entropy of the payload within individual packets, since changes in entropy can often indicate suspicious activity, such as session encryption having been compromised or a plaintext channel having been co-opted as a covert channel. To decide whether activity is anomalous, we need to compare real-time entropy values with baseline values. While the analysis of entropy in packet data is not particularly new, to the best of our knowledge there are no published baselines for payload entropy across common network services. We offer two contributions: (1) we analyse several large packet datasets to establish baseline payload information entropy values for common network services; and (2) we describe an efficient method for engineering entropy metrics when performing flow recovery from live or offline packet data, which can be expressed within feature subsets for subsequent analysis and machine learning applications.
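As an illustration of the quantity discussed above, the following is a minimal sketch of byte-level Shannon entropy for a single packet payload, together with a simple comparison against a baseline. It is not the method described in this work: the function names `payload_entropy` and `is_anomalous`, and the `baseline_mean` and `tolerance` parameters, are hypothetical and chosen only for illustration, and no baseline values from the analysed datasets are assumed.

```python
import math
from collections import Counter


def payload_entropy(payload: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (range 0.0 to 8.0)."""
    if not payload:
        # An empty payload carries no information; treat its entropy as zero.
        return 0.0
    counts = Counter(payload)  # frequency of each byte value
    total = len(payload)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def is_anomalous(entropy: float, baseline_mean: float, tolerance: float) -> bool:
    """Flag a value that deviates from a baseline by more than a tolerance.

    Both baseline_mean and tolerance are placeholders (assumptions); real
    values would have to come from measured per-service baselines.
    """
    return abs(entropy - baseline_mean) > tolerance
```

Under this definition, a payload consisting of one repeated byte has entropy 0, while a payload in which all 256 byte values occur equally often reaches the maximum of 8 bits per byte, which is the region where well-encrypted or compressed content would be expected to fall.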