Feature extraction is critical for TLS traffic analysis using machine learning techniques, but it is also difficult and time-consuming, requiring substantial engineering effort. We designed and implemented DeepTLS, a system that extracts a full spectrum of features from pcaps, covering meta, statistical, SPLT, byte distribution, TLS header, and certificate features. The backend is written in C++ for high performance and can analyze a GB-size pcap in a few minutes. DeepTLS was thoroughly evaluated against two state-of-the-art tools, Joy and Zeek, on four well-known malicious traffic datasets consisting of 160 pcaps. The evaluation results show that DeepTLS has an advantage when analyzing large pcaps, taking half the analysis time, and that it identified more certificates, with an acceptable performance loss compared with Joy. DeepTLS can significantly accelerate the machine learning pipeline by reducing feature extraction time from hours or even days to minutes. The system is online at https://deeptls.com, where test artifacts can be viewed and validated. In addition, two open source tools, Pysharkfeat and Tlsfeatmark, have been released.
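To make two of the feature families named above more concrete, the following is a minimal Python sketch of how SPLT (read here as the sequence of packet lengths and inter-arrival times) and a byte distribution could be computed for a single flow. It is an illustration only, not the DeepTLS implementation: the function names `splt` and `byte_distribution`, the `max_packets` parameter, and the input layout (a list of `(timestamp, payload_length)` tuples and a list of payload byte strings) are all assumptions made for this example.

```python
from typing import Iterable, List, Sequence, Tuple


def splt(
    packets: Sequence[Tuple[float, int]], max_packets: int = 10
) -> Tuple[List[int], List[float]]:
    """Return (lengths, inter-arrival gaps) for the first packets of a flow.

    `packets` is assumed to hold (timestamp_seconds, payload_length) tuples
    in arrival order. `max_packets` is a hypothetical cut-off, not a value
    taken from the paper.
    """
    head = list(packets[:max_packets])
    if not head:
        return [], []
    lengths = [length for _, length in head]
    # The first packet has no predecessor, so its gap is defined as 0.0.
    gaps = [0.0] + [
        curr_ts - prev_ts for (prev_ts, _), (curr_ts, _) in zip(head, head[1:])
    ]
    return lengths, gaps


def byte_distribution(payloads: Iterable[bytes]) -> List[float]:
    """Return the relative frequency of each byte value (0..255)."""
    counts = [0] * 256
    total = 0
    for payload in payloads:
        for byte in payload:
            counts[byte] += 1
        total += len(payload)
    if total == 0:
        return [0.0] * 256
    return [count / total for count in counts]
```

Both functions return fixed-shape numeric lists, so under these assumptions they could be concatenated directly into a per-flow feature vector for a downstream classifier.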