Ontologies are a standard tool for defining semantic schemata in many knowledge-intensive domains. They are becoming increasingly important in areas that were, until very recently, dominated by subsymbolic representations and machine-learning-based data processing. One such area is information security, and more specifically malware detection. We propose the PE Malware Ontology, which offers a reusable semantic schema for Portable Executable (PE, the Windows binary format) malware files. The ontology is inspired by the structure of the data in the EMBER dataset and currently covers the data intended for static malware analysis. With this proposal, we hope to achieve: (a) a unified semantic representation for PE malware datasets that are available now or will be published in the future; (b) applicability of symbolic, neural-symbolic, or otherwise explainable approaches in the PE malware domain, which may lead to improved interpretability of results, since these can now be characterized by the terms defined in the ontology; and (c) improved reproducibility of experiments, through joint publishing of semantically treated EMBER data, including fractional datasets.