In the Endsley situation awareness model, the highest level is projection: predicting the status of elements in the environment in the near future. In cybersecurity situation awareness, projection for an Advanced Persistent Threat (APT) means predicting the next step of the APT. Threats are constantly changing and becoming more complex. Because supervised and unsupervised learning methods rely on APT datasets to project the next step, they cannot identify unknown APTs. In reinforcement learning, by contrast, the agent interacts with the environment, so it may be able to project the next step of both known and unknown APTs. So far, reinforcement learning has not been used to project the next step of APTs. In reinforcement learning, the agent uses previous states and actions to approximate the best action for the current state. When the number of states and actions is large, the agent employs a neural network to approximate the best action for each state, an approach known as deep reinforcement learning. In this paper, we present a deep reinforcement learning system to project the next step of APTs. Because attack steps are related to one another, we employ a Long Short-Term Memory (LSTM) network to approximate the best action for each state. Given the current situation, the proposed system projects the next steps of APTs.
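The reinforcement learning loop described above, in which an agent learns from past states and actions which next step to project, can be illustrated with a minimal sketch. It is not the system proposed in the paper: it uses tabular Q-learning in place of the LSTM-based deep network, and the stage names, the reward scheme (1 for a correct projection, 0 otherwise), and the toy environment in which an attack always advances one stage are assumptions made purely for illustration.

```python
import random

# Hypothetical, simplified attack stages (an assumption; not taken from the paper).
STAGES = [
    "reconnaissance",
    "delivery",
    "exploitation",
    "command_and_control",
    "exfiltration",
]


def train_projector(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: state = observed stage, action = projected next stage."""
    rng = random.Random(seed)
    n = len(STAGES)
    # q[s][a] estimates the value of projecting stage a while observing stage s.
    q = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        state = 0
        while state < n - 1:  # the final stage is terminal
            # Epsilon-greedy choice between exploring and exploiting.
            if rng.random() < epsilon:
                action = rng.randrange(n)
            else:
                action = max(range(n), key=lambda a: q[state][a])
            # Toy environment (assumption): the attack always advances one stage.
            true_next = state + 1
            reward = 1.0 if action == true_next else 0.0
            future = 0.0 if true_next == n - 1 else max(q[true_next])
            # Standard Q-learning update.
            q[state][action] += alpha * (reward + gamma * future - q[state][action])
            state = true_next
    return q


def project_next(q, stage):
    """Return the stage the trained agent projects to follow `stage`."""
    s = STAGES.index(stage)
    best = max(range(len(STAGES)), key=lambda a: q[s][a])
    return STAGES[best]
```

After training with `q = train_projector()`, a call such as `project_next(q, "delivery")` returns the agent's projected next stage. A table like `q` becomes impractical when states and actions are numerous, which is the situation in which the abstract turns to a neural network; an LSTM would additionally condition the projection on the sequence of earlier attack steps rather than on the current stage alone.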