The threat from ransomware continues to grow, both in the number of victims and in the cost incurred by the people and organisations affected by a successful attack. In the majority of cases, once a victim has been attacked only two courses of action remain open to them: pay the ransom or lose their data. One behaviour common to all crypto-ransomware strains is that at some point during their execution they will attempt to encrypt the users' files. Previous research (Penrose et al., 2013; Zhao et al., 2011) has highlighted the difficulty of differentiating between compressed and encrypted files using Shannon entropy, as both file types exhibit similar values. One of the experiments described in this paper shows a unique characteristic in the Shannon entropy of encrypted file header fragments. This characteristic was used to differentiate between encrypted files and other high-entropy files such as archives. The discovery was leveraged in the development of a file classification model that used the differential area between the entropy curve of a file under analysis and one generated from random data. When the entropy plot of a file under analysis is compared against the plot generated from a file containing purely random numbers, the more closely the two plots correlate, the higher the confidence that the file under analysis contains encrypted data.
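The differential-area idea described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the 8-byte step, the 256-byte header length, the number of random samples, and the use of the trapezoidal rule are all assumptions chosen for illustration. The sketch computes the Shannon entropy of progressively longer header fragments, builds a reference curve from random data, and measures the area between the two curves. A smaller area is read here as a closer match to random data.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((c / total) * math.log2(c / total) for c in Counter(data).values())


def entropy_curve(data: bytes, step: int = 8, max_len: int = 256) -> list[float]:
    """Entropy of the first n bytes, for n = step, 2*step, ..., max_len.

    The step and max_len values are illustrative assumptions.
    """
    return [shannon_entropy(data[:n]) for n in range(step, max_len + 1, step)]


def reference_curve(step: int = 8, max_len: int = 256, samples: int = 200) -> list[float]:
    """Mean entropy curve over several buffers of random bytes (assumed baseline)."""
    curves = [entropy_curve(os.urandom(max_len), step, max_len) for _ in range(samples)]
    return [sum(column) / samples for column in zip(*curves)]


def differential_area(curve: list[float], reference: list[float], step: int = 8) -> float:
    """Trapezoidal area between two entropy curves sampled every `step` bytes."""
    diffs = [abs(a - b) for a, b in zip(curve, reference)]
    return sum((diffs[i] + diffs[i + 1]) / 2 * step for i in range(len(diffs) - 1))


if __name__ == "__main__":
    reference = reference_curve()
    random_header = os.urandom(256)  # stand-in for an encrypted header
    text_header = (b"The quick brown fox jumps over the lazy dog. " * 10)[:256]
    print("random-like area:", differential_area(entropy_curve(random_header), reference))
    print("text-like area:  ", differential_area(entropy_curve(text_header), reference))
```

In this sketch the random-like header yields a much smaller area than the text-like header. Any decision threshold on the area would have to be calibrated on real files; none is implied here.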