We formulate low-level malware detection using algorithms based on feature matching as Order-based Malware Detection with Critical Instructions (General-OMDCI): given a pattern in the form of a sequence \(M\) of colored blocks, where each block contains a critical character (representing a unique sequence of critical instructions potentially associated with malware but without certainty), and a program \(A\), represented as a sequence of \(n\) colored blocks with critical characters, the goal is to find two subsequences, \(M'\) of \(M\) and \(A'\) of \(A\), with blocks matching in color and whose critical characters form a permutation of each other. When $M$ is a permutation in both colors and critical characters the problem is called OMDCI. If we additionally require $M'=M$, then the problem is called OMDCI+; if in this case $d=|M|$ is used as a parameter, then the OMDCI+ problem is easily shown to be FPT. Our main (negative) results are on the cases when $|M|$ is arbitrary and are summarized as follows: OMDCI+ is NP-complete, which implies OMDCI is also NP-complete. For the special case of OMDCI, deciding if the optimal solution has length $0$ (i.e., deciding if no part of \(M\) appears in \(A\)) is co-NP-hard. As a result, the OMDCI problem does not admit an FPT algorithm unless P=co-NP. In summary, our results imply that using algorithms based on feature matching to identify malware or determine the absence of malware in a given low-level program are both hard.
翻译:暂无翻译