Recent breakthroughs in cryptography, widely reported on in the media showed that it is possible to reassemble any given program into a mathematical jigsaw puzzle so complicated that, although it functions identically to the original, divining its purpose without running it is effectively impossible. We’ll look at the security implications of these findings.
Do the results imply that any malware program can be obfuscated to be indistinguishable from benign programs?
No.
This is because there are restrictions on this obfuscation. Specifically, the authors only show the possibility of “indistinguishability obfuscation”, which really means that if one obfuscates two different programs which perform the same function, they would still function like the non-obfuscated versions, taking the same inputs from the outside environment and outputting the same outputs, but it would be impossible to tell them apart.
More specifically, they show that given two programs A
and B
, where A'
is obfuscated version of A
and B'
is an obfuscated verion of B
, where both take in x
as input and have y
as output, we would not be able to distinguish between
A'(x) => y
and B'(x) => y
.
Here we can’t tell whether A’
is an obfuscated version of A
or B
.
This is useful because A
might be an implementation where the behavior of the program was completely transparent while B
might be an implementation written by highly skilled programmers who spent a thousand years making every effort to hide the inner workings of the program: after obfuscation, program A’
is at least as hard to analyze as program B’
. Moreover, B
represents any possible program that performs the same function as A
; actually writing B
is unnecessary.
If such an obfuscator is created for a commonly used programming language and if it is commonly used by benign software, it will make malware impossible to detect using static analysis, precisely because static analysis relies on analyzing implementation details. If someone were to obfuscate malware, it would still behave like malware, but it wouldn’t be possible to determine how the malware is implemented.
While static analysis would be fruitless, this does not imply that malware obfuscated in this way would be impossible to detect. A sandbox would still be able to detect this new program, because it can look at the program’s behavior when interacting with the outside environment, which would be malicious by definition.
So although the results are very interesting, they do not imply that any malware obfuscated in this way would be impossible to detect.