Predicting the resilience of obfuscated code against symbolic execution attacks via machine learning

Sebastian Banescu, Christian Collberg, Alexander Pretschner

Research output: Chapter in Book/Report/Conference proceedingConference contribution

49 Scopus citations

Abstract

Software obfuscation transforms code such that it is more difficult to reverse engineer. However, it is known that given enough resources, an attacker will successfully reverse engineer an obfuscated program. Therefore, an open challenge for software obfuscation is estimating the time an obfuscated program is able to withstand a given reverse engineering attack. This paper proposes a general framework for choosing the most relevant software features to estimate the effort of automated attacks. Our framework uses these software features to build regression models that can predict the resilience of different software protection transformations against automated attacks. To evaluate the effectiveness of our approach, we instantiate it in a case-study about predicting the time needed to deobfuscate a set of C programs, using an attack based on symbolic execution. To train regression models our system requires a large set of programs as input. We have therefore implemented a code generator that can generate large numbers of arbitrarily complex random C functions. Our results show that features such as the number of community structures in the graph-representation of symbolic path-constraints, are far more relevant for predicting deobfuscation time than other features generally used to measure the potency of control-flow obfuscation (e.g. cyclomatic complexity). Our best model is able to predict the number of seconds of symbolic execution-based deobfuscation attacks with over 90% accuracy for 80% of the programs in our dataset, which also includes several realistic hash functions.

Original languageEnglish (US)
Title of host publicationProceedings of the 26th USENIX Security Symposium
PublisherUSENIX Association
Pages661-678
Number of pages18
ISBN (Electronic)9781931971409
StatePublished - 2017
Event26th USENIX Security Symposium - Vancouver, Canada
Duration: Aug 16 2017Aug 18 2017

Publication series

NameProceedings of the 26th USENIX Security Symposium

Conference

Conference26th USENIX Security Symposium
Country/TerritoryCanada
CityVancouver
Period8/16/178/18/17

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Safety, Risk, Reliability and Quality

Fingerprint

Dive into the research topics of 'Predicting the resilience of obfuscated code against symbolic execution attacks via machine learning'. Together they form a unique fingerprint.

Cite this