TY - JOUR
T1 - Learning Contextualized Action Representations in Sequential Decision Making for Adversarial Malware Optimization
AU - Ebrahimi, Reza
AU - Pacheco, Jason
AU - Hu, James
AU - Chen, Hsinchun
N1 - Publisher Copyright:
© 2004-2012 IEEE.
PY - 2024
Y1 - 2024
N2 - Deep learning (DL)-based malware detectors have shown promise in swiftly detecting unseen malware without expensive dynamic malware behavior analysis. These detectors have been shown to be susceptible to adversarial malware variants, which are generated by meticulously modifying known malware to mislead detectors into recognizing it as benign. The ability of defenders to automatically generate optimized, functional adversarial malware variants is crucial to effective cyber defense and to staying ahead of the adversary. Current adversarial malware example generation methods often assume threat models with any of the following four restrictions: (1) requiring access to insider knowledge about the malware detector, (2) allowing adversarial modifications of unlimited size, (3) allowing an unlimited number of queries to the malware detector, and (4) relying on dynamic analysis of malware behavior in a sandbox. Drawing on Actor-Critic Reinforcement Learning (RL), we propose a novel black-box binary manipulation method for adversarial malware optimization, named Actor-Critic with Contextualized Action Representations (AC-CAR), to generate malware variants without these restrictions. AC-CAR leverages two novel components: a contextualized policy and a neural language model-based, RL-augmented top-k sampling method. Unlike current methods, AC-CAR can utilize tens of thousands of actions to augment malware executables for evading DL-based malware detectors. AC-CAR yields an approximately 2-fold performance increase over current methods on average, while reducing the payload size by a factor of 20 relative to leading methods. We show that using the malware variants generated by AC-CAR in an adversarial re-training procedure improves malware detectors' robustness against adversarial variants by 29.65% on average.
KW - actor-critic reinforcement learning
KW - adversarial malware example generation
KW - contextualized action representations
UR - http://www.scopus.com/inward/record.url?scp=85207127535&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85207127535&partnerID=8YFLogxK
U2 - 10.1109/TDSC.2024.3477272
DO - 10.1109/TDSC.2024.3477272
M3 - Article
AN - SCOPUS:85207127535
SN - 1545-5971
JO - IEEE Transactions on Dependable and Secure Computing
JF - IEEE Transactions on Dependable and Secure Computing
ER -