Learning Contextualized Action Representations in Sequential Decision Making for Adversarial Malware Optimization

Reza Ebrahimi, Jason Pacheco, James Hu, Hsinchun Chen

Research output: Contribution to journal › Article › peer-review

Abstract

Deep learning (DL)-based malware detectors have shown promise in swiftly detecting unseen malware without expensive dynamic malware behavior analysis. However, these detectors have been shown to be susceptible to adversarial malware variants, which are generated by meticulously modifying known malware to mislead detectors into classifying it as benign. The ability of defenders to automatically generate optimized, functional adversarial malware variants is crucial to effective cyber defense and to staying ahead of the adversary. Current adversarial malware example generation methods often assume threat models with one or more of the following four restrictions: (1) requiring access to insider knowledge about malware detectors, (2) allowing an unlimited size of adversarial modifications, (3) allowing an unlimited number of queries to the malware detector, and (4) relying on dynamic analysis of malware behavior in a sandbox. Drawing on Actor-Critic Reinforcement Learning (RL), we propose a novel black-box binary manipulation method for adversarial malware optimization, named Actor-Critic with Contextualized Action Representations (AC-CAR), to generate malware variants without these restrictions. AC-CAR leverages two novel components: a contextualized policy and a neural language model-based, RL-augmented top-k sampling method. Unlike current methods, AC-CAR can utilize tens of thousands of actions to augment malware executables for evading DL-based malware detectors. AC-CAR yields an approximately 2-fold performance increase over current methods on average, while producing a payload 20 times smaller than that of leading methods. We show that using the malware variants generated by AC-CAR in an adversarial re-training procedure improves malware detectors' robustness against adversarial variants by 29.65% on average.
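
The abstract names top-k sampling over a very large action space but does not describe how AC-CAR implements it. As a point of reference only, the sketch below shows generic top-k sampling: keep the k highest-scoring candidates, renormalize their scores with a softmax, and draw one. It is not the authors' method; the function name `top_k_sample`, the example scores, and the use of plain integer indices as "actions" are all assumptions made for illustration.

```python
import math
import random


def top_k_sample(scores, k, rng=None):
    """Sample one index from the k highest-scoring entries of `scores`.

    Generic top-k sampling, shown for illustration; not AC-CAR itself.
    """
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    if rng is None:
        rng = random.Random()
    # Indices of the k largest scores, highest first.
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax over only the retained scores (subtract the max for stability).
    peak = scores[top[0]]
    weights = [math.exp(scores[i] - peak) for i in top]
    # Draw one index in proportion to its renormalized weight.
    return rng.choices(top, weights=weights, k=1)[0]


# Hypothetical scores for six candidate actions; only the top 3 can be drawn.
scores = [0.1, 2.0, -1.0, 1.5, 0.3, 1.9]
picked = top_k_sample(scores, k=3, rng=random.Random(0))
```

Restricting the draw to the top k candidates keeps some randomness while ruling out low-scoring choices, which is one common reason to prefer it over sampling from the full distribution when the candidate set is large.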

Original language: English (US)
Journal: IEEE Transactions on Dependable and Secure Computing
State: Accepted/In press - 2024

Keywords

  • actor-critic reinforcement learning
  • adversarial malware example generation
  • contextualized action representations

ASJC Scopus subject areas

  • General Computer Science
  • Electrical and Electronic Engineering
