TY - GEN
T1 - Single-Shot Black-Box Adversarial Attacks Against Malware Detectors
T2 - 19th Annual IEEE International Conference on Intelligence and Security Informatics, ISI 2021
AU - Hu, James Lee
AU - Ebrahimi, Mohammadreza
AU - Chen, Hsinchun
N1 - Funding Information:
Acknowledgments: This material is based upon work supported by the National Science Foundation (NSF) under Secure and Trustworthy Cyberspace (1936370), Cybersecurity Innovation for Cyberinfrastructure (1917117), and Cybersecurity Scholarship-for-Service (1921485) programs.
Publisher Copyright:
© 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Deep Learning (DL)-based malware detectors are increasingly adopted for early detection of malicious behavior in cybersecurity. However, their sensitivity to adversarial malware variants has raised immense security concerns. Generating such adversarial variants by the defender is crucial to improving the resistance of DL-based malware detectors against them. This necessity has given rise to an emerging stream of machine learning research, Adversarial Malware example Generation (AMG), which aims to generate evasive adversarial malware variants that preserve the malicious functionality of a given malware sample. Within AMG research, black-box methods have gained more attention than white-box methods. However, most black-box AMG methods require numerous interactions with the malware detectors to generate adversarial malware examples. Given that most malware detectors enforce a query limit, this could result in generating non-realistic adversarial examples that are likely to be detected in practice due to lack of stealth. In this study, we show that a novel DL-based causal language model enables single-shot evasion (i.e., with only one query to the malware detector) by treating the content of the malware executable as a byte sequence and training a Generative Pre-Trained Transformer (GPT). Our proposed method, MalGPT, significantly outperformed the leading benchmark methods on a real-world malware dataset obtained from VirusTotal, achieving an evasion rate of over 24.51%. MalGPT enables cybersecurity researchers to develop advanced defense capabilities by emulating large-scale realistic AMG.
AB - Deep Learning (DL)-based malware detectors are increasingly adopted for early detection of malicious behavior in cybersecurity. However, their sensitivity to adversarial malware variants has raised immense security concerns. Generating such adversarial variants by the defender is crucial to improving the resistance of DL-based malware detectors against them. This necessity has given rise to an emerging stream of machine learning research, Adversarial Malware example Generation (AMG), which aims to generate evasive adversarial malware variants that preserve the malicious functionality of a given malware sample. Within AMG research, black-box methods have gained more attention than white-box methods. However, most black-box AMG methods require numerous interactions with the malware detectors to generate adversarial malware examples. Given that most malware detectors enforce a query limit, this could result in generating non-realistic adversarial examples that are likely to be detected in practice due to lack of stealth. In this study, we show that a novel DL-based causal language model enables single-shot evasion (i.e., with only one query to the malware detector) by treating the content of the malware executable as a byte sequence and training a Generative Pre-Trained Transformer (GPT). Our proposed method, MalGPT, significantly outperformed the leading benchmark methods on a real-world malware dataset obtained from VirusTotal, achieving an evasion rate of over 24.51%. MalGPT enables cybersecurity researchers to develop advanced defense capabilities by emulating large-scale realistic AMG.
KW - Adversarial malware variants
KW - deep learning-based language models
KW - generative pre-trained transformers
KW - single-shot black-box evasion
UR - http://www.scopus.com/inward/record.url?scp=85123475318&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85123475318&partnerID=8YFLogxK
U2 - 10.1109/ISI53945.2021.9624787
DO - 10.1109/ISI53945.2021.9624787
M3 - Conference contribution
AN - SCOPUS:85123475318
T3 - Proceedings - 2021 IEEE International Conference on Intelligence and Security Informatics, ISI 2021
BT - Proceedings - 2021 IEEE International Conference on Intelligence and Security Informatics, ISI 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 2 November 2021 through 3 November 2021
ER -