TY - GEN
T1 - Multi-view Representation Learning from Malware to Defend Against Adversarial Variants
AU - Hu, James Lee
AU - Ebrahimi, Mohammadreza
AU - Li, Weifeng
AU - Li, Xin
AU - Chen, Hsinchun
N1 - Funding Information:
This material is based upon work supported by the National Science Foundation (NSF) under the Secure and Trustworthy Cyberspace (1936370), Cybersecurity Innovation for Cyberinfrastructure (1917117), and Cybersecurity Scholarship-for-Service (1921485) programs.
Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Deep learning-based adversarial malware detectors have yielded promising results in detecting never-before-seen malware executables without relying on expensive dynamic behavior analysis and sandboxing. Despite their abilities, these detectors have been shown to be vulnerable to adversarial malware variants: meticulously modified, functionality-preserving versions of original malware executables generated by machine learning. Due to the nature of these adversarial modifications, the methods that generate them often use a single view of malware executables (i.e., the binary/hexadecimal view). This provides an opportunity for defenders (i.e., malware detectors) to detect the adversarial variants by utilizing more than one view of a malware file (e.g., the source code view in addition to the binary view). The rationale behind this idea is that while the adversary focuses on the binary view, certain characteristics of the malware file in the source code view remain untouched, which makes it possible to detect the adversarial malware variants. To capitalize on this opportunity, we propose Adversarially Robust Multiview Malware Defense (ARMD), a novel multi-view learning framework to improve the robustness of DL-based malware detectors against adversarial variants. Our experiments on three renowned open-source deep learning-based malware detectors across six common malware categories show that ARMD is able to improve the adversarial robustness of these detectors by up to seven times.
KW - Adversarial Machine Learning
KW - Adversarial Malware Variants
KW - Adversarial Robustness
KW - Deep Learning-based Malware Detectors
KW - Multi-View Learning
UR - http://www.scopus.com/inward/record.url?scp=85148453933&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85148453933&partnerID=8YFLogxK
U2 - 10.1109/ICDMW58026.2022.00066
DO - 10.1109/ICDMW58026.2022.00066
M3 - Conference contribution
AN - SCOPUS:85148453933
T3 - IEEE International Conference on Data Mining Workshops, ICDMW
SP - 451
EP - 458
BT - Proceedings - 22nd IEEE International Conference on Data Mining Workshops, ICDMW 2022
A2 - Candan, K. Selcuk
A2 - Dinh, Thang N.
A2 - Thai, My T.
A2 - Washio, Takashi
PB - IEEE Computer Society
T2 - 22nd IEEE International Conference on Data Mining Workshops, ICDMW 2022
Y2 - 28 November 2022 through 1 December 2022
ER -