TY - GEN
T1 - Adversarial Audio Attacks that Evade Temporal Dependency
AU - Liu, Heng
AU - Ditzler, Gregory
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/12/1
Y1 - 2020/12/1
AB - As real-world applications (image segmentation, speech recognition, machine translation, etc.) increasingly adopt Deep Neural Networks (DNNs), the vulnerabilities of DNNs in malicious environments have become an important research topic. Adversarial machine learning (AML) focuses on exploring such vulnerabilities and on defensive techniques for machine learning models. Recent work has shown that most adversarial audio generation methods fail to account for the temporal dependency (TD) of audio (i.e., adversarial audio exhibits weaker TD than benign audio); as a result, adversarial audio is easily detected by examining its TD. One area of interest in the audio AML community is therefore to develop an attack that evades TD-based detection. In this contribution, we revisit the LSTM model for audio transcription and propose a new audio attack algorithm that evades TD-based detection by explicitly controlling the TD of the generated adversarial audio. Experimental results show that the detectability of our adversarial audio is significantly reduced compared to state-of-the-art audio attack algorithms. Experiments also show that our adversarial audio remains nearly indistinguishable from benign audio, with only a negligible perturbation magnitude.
UR - http://www.scopus.com/inward/record.url?scp=85099703016&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85099703016&partnerID=8YFLogxK
U2 - 10.1109/SSCI47803.2020.9308597
DO - 10.1109/SSCI47803.2020.9308597
M3 - Conference contribution
AN - SCOPUS:85099703016
T3 - 2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020
SP - 639
EP - 646
BT - 2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020
Y2 - 1 December 2020 through 4 December 2020
ER -