Wireless networks are inherently vulnerable to a malicious user who interferes with network traffic in order to deny access to resources. To combat this threat, critical traffic is relayed using physical layer waveforms that are resilient to interference. These waveforms are designed to make it difficult for a malicious user to deny access to network resources while also avoiding detection. A malicious user with perfect knowledge of the waveform in use can both avoid detection and reduce network throughput, but in practice this knowledge is limited. This work analyzes the threat posed by a malicious user that implicitly learns the nature of the waveform simply by observing reactions to its behavior, and discusses potential mitigation techniques. The results show that, by using recurrent neural networks to implement deep Q-learning, a malicious user can converge on an optimal interference policy that simultaneously minimizes its likelihood of detection and maximizes its disruption of network traffic.
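
The trade-off between avoiding detection and disrupting traffic can be illustrated with a deliberately simplified sketch. It is not the method used in this work: it replaces recurrent deep Q-learning with tabular Q-learning over a single state, and every name and number in it (`ACTIONS`, `DETECTION_THRESHOLD`, `DETECTION_PENALTY`, the form of `reward`) is a hypothetical assumption chosen only for illustration. The learner never reads the threshold directly; it sees only the reward that follows each action.

```python
import random

# All values below are illustrative assumptions, not taken from the paper.
ACTIONS = [0, 1, 2, 3]       # hypothetical interference power levels
DETECTION_THRESHOLD = 2      # hidden from the learner; powers above it are detected
DETECTION_PENALTY = 10.0     # assumed cost of being detected


def reward(power):
    """Traffic disrupted (assumed proportional to power) minus a detection penalty."""
    disrupted = float(power)
    detected = power > DETECTION_THRESHOLD
    return disrupted - (DETECTION_PENALTY if detected else 0.0)


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Single-state tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)               # explore
        else:
            action = max(ACTIONS, key=lambda a: q[a])  # exploit current estimate
        # Standard Q-learning target; the next state is the same single state.
        target = reward(action) + gamma * max(q.values())
        q[action] += alpha * (target - q[action])
    return q


q_values = train()
best_power = max(ACTIONS, key=lambda a: q_values[a])
print("learned interference level:", best_power)
```

Under these assumed rewards, the highest power level that stays at or below the threshold yields the largest reward by construction, so the learned policy should settle there using reward feedback alone. A recurrent network becomes relevant when the reactions depend on the history of past actions, which a single-state table cannot represent.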