Full metadata
Title
Detecting Adversarial Examples by Measuring their Stress Response
Description
Machine learning (ML) and deep neural networks (DNNs) have achieved great success in a variety of application domains. However, despite significant effort to make these networks robust, they remain vulnerable to adversarial attacks, in which input that is perceptually indistinguishable from natural data can be erroneously classified with high prediction confidence. Work on defending against adversarial examples can be broadly classified as correcting or detecting: the former aims to negate the effects of the attack and classify the input correctly, while the latter aims to detect the input as adversarial and reject it. In this work, a new approach for detecting adversarial examples is proposed. The approach takes advantage of the robustness of natural images to noise. As noise is added to a natural image, the prediction probability of its true class drops, but the drop is not sudden or precipitous. The same does not seem to hold for adversarial examples. In other words, the stress response profile of natural images seems to differ from that of adversarial examples, so adversarial examples could be detected by their stress response profile. An evaluation of this approach for detecting adversarial examples is performed on the MNIST, CIFAR-10, and ImageNet datasets. Experimental data shows that this approach is effective at detecting some adversarial examples on small-scale images with simple content, with little sacrifice in benign accuracy.
Date Created
2019
Contributors
- Sun, Lin (Author)
- Bazzi, Rida (Thesis advisor)
- Li, Baoxin (Committee member)
- Tong, Hanghang (Committee member)
- Arizona State University (Publisher)
Topical Subject
Resource Type
Extent
64 pages
Language
eng
Copyright Statement
In Copyright
Primary Member of
Peer-reviewed
No
Open Access
No
Handle
https://hdl.handle.net/2286/R.I.55594
Level of coding
minimal
Note
Masters Thesis Computer Science 2019
System Created
- 2020-01-14 09:17:21
System Modified
- 2021-08-26 09:47:01