This repository contains detectors of adversarial examples.

Adversarial attacks (FGSM) were generated with foolbox 2.4.0; see attack.py (a minimal sketch follows the file list below).
adversarials_detection.ipynb demonstrates how to use the adversarial detectors.
All experiments were performed with the VGG architecture from vgg.py
(to train the model, one can use cifar10training.py).
The detector algorithms are implemented in detectors.py.
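Below is a minimal sketch of generating an FGSM adversarial with the foolbox 2.x API. The VGG("VGG16") constructor, the missing weight loading, and the input image are placeholder assumptions for illustration; the repository's actual attack code is in attack.py.

```python
# Hedged sketch of FGSM generation with foolbox 2.4.0; model setup and
# input are placeholders, see attack.py for the real code.
import foolbox
import numpy as np

from vgg import VGG  # architecture from this repo

model = VGG("VGG16")  # assumed constructor, see vgg.py
model.eval()          # in practice, load weights trained with cifar10training.py

# wrap the PyTorch model with the foolbox 2.x API
fmodel = foolbox.models.PyTorchModel(model, bounds=(0, 1), num_classes=10)

attack = foolbox.attacks.FGSM(fmodel)

image = np.random.uniform(0, 1, (3, 32, 32)).astype(np.float32)  # placeholder CIFAR-10 image
label = 3                                                        # placeholder true class

# returns the perturbed image as a numpy array, or None if the attack fails
adversarial = attack(image, label)
```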
Original image:
The same image with small perturbations:
Softmax output of the NN for the pictures above:

As we can see, the "probability" of the true class is still significant for the perturbed image. This can be used to train a binary classifier (0 - real, 1 - adversarial) with the softmax outputs as features.
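As a sketch of this idea, one could fit a logistic regression detector on the softmax vectors. The Dirichlet samples below are placeholders standing in for softmax outputs collected from the network on clean and FGSM-perturbed images; the repository's actual detectors are in detectors.py.

```python
# Sketch of a binary detector (0 - real, 1 - adversarial) trained on
# softmax outputs; the data here is synthetic placeholder input.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
clean_softmax = rng.dirichlet(np.ones(10) * 0.2, size=1000)  # placeholder "real" features
adv_softmax = rng.dirichlet(np.ones(10), size=1000)          # placeholder "adversarial" features

X = np.vstack([clean_softmax, adv_softmax])
y = np.concatenate([np.zeros(len(clean_softmax)),  # 0 - real
                    np.ones(len(adv_softmax))])    # 1 - adversarial

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

detector = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("detector accuracy:", detector.score(X_test, y_test))
```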