This repository contains the code for "Improving the Transferability of Adversarial Examples with Diverse Gradients".
We propose a Diverse Gradient Method (DGM) to craft transferable adversarial examples.
This code is implemented in PyTorch, and we have tested the code under the following environment settings:
- python = 3.6.2
- torch = 1.5.0
- torchvision = 0.6.0
- advertorch = 0.2.3
- pretrainedmodels = 0.7.4
Additionally, we reproduce the DI, TI, MI, and PI attacks in PyTorch in attack_method.py.
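Among the reproduced baselines, MI accumulates an L1-normalized gradient into a momentum buffer before each sign step. Below is a minimal scalar sketch of that update rule under those standard definitions; the actual attack_method.py operates on image tensors and model gradients, and the toy loss here is only illustrative.

```python
def mi_fgsm(x0, grad_fn, mu=0.9, alpha=0.01, steps=400):
    """Toy MI-style loop on a scalar input: the gradient is
    L1-normalized, accumulated into a momentum buffer g, and the
    input moves by alpha in the direction sign(g)."""
    x, g = x0, 0.0
    for _ in range(steps):
        grad = grad_fn(x)                        # gradient of the loss at x
        g = mu * g + grad / (abs(grad) + 1e-12)  # momentum on normalized grad
        x += alpha * (1 if g > 0 else -1 if g < 0 else 0)
    return x

# Toy loss L(x) = -(x - 2)**2 is maximized at x = 2; its gradient is -2*(x - 2).
x_adv = mi_fgsm(0.0, lambda x: -2.0 * (x - 2.0))
```

With momentum the iterate overshoots the maximizer and oscillates around it, which is the smoothing effect MI relies on to escape poor local directions.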
1. Download the dataset from Dataset and extract the images to ./SubImageNetVal/.
2. Train the derived model by knowledge distillation with diverse gradient information. For ResNet-152 as the source model:

   python train_kd.py --save_root './result' --img_root './SubImageNetVal/' --T 20 --note 'T20_resnet152' --arch 'resnet152'

   Alternatively, you can directly download our trained weight files from derived model weight files.
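The --T flag sets the distillation temperature. Assuming the standard Hinton-style distillation term (the exact loss in train_kd.py may combine it with other terms), a minimal sketch:

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax: higher T flattens the distribution.
    m = max(l / T for l in logits)
    exps = [math.exp(l / T - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kd_loss(student_logits, teacher_logits, T=20.0):
    """Hinton-style distillation term: T^2 * KL(teacher_T || student_T).
    The T^2 factor keeps gradient magnitudes comparable across temperatures."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return T * T * sum(pt * (math.log(pt) - math.log(ps))
                       for pt, ps in zip(p_t, p_s))
```

The loss is zero when student and teacher logits agree and positive otherwise, so the student is pulled toward the teacher's softened output distribution.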
3. Generate adversarial examples and save them to ./adv_path/. For ResNet-152 as the source model:

   python attack_distillation.py --input_dir './SubImageNetVal/' --output_dir './adv_path/' --attack_method 'pgd' --ensemble 1 --snet_dir './result/T20_resnet152/checkpoint.pth.tar'

   --snet_dir is the path to the weight file; you can download it directly or train it in step 2.
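With --attack_method 'pgd', each step is a sign-gradient ascent on the loss followed by projection back into the L-infinity ball around the clean input. A minimal scalar sketch of that loop follows; the repository's implementation works on image batches, and the toy gradient function here is an illustrative assumption.

```python
def pgd_attack(x0, grad_fn, eps=0.03, alpha=0.01, steps=10):
    """Projected gradient ascent in an L-infinity ball:
    take a sign step on the loss gradient, then project back
    into [x0 - eps, x0 + eps] and the valid input range [0, 1]."""
    x = x0
    for _ in range(steps):
        g = grad_fn(x)
        x = x + alpha * (1 if g > 0 else -1 if g < 0 else 0)
        x = max(x0 - eps, min(x0 + eps, x))  # project into the eps-ball
        x = max(0.0, min(1.0, x))            # keep a valid pixel value
    return x

# Toy: a loss whose gradient is always positive pushes x upward until
# the eps-ball projection stops it at x0 + eps.
x_adv = pgd_attack(0.5, lambda x: 1.0, eps=0.03, alpha=0.01, steps=10)
```

The projection is what distinguishes PGD from plain iterative FGSM: no matter how many steps run, the perturbation stays within the eps budget.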
4. Evaluate the transferability of the generated adversarial examples in ./adv_path/:

   AdvPath="./adv_path/" bash evaluate.sh
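For an untargeted attack, transferability against a target model reduces to the fraction of adversarial examples that the model misclassifies. A small sketch of that metric (the helper name is hypothetical, not from evaluate.sh):

```python
def attack_success_rate(true_labels, adv_preds):
    """Untargeted attack success rate on a target model:
    the fraction of adversarial examples whose predicted label
    differs from the ground-truth label."""
    wrong = sum(1 for y, p in zip(true_labels, adv_preds) if y != p)
    return wrong / len(true_labels)
```

A higher rate on a model that was not used to craft the examples indicates better transferability.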
All pretrained models in our paper can be found online:
- For undefended models, we use the pretrained models from pretrainedmodels;
- For defended models, which are trained by Ensemble Adversarial Training [1], pretrained weights are available in the TensorFlow version or in the PyTorch version.
[1] Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, Patrick McDaniel. Ensemble Adversarial Training: Attacks and Defenses. In ICLR, 2018.