#Correct way to do adversarial fine-tuning


pale star

Hi guys, I'm trying to improve the robustness of a well-trained model.

Goal
I have a relatively small set of adversarial examples (about 1000, CIFAR10), and a ResNet50 model with about 96% accuracy. The goal is to improve the model's accuracy on the given adversarial examples without sacrificing accuracy on the clean dataset.

What I did
I found that adversarial fine-tuning can serve my purpose, so I fine-tuned the model on the small adversarial dataset for about 5 epochs with a small learning rate (1e-5, SGD, 2e-5 weight decay). However, the fine-tuned model reaches only about 25% accuracy on the adversarial dataset, and its accuracy on the clean dataset drops to about 80% (possibly lower, I don't remember exactly).

Question
So what is the correct way to do adversarial fine-tuning? I think that I'm missing something in the fine-tuning. Maybe a smaller learning rate? Or a larger adversarial dataset? Any advice or link to useful resources would be appreciated. Thanks in advance.

haughty pond

Firstly, I'd suggest using Adam instead of SGD. Here's a paper detailing the benefits and costs of particular optimizers and activation functions in adversarial training.
https://academia.edu/resource/work/85546902
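
For reference, here is a minimal sketch of what the optimizer swap could look like in PyTorch. The thread never names a framework, so PyTorch is an assumption, as is the idea of shuffling clean examples in with the adversarial ones (added here only because the stated goal is to keep clean accuracy). `fine_tune_mixed` is a hypothetical helper, the learning rate and weight decay are the values from the question, and the tiny linear model and random tensors are stand-ins for the real ResNet50 and CIFAR10 data.

```python
import torch
import torch.nn as nn
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset


def fine_tune_mixed(model, adv_ds, clean_ds, epochs=5, lr=1e-5,
                    weight_decay=2e-5, batch_size=64):
    """Fine-tune on adversarial and clean examples shuffled together.

    Returns the mean training loss of each epoch.
    """
    # Assumption: mixing clean data into every epoch, so the model is not
    # trained only on the ~1000 adversarial images.
    loader = DataLoader(ConcatDataset([adv_ds, clean_ds]),
                        batch_size=batch_size, shuffle=True)
    # Adam in place of SGD, as suggested above; lr and weight decay are the
    # values from the original question.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr,
                                 weight_decay=weight_decay)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    history = []
    for _ in range(epochs):
        total, count = 0.0, 0
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            total += loss.item() * x.size(0)
            count += x.size(0)
        history.append(total / count)
    return history


# Stand-in data with CIFAR10-like shapes (random values, not real images).
torch.manual_seed(0)
adv_ds = TensorDataset(torch.rand(32, 3, 32, 32), torch.randint(0, 10, (32,)))
clean_ds = TensorDataset(torch.rand(96, 3, 32, 32), torch.randint(0, 10, (96,)))
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
print(fine_tune_mixed(toy_model, adv_ds, clean_ds, epochs=2))
```

With a real model you would pass your own datasets and track clean and adversarial accuracy separately after each epoch. The ratio of clean to adversarial examples is a knob to tune, not something this sketch settles.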

If that fails, this repository is the holy grail of adversarial ML. I used it for over 2 years to do all of my training. It has tutorials, and it hides much of the math and complexity by letting you call the different adversarial attacks as functions.
https://github.com/cleverhans-lab/cleverhans
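
To give a feel for what "attacks as functions" means, here is a hand-written sketch of a one-step FGSM attack in plain PyTorch. This is not the CleverHans API: the `fgsm` name, its signature, and the `eps=8/255` default are assumptions made for illustration, and the library's own implementation should be preferred in practice. It also assumes inputs are scaled to [0, 1].

```python
import torch
import torch.nn as nn


def fgsm(model, x, y, eps=8 / 255):
    """One-step L-infinity FGSM.

    Moves every input value by eps in the direction of the sign of the
    loss gradient, then clips back to the assumed [0, 1] pixel range.
    """
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    # Gradient of the loss with respect to the input, not the weights.
    (grad,) = torch.autograd.grad(loss, x)
    x_adv = x.detach() + eps * grad.sign()
    return x_adv.clamp(0.0, 1.0)


# Stand-in model and batch with CIFAR10-like shapes (random values).
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
y = torch.randint(0, 10, (4,))
x_adv = fgsm(toy_model, x, y)
# Largest per-pixel change; bounded by eps by construction.
print((x_adv - x).abs().max().item())
```

Having the attack as a plain function means you can regenerate adversarial examples against the current weights at every training step, instead of relying only on a fixed set produced against the original model.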

Lastly, AdvML is an uphill and losing battle. When I first tried doing adversarial research, my mentor told me it was probably one of the most difficult areas to study, because we simply don't yet know how to defeat adversarial attacks effectively. Adversarial perturbations are becoming more and more dangerous, so you're always going to see those 25%s on the adversarial dataset, and lower numbers after that. Training with things like GANs can help, but defending against attackers and being the bastions of tomorrow's technology is a difficult job where the attackers always win.


pale star