Experiments
We compare IG-Defense with existing test-time adversarial defenses such as HedgeDefense (HD) [5], SODEF [6], and
CAAA [7] under image-wise worst-case (IW-WC) adaptive attacks [8],
including AutoAttack, RayS, and their transfer and expectation-over-transformation (EoT) variants. Please refer to Sec. 5 of our paper for complete details and more results.
Overall, we find that our proposed IG-Defense consistently improves image-wise worst-case (IW-WC) robust accuracy, unlike existing test-time defenses.
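To make the IW-WC metric concrete, here is a minimal sketch in plain Python. It assumes the usual reading of "image-wise worst-case": an image counts as robust only if the model still classifies it correctly under every attack in the evaluated set. The function name `iw_wc_robust_accuracy`, the attack labels, and the toy outcomes are all hypothetical illustrations, not code or results from the paper.

```python
from typing import Dict, List


def iw_wc_robust_accuracy(correct_by_attack: Dict[str, List[bool]]) -> float:
    """Fraction of images classified correctly under *every* attack.

    correct_by_attack maps an attack name to a list of per-image flags,
    where True means the model was still correct on that image after
    the attack. All lists must cover the same images in the same order.
    """
    if not correct_by_attack:
        raise ValueError("need at least one attack")
    per_attack = list(correct_by_attack.values())
    n_images = len(per_attack[0])
    if n_images == 0 or any(len(flags) != n_images for flags in per_attack):
        raise ValueError("every attack needs the same non-zero number of images")
    # zip(*per_attack) yields, for each image, its outcomes across all attacks;
    # the image is robust only if all of those outcomes are True.
    robust = [all(outcomes) for outcomes in zip(*per_attack)]
    return sum(robust) / n_images


# Made-up outcomes for 4 images (NOT results from the paper).
toy = {
    "autoattack": [True, True, False, True],
    "rays": [True, False, False, True],
    "transfer": [True, True, True, False],
}

per_attack_acc = {name: sum(flags) / len(flags) for name, flags in toy.items()}
iw_wc = iw_wc_robust_accuracy(toy)
print(per_attack_acc)
print(iw_wc)
```

In this toy case the weakest single-attack accuracy is 0.5, while the IW-WC value is 0.25, because different attacks break different images. This is why IW-WC is a stricter measure than reporting the worst attack on its own.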
Table 1. Comparison of our IG-Defense with existing test-time adversarial defenses. The number in green/red indicates gain/drop in image-wise worst-case robust accuracy compared to the base model without any test-time defense.
Cite this work
A. Kulkarni and T.-W. Weng,
Interpretability-Guided Test-Time Adversarial Defense, ECCV 2024.
@inproceedings{kulkarni2024igdefense,
  title={Interpretability-Guided Test-Time Adversarial Defense},
  author={Kulkarni, Akshay and Weng, Tsui-Wei},
  booktitle={European Conference on Computer Vision},
  year={2024}
}