Diffusion based Counterfactual Attacks

Overview

Counterfactual explanations (CEs) aim to change a model's output label with minimal modification to the input. While adversarial attacks pursue label changes regardless of the nature of the perturbation, CEs constrain their modifications to remain on the data manifold. To this end, current methods use diffusion processes ([1], [2]) for direct optimization, which, although effective, requires significant memory and computational resources for back-propagation.
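A minimal sketch of the direct-optimization setup described above, illustrating why back-propagation through the diffusion process is costly: gradients must flow through the denoiser at every optimization step. The denoiser and classifier here are hypothetical toy stand-ins, not the architectures used in [1] or [2].

```python
import torch
import torch.nn as nn

# Toy stand-ins (assumptions): a small denoiser playing the role of a
# diffusion model's denoising step, and a classifier whose label we flip.
denoiser = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(16, 3, 3, padding=1))
classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))

x = torch.rand(1, 3, 8, 8)   # input image
target = torch.tensor([1])   # desired counterfactual label

delta = torch.zeros_like(x, requires_grad=True)
opt = torch.optim.Adam([delta], lr=0.05)

for _ in range(50):
    # Route the perturbed image through the denoiser so the counterfactual
    # stays close to the data manifold; the backward pass must traverse the
    # denoiser's graph, which is the memory/compute bottleneck noted above.
    x_denoised = denoiser(x + delta)
    loss = nn.functional.cross_entropy(classifier(x_denoised), target)
    loss = loss + 0.1 * delta.norm()   # keep the modification minimal
    opt.zero_grad()
    loss.backward()
    opt.step()

pred = classifier(denoiser(x + delta)).argmax(1)
```

In a real diffusion-based method the denoiser is applied over many timesteps, so the stored activations grow with the number of steps, which is the motivation for exploring alternatives.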

Objectives

  • Exploring alternative applications of diffusion models in the counterfactual generation process
  • Studying the limitations of [1] and the use of the adversarial attack framework
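For contrast with the on-manifold constraint of CEs, a minimal sketch of the unconstrained adversarial attack framework (a standard PGD attack; the classifier is a hypothetical placeholder, and `eps`, `alpha`, and `steps` are illustrative values, not ones from [1]):

```python
import torch
import torch.nn as nn

# Hypothetical classifier to attack; any differentiable model works.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))

def pgd_attack(model, x, label, eps=0.03, alpha=0.01, steps=10):
    """Projected gradient descent: maximize the loss of the true label
    under an L-infinity budget, with no on-manifold constraint."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), label)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()               # ascend the loss
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project
            x_adv = x_adv.clamp(0, 1)                         # valid pixels
    return x_adv.detach()

x = torch.rand(2, 3, 8, 8)
label = torch.randint(0, 10, (2,))
x_adv = pgd_attack(model, x, label)
```

The attack only bounds the perturbation's magnitude, not its realism; the resulting image may leave the data manifold entirely, which is exactly what CEs rule out.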

Prerequisites

  • Python + PyTorch proficiency
  • Experience with Kubernetes + Docker
  • Knowledge of diffusion methods for 2D image generation

Contact

References

[1] Jeanneret, G., Simon, L., & Jurie, F. (2023). Adversarial Counterfactual Visual Explanations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
[2] Jeanneret, G., Simon, L., & Jurie, F. (2022). Diffusion Models for Counterfactual Explanations. In Proceedings of the Asian Conference on Computer Vision (ACCV).