Principled Interpretability in Vision Models

From Mechanistic Understanding to Interpretable Models by Design

CVPR 2026 Tutorial

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)
📅 Date: June 3, 2026 (starts at 1 pm) · 🌍 Location: Mile High 3C, Denver, CO, USA


Overview

As deep learning systems are increasingly deployed in high-stakes applications, understanding their behavior is critical for ensuring trust and safety. Interpretability provides essential tools to explain, debug, and improve these models. However, the field remains fragmented, spanning a wide range of methods and assumptions, while lacking standardized evaluation protocols.


Tutorial Outline

  1. Post-hoc mechanistic interpretability: Methods that analyze model internals at different levels of granularity (neurons, layers, circuits), with strengths and limitations.
  2. Faithfulness and reliability evaluation: Protocols and standardized metrics for assessing interpretability methods and producing actionable explanations.
  3. Interpretable DNN models by design: Concept bottleneck models and related approaches that align internal representations with human-understandable concepts.
  4. Applications: Debugging, model editing, and safety auditing in practical settings.

Agenda

Our tutorial takes place during the afternoon session on Wednesday, June 3, from 1 to 5 pm.

Intended Audience

This tutorial is intended for researchers and practitioners working on computer vision and modern deep learning systems, as well as graduate students entering interpretability research. No prior experience in interpretability is required.

Materials

Please stay tuned for the tutorial schedule and agenda. Slides and supplementary materials will be posted here after the tutorial.


Organizers and Contact

โœ‰๏ธ Lily Weng (lweng@ucsd.edu), Tuomas Oikarinen (toikarinen@ucsd.edu), Ge Yan (geyan@ucsd.edu), Akshay Kulkarni (a2kulkarni@ucsd.edu)