Principled Interpretability in Vision Models

From Mechanistic Understanding to Interpretable Models by Design

CVPR 2026 Tutorial

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2026)
📅 Date: June 3, 2026 (starts at 1 pm) · 🌍 Location: Mile High 3C, Denver, CO, USA

Overview

As deep learning systems are increasingly deployed in high-stakes applications, understanding their behavior is critical for ensuring trust and safety. Interpretability provides essential tools to explain, debug, and improve these models. However, the field remains fragmented: it spans a wide range of methods and assumptions, and it lacks standardized evaluation protocols.

This tutorial aims to provide a unified overview of interpretability in deep learning, bridging post-hoc mechanistic understanding and the design of inherently interpretable deep learning models.
By the end of this tutorial, attendees will have a solid understanding of modern interpretability methods for deep learning models, of how to evaluate them rigorously, and of open research directions in this critical area.

Tutorial Outline

Post-hoc mechanistic interpretability: Methods that analyze model internals at different levels of granularity (neurons, layers, circuits), with strengths and limitations.
Faithfulness and reliability evaluation: Protocols and standardized metrics for assessing interpretability methods and producing actionable explanations.
Interpretable DNN models by design: Concept bottleneck models and related approaches that align internal representations with human-understandable concepts.
Applications: Debugging, model editing, and safety auditing in practical settings.

Agenda (New!)

Our tutorial takes place in the afternoon session on Wednesday, June 3, from 1 to 5 pm:

Part 1: Introduction & Background
Part 2: Post-hoc model-level interpretability
Part 3: Faithfulness and reliability evaluation
Part 4: Interpretable DNN models by design
Part 5: Applications, Demos, and Technical Q&A

Intended Audience

This tutorial is intended for researchers and practitioners working on computer vision and modern deep learning systems, as well as graduate students entering interpretability research. No prior experience in interpretability is required.

Materials

Please stay tuned for updates to the tutorial schedule and agenda. Slides and supplementary materials will be posted here after the tutorial.

Organizers and Contact

✉️ Lily Weng (lweng@ucsd.edu), Tuomas Oikarinen (toikarinen@ucsd.edu), Ge Yan (geyan@ucsd.edu), Akshay Kulkarni (a2kulkarni@ucsd.edu)