First author Srishti Gautam (middle), and two of her co-authors, Ahcene Boubekki (left) and Stine Hansen (right). Photo: UiT


New Visual Intelligence paper accepted to NeurIPS

September 23, 2022

With Srishti Gautam as first author, and under the leadership of Michael Kampffmeyer (both UiT), Visual Intelligence is pleased to have a new deep learning paper accepted at NeurIPS 2022.


Widely used deep learning models have a major drawback: they behave as black boxes. For example, a deep learning network trained to classify natural images cannot explain why it thinks an image shows a dog.

The black-box nature of these high-accuracy models is a roadblock in safety-critical domains such as healthcare, law, and autonomous driving. This is where the field of Explainable AI (XAI) steps in, developing transparent models with the inherent capability to explain their decision-making process while the decisions are being made.

Advancing this field, we propose ProtoVAE, a fully transparent deep learning model. ProtoVAE is capable of generating trustworthy explanations for its decisions while, unlike previous methods, not degrading prediction accuracy. We envision that ProtoVAE has the potential to become an important tool for researchers in a variety of safety-critical applications in the years to come.

“We provide a categorization of existing self-explaining approaches with a set of properties, namely transparency, diversity and trustworthiness, that these need to adhere to. We design a novel probabilistic self-explaining model and demonstrate that it fulfils the said properties while achieving on-par performance with black-box models,”

says Srishti Gautam.

This work was conducted in collaboration with Prof. Marina Höhne at the Technical University of Berlin and the BIFOLD institute.


ProtoVAE: A Trustworthy Self-Explainable Prototypical Variational Model

September 15, 2022

Srishti Gautam, Ahcene Boubekki, Stine Hansen, Suaiba Salahuddin, Robert Jenssen, Marina Höhne, Michael Kampffmeyer

Paper abstract

The need for interpretable models has fostered the development of self-explainable classifiers. Prior approaches are either based on multi-stage optimization schemes, impacting the predictive performance of the model, or produce explanations that are not transparent, trustworthy or do not capture the diversity of the data. To address these shortcomings, we propose ProtoVAE, a variational autoencoder-based framework that learns class-specific prototypes in an end-to-end manner and enforces trustworthiness and diversity by regularizing the representation space and introducing an orthonormality constraint. Finally, the model is designed to be transparent by directly incorporating the prototypes into the decision process. Extensive comparisons with previous self-explainable approaches demonstrate the superiority of ProtoVAE, highlighting its ability to generate trustworthy and diverse explanations, while not degrading predictive performance.
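To give a flavour of the mechanism the abstract describes, the sketch below illustrates the two ideas most central to ProtoVAE: class scores computed directly from similarities to class-specific prototypes in a latent space, and an orthonormality penalty that pushes each class's prototypes apart to encourage diversity. This is a minimal illustrative sketch, not the authors' implementation; all sizes (3 classes, 2 prototypes per class, latent dimension 4) and the max-over-prototypes aggregation are assumptions for illustration, and the VAE encoder is replaced by a random latent vector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_classes, protos_per_class, latent_dim = 3, 2, 4
# Learnable class-specific prototypes living in the latent space.
prototypes = rng.normal(size=(n_classes * protos_per_class, latent_dim))

def classify(z):
    """Score each class by similarity of latent z to its prototypes."""
    # Similarity = negative squared Euclidean distance to each prototype.
    sims = -((z[None, :] - prototypes) ** 2).sum(axis=1)
    # Aggregate over each class's prototypes (max is an assumption here);
    # the prototypes thus enter the decision process directly.
    scores = sims.reshape(n_classes, protos_per_class).max(axis=1)
    return int(np.argmax(scores))

def orthonormality_penalty(P):
    """Sum over classes of ||P_c P_c^T - I||_F^2, encouraging each
    class's prototypes to be orthonormal, i.e. diverse."""
    total = 0.0
    for c in range(n_classes):
        Pc = P[c * protos_per_class:(c + 1) * protos_per_class]
        gram = Pc @ Pc.T
        total += ((gram - np.eye(protos_per_class)) ** 2).sum()
    return total

# Stand-in for a VAE-encoded image; in ProtoVAE this would come from
# the probabilistic encoder.
z = rng.normal(size=latent_dim)
pred = classify(z)
penalty = orthonormality_penalty(prototypes)
```

In training, a penalty like `orthonormality_penalty` would be added to the classification and reconstruction losses so that prototypes stay both discriminative and mutually distinct; the penalty vanishes exactly when each class's prototypes form an orthonormal set.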