Interpretable Latent Space to Enable Counterfactual Explanations
Bodria, Francesco; Giannotti, Fosca; Pedreschi, Dino
2022
Abstract
Many dimensionality reduction methods have been introduced to map a data space into one with fewer features and to enhance the capabilities of machine learning models. This reduced space, called the latent space, has properties that help researchers understand the data and build stronger models. This work proposes an interpretable latent space that preserves the similarity of data points and supports a new way of learning a classification model that allows both prediction and explanation through counterfactual examples. Extensive experiments demonstrate the effectiveness of the latent space across different metrics in comparison with several competitors, as well as the quality of the resulting counterfactual explanations.

| File | Size | Format |
|---|---|---|
| ds2022ils.pdf (Accepted version, post-print; Open Access from 07/11/2024; license: Read Only) | 1.25 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.