Counterfactual and Prototypical Explanations for Tabular Data via Interpretable Latent Space

Artificial Intelligence decision-making systems have dramatically increased their predictive power in recent years, beating humans in many different specific tasks. However, with increased performance has come an increase in the complexity of the black-box models adopted by the AI systems, making them entirely obscure for the decision process adopted. Explainable AI is a field that seeks to make AI decisions more transparent by producing explanations. In this paper, we propose CP-ILS, a comprehensive interpretable feature reduction method for tabular data capable of generating Counterfactual and Prototypical post-hoc explanations using an Interpretable Latent Space. CP-ILS optimizes a transparent feature space whose similarity and linearity properties enable the easy extraction of local and global explanations for any pre-trained black-box model, in the form of counterfactual/prototype pairs. We evaluated the effectiveness of the created latent space by showing its capability to preserve pair-wise similarities like well-known dimensionality reduction techniques. Moreover, we assessed the quality of counterfactuals and prototypes generated with CP-ILS against state-of-the-art explainers, demonstrating that our approach obtains more robust, plausible, and accurate explanations than its competitors under most experimental conditions.

Counterfactual and Prototypical Explanations for Tabular Data via Interpretable Latent Space

Piaggesi, Simone;Bodria, Francesco;Guidotti, Riccardo;Giannotti, Fosca;Pedreschi, Dino

2024

Abstract

Artificial Intelligence decision-making systems have dramatically increased their predictive power in recent years, beating humans in many different specific tasks. However, with increased performance has come an increase in the complexity of the black-box models adopted by the AI systems, making them entirely obscure for the decision process adopted. Explainable AI is a field that seeks to make AI decisions more transparent by producing explanations. In this paper, we propose CP-ILS, a comprehensive interpretable feature reduction method for tabular data capable of generating Counterfactual and Prototypical post-hoc explanations using an Interpretable Latent Space. CP-ILS optimizes a transparent feature space whose similarity and linearity properties enable the easy extraction of local and global explanations for any pre-trained black-box model, in the form of counterfactual/prototype pairs. We evaluated the effectiveness of the created latent space by showing its capability to preserve pair-wise similarities like well-known dimensionality reduction techniques. Moreover, we assessed the quality of counterfactuals and prototypes generated with CP-ILS against state-of-the-art explainers, demonstrating that our approach obtains more robust, plausible, and accurate explanations than its competitors under most experimental conditions.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2024
			
	Settore Scientifico Disciplinare (validi fino a 24/06/2024)
	
				Settore INF/01 - Informatica
			
	Settore Scientifico Disciplinare (validi dal 09/05/2024)
	
				Settore INFO-01/A - Informatica
			
	Titolo Rivista
	
				IEEE ACCESS
			
	DOI
	
				https://dx.doi.org/10.1109/access.2024.3496114
			
	Parole chiave
	
				counterfactual search; Learning latent representations; learning linear models; neural networks; prototype search
			
	Appare nelle tipologie:
	
				1.1 Articolo in rivista

File in questo prodotto:

File	Dimensione	Formato
Counterfactual_and_Prototypical_Explanations_for_Tabular_Data_via_Interpretable_Latent_Space.pdf accesso aperto Tipologia: Published version Licenza: Creative Commons Dimensione 1.9 MB Formato Adobe PDF	1.9 MB	Adobe PDF

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11384/161023

Citazioni

ND

3

2

3

social impact