Factual and Counterfactual Explanations for Black Box Decision Making

The rise of sophisticated machine learning models has brought accurate but obscure decision systems, which hide their logic, thus undermining transparency, trust, and the adoption of artificial intelligence (AI) in socially sensitive and safety-critical contexts. We introduce a local rule-based explanation method, providing faithful explanations of the decision made by a black box classifier on a specific instance. The proposed method first learns an interpretable, local classifier on a synthetic neighborhood of the instance under investigation, generated by a genetic algorithm. Then, it derives from the interpretable classifier an explanation consisting of a decision rule, explaining the factual reasons of the decision, and a set of counterfactuals, suggesting the changes in the instance features that would lead to a different outcome. Experimental results show that the proposed method outperforms existing approaches in terms of the quality of the explanations and of the accuracy in mimicking the black box.
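The pipeline outlined in the abstract (synthetic neighborhood, local interpretable classifier, factual rule, counterfactuals) can be illustrated with a minimal sketch. It is not the authors' implementation. It makes several assumptions: the neighborhood is drawn by plain Gaussian perturbation rather than by the genetic algorithm the abstract names, the local classifier is a small hand-written decision tree, features are numeric, the labels are binary, and a counterfactual is read as the set of split conditions the instance violates on a path to a different label. All names (`black_box`, `neighborhood`, `build_tree`, `explain`) are hypothetical.

```python
import random

def black_box(x):
    # Hypothetical opaque classifier standing in for any trained model.
    return int(x[0] + 0.5 * x[1] > 1.0)

def neighborhood(x, predict, n=300, scale=1.0, seed=0):
    # Gaussian perturbations of x, labelled by the black box.
    # Assumption: this replaces the genetic generator named in the abstract.
    rng = random.Random(seed)
    Z = [[v + rng.gauss(0.0, scale) for v in x] for _ in range(n)]
    return Z, [predict(z) for z in Z]

def gini(y):
    # Gini impurity of a non-empty list of 0/1 labels.
    p = sum(y) / len(y)
    return 2.0 * p * (1.0 - p)

def build_tree(Z, y, depth):
    # Greedy binary tree on 0/1 labels; leaves store the majority label.
    leaf = {"label": int(2 * sum(y) >= len(y))}
    if depth == 0 or len(set(y)) == 1:
        return leaf
    best = None
    for j in range(len(Z[0])):
        for t in sorted({z[j] for z in Z}):
            left = [yi for z, yi in zip(Z, y) if z[j] <= t]
            right = [yi for z, yi in zip(Z, y) if z[j] > t]
            if not left or not right:
                continue
            score = len(left) * gini(left) + len(right) * gini(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return leaf
    _, j, t = best
    lo = [(z, yi) for z, yi in zip(Z, y) if z[j] <= t]
    hi = [(z, yi) for z, yi in zip(Z, y) if z[j] > t]
    return {"feat": j, "thr": t,
            "le": build_tree([z for z, _ in lo], [yi for _, yi in lo], depth - 1),
            "gt": build_tree([z for z, _ in hi], [yi for _, yi in hi], depth - 1)}

def paths(node, conds=()):
    # Yield (conditions, label) for every root-to-leaf path.
    # A condition is a tuple (feature index, "<=" or ">", threshold).
    if "label" in node:
        yield list(conds), node["label"]
    else:
        yield from paths(node["le"], conds + ((node["feat"], "<=", node["thr"]),))
        yield from paths(node["gt"], conds + ((node["feat"], ">", node["thr"]),))

def holds(cond, x):
    j, op, t = cond
    return x[j] <= t if op == "<=" else x[j] > t

def explain(x, predict, depth=3):
    Z, y = neighborhood(x, predict)
    all_paths = list(paths(build_tree(Z, y, depth)))
    # Factual rule: the single path whose conditions x satisfies.
    factual = next((c, l) for c, l in all_paths if all(holds(k, x) for k in c))
    # Counterfactuals: for paths with another label, keep the conditions x
    # violates, retaining only the paths with the fewest violations.
    other = [([k for k in c if not holds(k, x)], l)
             for c, l in all_paths if l != factual[1]]
    fewest = min((len(c) for c, _ in other), default=0)
    return factual, [(c, l) for c, l in other if len(c) == fewest]

x = [0.2, 0.3]
factual, counterfactuals = explain(x, black_box)
print("factual rule:", factual)
print("counterfactuals:", counterfactuals)
```

The factual rule reports the surrogate tree's label, which can differ from the black box's own output when the tree fits the neighborhood poorly. Comparing the two on the neighborhood gives a simple check of how faithfully the surrogate mimics the black box.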

Guidotti R.; Monreale A.; Giannotti F.; Pedreschi D.; Ruggieri S.; Turini F.

2019

Year of publication: 2019
Scientific Disciplinary Sector (valid until 24/06/2024): Sector INF/01 - Computer Science
Journal title: IEEE INTELLIGENT SYSTEMS
DOI: https://dx.doi.org/10.1109/MIS.2019.2957223
Keywords: Counterfactuals; Explainable AI; Explanation Rules; Interpretable Machine Learning; Open the Black Box
Projects funding the research:
	Funding: Horizon 2020
Appears in types: 1.1 Journal article

Files in this item:

File	Access	Type	License	Size	Format
Factual_and_Counterfactual_Explanations_for_Black_Box_Decision_Making.pdf	Closed access	Published version	Non-public	2.46 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/110550
