Concept Embedding Models: Beyond the Accuracy-Explainability Trade-Off

Espinosa Zarlenga, Mateo; Barbiero, Pietro; Ciravegna, Gabriele; Marra, Giuseppe; Giannini, Francesco; Diligenti, Michelangelo; Shams, Zohreh; Precioso, Frederic; Melacci, Stefano; Weller, Adrian; Liò, Pietro; Jamnik, Mateja

2022

Abstract

Deploying AI-powered systems requires trustworthy models that support effective human interaction and go beyond raw prediction accuracy. Concept bottleneck models promote trustworthiness by conditioning classification tasks on an intermediate level of human-like concepts. This enables human interventions, which can correct mispredicted concepts to improve the model's performance. However, existing concept bottleneck models are unable to find optimal compromises between high task accuracy, robust concept-based explanations, and effective interventions on concepts, particularly in real-world conditions where complete and accurate concept supervision is scarce. To address this, we propose Concept Embedding Models, a novel family of concept bottleneck models that goes beyond the current accuracy-vs-interpretability trade-off by learning interpretable high-dimensional concept representations. Our experiments demonstrate that Concept Embedding Models (1) attain better or competitive task accuracy with respect to standard neural models without concepts, (2) provide concept representations that capture meaningful semantics including and beyond their ground-truth labels, (3) support test-time concept interventions whose effect on test accuracy surpasses that in standard concept bottleneck models, and (4) scale to real-world conditions where complete concept supervision is scarce.
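To make the idea of a test-time concept intervention concrete, the following is a minimal toy sketch of a standard scalar concept bottleneck, not the Concept Embedding Model architecture proposed in the paper. All function names, weights, and the input vector are hypothetical and chosen by hand for illustration; nothing here is taken from the authors' code or experiments.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_concepts(x, concept_weights):
    """Map raw features to one probability per concept (x -> c_hat)."""
    return [
        sigmoid(sum(w * xi for w, xi in zip(row, x)))
        for row in concept_weights
    ]


def predict_label(concepts, label_weights, bias):
    """Predict a binary label from the concept activations only (c -> y)."""
    score = sum(w * c for w, c in zip(label_weights, concepts)) + bias
    return 1 if score > 0 else 0


def intervene(concepts, corrections):
    """Overwrite selected concept predictions with expert-provided values.

    `corrections` maps a concept index to its corrected value.
    """
    return [corrections.get(i, c) for i, c in enumerate(concepts)]


# Hypothetical hand-picked parameters (assumptions, not learned values).
CONCEPT_WEIGHTS = [[2.0, 0.0], [0.0, 2.0]]
LABEL_WEIGHTS = [1.0, 1.0]
LABEL_BIAS = -1.2  # the label fires only when both concepts are strongly active

x = [1.5, -1.0]  # toy input for which concept 1 is under-predicted
c_hat = predict_concepts(x, CONCEPT_WEIGHTS)
y_before = predict_label(c_hat, LABEL_WEIGHTS, LABEL_BIAS)

# A human expert asserts that concept 1 is actually present.
c_fixed = intervene(c_hat, {1: 1.0})
y_after = predict_label(c_fixed, LABEL_WEIGHTS, LABEL_BIAS)
```

With these toy values, the second concept is predicted with low probability, so the label prediction is 0; after the expert sets that concept to 1.0, the label prediction changes to 1 without retraining. The label predictor only ever sees the concept vector, which is what makes this kind of correction possible.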

Year of publication: 2022
Scientific disciplinary sector (valid from 09/05/2024): IINF-05/A - Information processing systems; INFO-01/A - Computer science
Conference title: 36th Conference on Neural Information Processing Systems, NeurIPS 2022
Conference location: New Orleans
Conference dates: Monday, November 28th through Friday, December 9th
Volume title: Advances in Neural Information Processing Systems
Publisher: Neural Information Processing Systems Foundation
ISBN: 9781713871088
Projects funding the research:
	Project title: Automating Representation Choice for AI Tools
	Funder name: UK Research and Innovation
	Funding: EPSRC
	Contract no.: EP/T019603/1
Appears in the following types: 4.1 Contribution in conference proceedings

Files in this item:

File	Access	Type	License	Size	Format
NeurIPS 2022 - Concept Embedding Models.pdf	Open access	Published version	Read only	1.35 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/147947
