
Evaluating the privacy exposure of interpretable global and local explainers

Naretto, Francesca; Giannotti, Fosca
2025

Abstract

Over the last few years, the abundance of data has significantly boosted the performance of Machine Learning models, integrating them into many aspects of daily life. However, the rise of powerful Artificial Intelligence tools has introduced ethical and legal complexities. This paper proposes a computational framework to analyze the ethical and legal dimensions of Machine Learning models, focusing specifically on privacy concerns and interpretability. Recently, the research community has proposed privacy attacks able to reveal whether a record was part of the black-box training set, or to infer variable values, simply by accessing and querying a Machine Learning model. These attacks highlight privacy vulnerabilities and show that the GDPR might be violated by making data or Machine Learning models accessible. At the same time, the complexity of these models, often labelled “black-boxes”, has made the development of explanation methods indispensable to enhance trust and to facilitate their acceptance and adoption in high-stakes scenarios. Our study highlights the trade-off between interpretability and privacy protection. By introducing REVEAL, this paper proposes a framework to evaluate the privacy exposure of black-box models and of their surrogate-based explainers, whether local or global. Our methodology is adaptable and applicable across diverse black-box models and various privacy attack scenarios. Through an in-depth analysis, we show that the interpretability layer introduced by explanation models may jeopardize the privacy of individuals in the training data of the black-box, particularly under powerful privacy attacks that require minimal knowledge yet cause significant privacy breaches.
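The kind of membership inference attack the abstract refers to can be illustrated with a minimal confidence-thresholding sketch. This is not the REVEAL framework itself (the abstract gives no implementation details); the dataset, model, and threshold below are illustrative assumptions chosen to show the general idea: an overfit black-box tends to be more confident on records it was trained on, so an attacker who can only query the model may still guess membership.

```python
# Hypothetical sketch of a confidence-based membership inference attack
# against a black-box classifier. All concrete choices (random forest,
# synthetic data, threshold 0.9) are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Build a "black-box" model on private training data.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_out, y_train, y_out = train_test_split(
    X, y, test_size=0.5, random_state=0
)
black_box = RandomForestClassifier(n_estimators=100, random_state=0)
black_box.fit(X_train, y_train)

def infer_membership(model, records, threshold=0.9):
    """Guess 'member' when the model is highly confident on a record:
    overfit models tend to assign higher confidence to training points."""
    confidence = model.predict_proba(records).max(axis=1)
    return confidence >= threshold

# The attacker only queries the model; it never sees the training set.
guess_in = infer_membership(black_box, X_train)   # true members
guess_out = infer_membership(black_box, X_out)    # true non-members

# Attack "advantage": members are flagged more often than non-members.
advantage = guess_in.mean() - guess_out.mean()
print(f"attack advantage: {advantage:.2f}")
```

A positive advantage means the model leaks membership information through its confidence scores alone; the same query-only threat model extends to surrogate-based explainers, which is the exposure the paper evaluates.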
Settore INF/01 - Informatica
Settore INFO-01/A - Informatica
Explainable AI, Data Privacy, Artificial Intelligence
   “FAIR - Future Artificial Intelligence Research” - Spoke 1 “Human-centered AI”
   M4C2
   European Commission
   PNRR - M4C2 - Investment 1.3, Extended Partnership PE00000013
   PE00000013

   SoBigData.it - Strengthening the Italian RI for Social Mining and Big Data Analytics
   European Commission
   PNRR

   Fairness and Intersectional Non-Discrimination in Human Recommendation
   FINDHR
   European Commission
   Horizon Europe Framework Programme
   101070212

   It takes two to tango: a synergistic approach to human-machine decision making
   TANGO
   European Commission
   Grant Agreement n. 101120763
Files in this product:

tdp.a534a24 (1).pdf
Access: open access
Type: Published version
License: Creative Commons
Size: 785.42 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/160983
Citations
  • PMC: ND
  • Scopus: 2
  • Web of Science: 2
  • OpenAlex: ND