Interpreting Federated Learning by Aggregating SHAP Explanations

Bonsignori V.; Naretto F.
2026

Abstract

Federated Learning trains models without transferring data outside local clients, but usually relies on non-interpretable Neural Networks. While explanations are essential for model adoption and trustworthiness, conventional explainability techniques require centralised data access, which violates Federated Learning principles. We propose iFLASH, a framework for Interpreting Federated Learning by Aggregating SHAP explanations. It introduces novel aggregation methods that significantly improve explanation quality compared to naive averaging, while preserving data privacy in federated settings. iFLASH runs local SHAP explainers on individual clients without exposing raw data, because it aggregates feature importance values rather than models or gradients. Clients compute their explanations, evaluate them using performance-based metrics, and send the results to the server. The server weights each client's contribution by model performance and explanation quality, with the aim of producing faithful aggregate explanations. The framework supports several aggregation strategies, adapting to different levels of data imbalance and heterogeneity. Experiments across cross-silo (12-16 clients) and cross-device (50-150 clients) scenarios show that Faithfulness-based aggregation consistently outperforms uniform averaging in cross-silo settings: EQ1 achieves higher Faithfulness than naive averaging in every cross-silo configuration, across all datasets and distributions. In cross-device environments, quality-aware methods perform comparably to naive averaging, since the high number of clients already provides a sufficient averaging effect. iFLASH explanations align closely with centralised explanations in feature importance ranking and directionality, sometimes achieving better fidelity. The results demonstrate that iFLASH enables accurate, privacy-preserving explanations for domains where data cannot be centralised.
We highlight that our proposal has also been extensively evaluated in a cross-device federated setting, a scenario that is overlooked in the explainable AI literature.
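
The server-side step described in the abstract, weighting each client's feature importance vector by a quality score, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single non-negative score per client, and the fallback to uniform weights are all assumptions made for illustration, and the abstract does not specify how EQ1 or the other strategies compute their weights.

```python
from typing import Sequence


def aggregate_explanations(
    client_importances: Sequence[Sequence[float]],
    client_scores: Sequence[float],
) -> list[float]:
    """Weighted average of per-client feature-importance vectors.

    client_importances: one vector of SHAP-style importances per client.
    client_scores: one non-negative score per client (hypothetical; it could
        combine model performance and explanation quality).
    """
    if not client_importances:
        raise ValueError("at least one client is required")
    if len(client_importances) != len(client_scores):
        raise ValueError("one score per client is required")
    n_features = len(client_importances[0])
    if any(len(vec) != n_features for vec in client_importances):
        raise ValueError("all clients must report the same number of features")
    if any(score < 0 for score in client_scores):
        raise ValueError("scores must be non-negative")

    total = sum(client_scores)
    if total > 0:
        # Normalise scores so that the weights sum to 1.
        weights = [score / total for score in client_scores]
    else:
        # Assumption: with no usable scores, fall back to uniform weights.
        weights = [1.0 / len(client_scores)] * len(client_scores)

    return [
        sum(w * vec[j] for w, vec in zip(weights, client_importances))
        for j in range(n_features)
    ]


def uniform_average(client_importances: Sequence[Sequence[float]]) -> list[float]:
    """Naive baseline: every client contributes equally."""
    return aggregate_explanations(client_importances, [1.0] * len(client_importances))


if __name__ == "__main__":
    # Two clients, two features; the first client has the higher score.
    importances = [[1.0, 0.0], [0.0, 1.0]]
    print(aggregate_explanations(importances, [3.0, 1.0]))
    print(uniform_average(importances))
```

Only the importance vectors and scores reach the server in this sketch, which mirrors the abstract's point that raw data, models, and gradients are not what gets aggregated.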
Sector INF/01 - Computer Science
Sector INFO-01/A - Computer Science
Cross-device; cross-silo; explainable artificial intelligence; faithfulness; federated learning; trustworthy AI
Project: TANGO (It takes two to tango: a synergistic approach to human-machine decision making); European Commission; Horizon Europe Framework Programme - HORIZON Research and Innovation Actions; 101120763
Use this identifier to cite or link to this document: https://hdl.handle.net/11384/168188