
Interpretable and Fair Mechanisms for Abstaining Classifiers

Lenders, Daphne; Pugnana, Andrea; Pellungrini, Roberto; Pedreschi, Dino; Giannotti, Fosca
2024

Abstract

Abstaining classifiers have the option to refrain from providing a prediction for instances that are difficult to classify. The abstention mechanism is designed to improve the classifier's performance on the accepted data while still ensuring a minimum number of predictions. In this setting, fairness concerns often arise when the abstention mechanism reduces errors only for the majority groups in the data, thereby increasing performance differences across demographic groups. While several methods aim to reduce discrimination when abstaining, no existing mechanism can do so in an explainable way. In this paper, we fill this gap by introducing the Interpretable and Fair Abstaining Classifier (IFAC), an algorithm that can reject predictions based on both their uncertainty and their unfairness. By rejecting possibly unfair predictions, our method reduces error and positive decision rate differences across demographic groups on the non-rejected data. Since the unfairness-based rejections rely on an interpretable-by-design method, i.e., rule-based fairness checks and situation testing, we create a transparent process that can empower human decision-makers to review the unfair predictions and make more just decisions for them. This explainable aspect is especially important in light of recent AI regulations, which mandate that any high-risk decision task be overseen by human experts to reduce discrimination risks. (Code and Appendix for this work are available at: https://github.com/calathea21/IFAC.)
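To make the abstention idea described above concrete, the following is a minimal illustrative sketch, not the authors' IFAC implementation (the official code is in the GitHub repository linked in the abstract). It shows a classifier that rejects a prediction either when the model's confidence is low or when a crude situation-testing style check, based on k-nearest neighbours from each demographic group, suggests the prediction may be unfair. The base model, thresholds, column encoding, and the specific fairness check are all assumptions made for illustration.

# Illustrative sketch only; not the authors' IFAC algorithm.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import NearestNeighbors


class UncertaintyAndFairnessAbstainer:
    def __init__(self, conf_threshold=0.75, k=10, disparity_threshold=0.3):
        self.base = RandomForestClassifier(random_state=0)
        self.conf_threshold = conf_threshold        # reject below this confidence
        self.k = k                                  # neighbours used in the fairness check
        self.disparity_threshold = disparity_threshold

    def fit(self, X, y, protected):
        # `protected` is a binary array marking membership in the protected group;
        # X is assumed to be numerically encoded.
        self.X_ref = np.asarray(X, dtype=float)
        self.y_ref = np.asarray(y)
        self.prot_ref = np.asarray(protected)
        self.base.fit(self.X_ref, self.y_ref)
        return self

    def _situation_test(self, x):
        # Compare positive-label rates among the k most similar reference
        # individuals from each demographic group; a large gap hints that
        # similar people receive different outcomes. This is only a stand-in
        # for the paper's rule-based fairness checks plus situation testing.
        rates = []
        for group in (0, 1):
            mask = self.prot_ref == group
            nn = NearestNeighbors(n_neighbors=min(self.k, mask.sum()))
            nn.fit(self.X_ref[mask])
            _, idx = nn.kneighbors(x.reshape(1, -1))
            rates.append(self.y_ref[mask][idx[0]].mean())
        return abs(rates[0] - rates[1])

    def predict_or_reject(self, X):
        X = np.asarray(X, dtype=float)
        proba = self.base.predict_proba(X)
        decisions = []
        for i, x in enumerate(X):
            confidence = proba[i].max()
            if confidence < self.conf_threshold:
                decisions.append(("reject", "uncertain"))
            elif self._situation_test(x) > self.disparity_threshold:
                decisions.append(("reject", "possibly unfair"))  # route to a human reviewer
            else:
                decisions.append((self.base.classes_[proba[i].argmax()], "accepted"))
        return decisions

On a tabular dataset, predict_or_reject returns, for each instance, either a prediction or a rejection together with its reason, so a human reviewer can focus on the cases flagged as possibly unfair.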
Scientific-disciplinary sector: INFO-01/A - Computer Science (Informatica)
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Vilnius
September 9-13, 2024
Machine Learning and Knowledge Discovery in Databases : Research Track : European Conference, ECML PKDD 2024, Vilnius, Lithuania, September 9–13, 2024, Proceedings, Part VII
Springer
ISBN: 9783031703676; 9783031703683
Fair ML; Interpretable ML; Reject Option
Funding:
  • Science and technology for the explanation of AI decision making (XAI), European Commission, H2020, Grant Agreement n. 834756
  • PNRR Partenariati Estesi - FAIR - Future artificial intelligence research, Ministero della pubblica istruzione, dell'università e della ricerca
  • It takes two to tango: a synergistic approach to human-machine decision making (TANGO), European Commission, Grant Agreement n. 101120763
  • PNRR Infrastrutture di Ricerca - SoBigData.it - Strengthening the Italian RI for Social Mining and Big Data Analytics, Ministero della pubblica istruzione, dell'università e della ricerca, IR0000013
Files in this record:

File: 978-3-031-70368-3_25.pdf
Access: Closed access (request a copy)
Type: Published version
License: Non-public
Size: 884.84 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/151923
Citations:
  • PMC: n/a
  • Scopus: 0
  • Web of Science: 0
  • OpenAlex: n/a