Centralised vs decentralised anomaly detection: when local and imbalanced data are beneficial

Nardi, Mirko;
2021

Abstract

In this paper, we address the problem of anomaly detection in decentralised settings. We take inspiration from the current edge computing trend, which pushes towards decentralised ML algorithms, i.e., algorithms in which the devices that collect or generate data collaborate to train the ML models without sharing raw data. This scenario poses three challenges: (i) the data distributions of local datasets might differ, (ii) data is very often unlabelled, and (iii) devices have limited computational resources. We address them by proposing an unsupervised ensemble method for decentralised anomaly detection in which the base learners are lightweight autoencoders. We investigate whether an ensemble of lightweight models trained in isolation on non-IID, unlabelled local data can compete with heavier models trained in centralised settings. On a multi-category anomaly detection task, our results show that the method successfully exploits data imbalance to make accurate predictions.
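The general idea of an ensemble of base learners trained in isolation can be illustrated with a minimal sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: a closed-form linear autoencoder (equivalent to PCA) stands in for the lightweight neural autoencoders, the synthetic data and the helper names (`LinearAutoencoder`, `ensemble_score`, `make_local_data`) are hypothetical, and the aggregation rule (taking the smallest reconstruction error across base learners) is one plausible choice that the abstract does not specify.

```python
import numpy as np


class LinearAutoencoder:
    """Closed-form linear autoencoder (PCA); an assumed stand-in for a small neural autoencoder."""

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # The leading right singular vectors span the best rank-k reconstruction subspace.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def reconstruction_error(self, X):
        centered = X - self.mean_
        # Encode (project onto the subspace), then decode (map back to input space).
        reconstructed = centered @ self.components_.T @ self.components_
        return np.mean((centered - reconstructed) ** 2, axis=1)


def ensemble_score(models, X):
    """Assumed aggregation rule: smallest reconstruction error over all base learners."""
    errors = np.stack([m.reconstruction_error(X) for m in models])
    return errors.min(axis=0)


def make_local_data(rng, center, direction, n=200, noise=0.05):
    """Synthetic device data: points spread along one direction around a device-specific centre."""
    t = rng.normal(size=(n, 1))
    return center + t * direction + noise * rng.normal(size=(n, len(center)))


rng = np.random.default_rng(0)
dim = 5
centers = [np.zeros(dim), np.full(dim, 5.0), np.full(dim, -5.0)]
directions = [np.eye(dim)[i] for i in range(3)]

# Each simulated device trains in isolation on its own unlabelled, non-IID data.
models = [
    LinearAutoencoder(n_components=1).fit(make_local_data(rng, c, d))
    for c, d in zip(centers, directions)
]

# Test data: fresh samples from every device's category, plus points scattered at random.
normal = np.vstack(
    [make_local_data(rng, c, d, n=50) for c, d in zip(centers, directions)]
)
anomalies = rng.uniform(-10, 10, size=(50, dim))

normal_scores = ensemble_score(models, normal)
anomaly_scores = ensemble_score(models, anomalies)
print("mean score on normal data:   ", normal_scores.mean())
print("mean score on anomalous data:", anomaly_scores.mean())
```

Under this assumed rule, a sample counts as normal when at least one local model reconstructs it well, so each base learner only needs to model the category seen by its own device.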
Sector INF/01 - Computer Science
Third International Workshop on Learning with Imbalanced Domains: Theory and Applications - LIDTA 2021 co-located with ECML/PKDD 2021
The workshop was held online
17 September 2021
Third International Workshop on Learning with Imbalanced Domains : Theory and Applications, 17 September 2021, ECML-PKDD, Bilbao (Basque Country, Spain)
PMLR
Centralised vs decentralised; unsupervised anomaly detection; data imbalance; autoencoders ensemble
   SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics
   SoBigData-PlusPlus
   European Commission
   Horizon 2020 Framework Programme
   871042

   Social Explainable Artificial Intelligence
   SAI
   CHIST-ERA
   CHIST-ERA-19-XAI-010

   Operational Knowledge from Insights and Analytics on Industrial Data
   OK-INSAID
   MIUR PON
This work is partially funded by the following projects: Operational Knowledge from Insights and Analytics on Industrial Data (MIUR PON OK-INSAID, GA #ARS01 00917) and SoBigData++ (EU H2020, GA #871042), SAI: Social Explainable AI (EC CHIST-ERA-19-XAI-010).
Files in this item:
nardi21a.pdf
Access: open access
Type: Published version
License: Creative Commons
Size: 364.3 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/145003
Citations
  • PMC: N/A
  • Scopus: 7
  • Web of Science (ISI): 5
  • OpenAlex: N/A