Centralised vs decentralised anomaly detection: when local and imbalanced data are beneficial


Nardi, Mirko; Valerio, Lorenzo; Passarella, Andrea

2021

Abstract

In this paper, we address the problem of anomaly detection in decentralised settings. We take inspiration from the current edge computing trend, which pushes towards the development of decentralised ML algorithms, i.e., the devices that collect or generate data are in charge of collaborating to train the ML models without sharing raw data. The challenges connected to this scenario are that (i) the data distributions of local datasets might differ, (ii) data is very often unlabelled, and (iii) devices have limited computational resources. We address them by proposing an unsupervised ensemble method for decentralised anomaly detection in which the base learners are lightweight autoencoders. We investigate whether an ensemble of lightweight models trained in isolation on non-IID and unlabelled local data can compete with heavier models trained in centralised settings. In a multi-category anomaly detection task, our results show that our method successfully exploits the data imbalance to make accurate predictions.
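The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions. Each simulated device holds unlabelled data from a single category (a simple form of non-IID data). The base learner is a linear autoencoder fitted in closed form via SVD, swapped in for a trained neural autoencoder to keep the example short. The ensemble rule (take the smallest reconstruction error over all members) is one plausible way to combine the base learners and is not taken from the paper. The names `LinearAutoencoder`, `ensemble_score` and `sample`, and the synthetic data, are hypothetical.

```python
import numpy as np


class LinearAutoencoder:
    """Tiny linear autoencoder fitted in closed form (equivalent to PCA).

    Assumption: stands in for the lightweight neural autoencoders
    mentioned in the abstract.
    """

    def __init__(self, n_components):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # Rows of vt are the principal directions of the centred data.
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = vt[: self.n_components]
        return self

    def reconstruction_error(self, X):
        centred = X - self.mean_
        codes = centred @ self.components_.T   # encode
        recon = codes @ self.components_       # decode
        return np.mean((centred - recon) ** 2, axis=1)


def ensemble_score(models, X):
    """Anomaly score: smallest reconstruction error over all base learners.

    Assumption: a sample is treated as normal if at least one local model
    reconstructs it well. The paper may combine models differently.
    """
    errors = np.stack([m.reconstruction_error(X) for m in models])
    return errors.min(axis=0)


# Synthetic data: three categories, each near its own 2-D affine subspace.
rng = np.random.default_rng(0)
dim = 8
centres = [np.full(dim, -5.0), np.zeros(dim), np.full(dim, 5.0)]
bases = [rng.normal(size=(2, dim)) for _ in centres]


def sample(k, n):
    """Draw n points from category k (hypothetical generator)."""
    latent = rng.normal(size=(n, 2))
    noise = 0.05 * rng.normal(size=(n, dim))
    return centres[k] + latent @ bases[k] + noise


# Each device trains in isolation on its own single-category, unlabelled data.
models = [LinearAutoencoder(2).fit(sample(k, 200)) for k in range(3)]

# Test set: unseen normal points from every category, plus uniform outliers.
normal_test = np.vstack([sample(k, 50) for k in range(3)])
anomalies = rng.uniform(-10, 10, size=(50, dim))

normal_scores = ensemble_score(models, normal_test)
anomaly_scores = ensemble_score(models, anomalies)
print("mean score, normal:   ", normal_scores.mean())
print("mean score, anomalies:", anomaly_scores.mean())
```

In this toy setting each local model only ever sees one category, yet the ensemble covers all of them, so normal points from any category receive a low score while points far from every local subspace receive a high one.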


Year of publication: 2021
Scientific Disciplinary Sector (valid until 24/06/2024): Sector INF/01 - Computer Science
Conference title: Third International Workshop on Learning with Imbalanced Domains: Theory and Applications - LIDTA 2021, co-located with ECML/PKDD 2021
Conference venue: The workshop was held online
Conference dates: 17 September 2021
Volume title: Third International Workshop on Learning with Imbalanced Domains: Theory and Applications, 17 September 2021, ECML-PKDD, Bilbao (Basque Country, Spain)
Publisher: PMLR
Keywords: Centralised vs decentralised; unsupervised anomaly detection; data imbalance; autoencoders ensemble
Projects funding the research

Project title: SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics
Acronym: SoBigData-PlusPlus
Funder name: European Commission
Funding programme: Horizon 2020 Framework Programme
Contract no.: 871042

Project title: Social Explainable Artificial Intelligence
Acronym: SAI
Funder name: CHIST-ERA
Contract no.: CHIST-ERA-19-XAI-010

Project title: Operational Knowledge from Insights and Analytics on Industrial Data
Acronym: OK-INSAID
Funder name: MIUR PON
Research funding information: This work is partially funded by the following projects: Operational Knowledge from Insights and Analytics on Industrial Data (MIUR PON OK-INSAID, GA #ARS01 00917), SoBigData++ (EU H2020, GA #871042), and SAI: Social Explainable AI (EC CHIST-ERA-19-XAI-010).

Appears in types: 4.1 Conference proceedings contribution

Files in this item:

File	Size	Format
nardi21a.pdf (open access; type: Published version; licence: Creative Commons)	364.3 kB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/145003
