Extracting Active "Ego Networks" of Words: Methodology, Robustness, and Cross-Domain Validation

Ollivier, Kilian; Boldrini, Chiara; Passarella, Andrea; Conti, Marco

doi:10.5281/zenodo.10182374

The "ego network of words" model captures structural properties in language production associated with cognitive constraints. While previous research focused on the layer-based structure and its semantic properties, this paper argues that an essential element, the concept of an active network, is missing. Drawing inspiration from social ego networks, where the active part includes the relationships that individuals regularly nurture, we establish the notion of an active ego network of words. We demonstrate that, without the active network concept, an ego network is sensitive to the amount of data considered, and its layered structure disappears in larger datasets. To address this, we define a methodology for extracting the active part of the ego network of words and validate it using interview transcripts and tweets. We show that the method is robust to varying input data sizes and stable over time. In addition, our results are well aligned with prior analyses of the ego network of words, in which the limited amount of data collected meant that, implicitly, approximately only the active part of the network was considered. Moreover, the validation on the transcripts dataset (MediaSum) highlights the generalizability of the model across diverse domains and the ingrained cognitive constraints in language usage.

2023

Publication year: 2023
Scientific Disciplinary Sector (valid until 24/06/2024): INF/01 - Computer Science
Journal: IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS
DOI: https://dx.doi.org/10.5281/zenodo.10182374
Appears in types: 1.1 Journal article

Files in this item:

File	Size	Format
OLLIVIER_paper3.pdf (closed access; description: Paper; type: Submitted version (pre-print); license: All rights reserved)	1.23 MB	Adobe PDF