Sentiment spreading : an epidemic model for lexicon-based sentiment analysis on Twitter

Giannotti, Fosca;
2017

Abstract

Although sentiment analysis has received significant attention in recent years, problems remain when tools are applied to microblogging content. This is because the text to be analysed typically consists of very short messages that lack structure and semantic context. At the same time, the amount of text produced by online platforms is enormous, so simple, fast and effective methods are needed to study sentiment in these data efficiently. Lexicon-based methods, which use a predefined dictionary of terms tagged with sentiment valences to evaluate sentiment in longer sentences, can be a valid approach. Here we present a method based on epidemic spreading to automatically extend the dictionary used in lexicon-based sentiment analysis, starting from a reduced dictionary and large amounts of Twitter data. The resulting dictionary is shown to contain valences that correlate well with human-annotated sentiment, and to produce tweet sentiment classifications comparable to those of the original dictionary, with the advantage of being able to tag more tweets than the original. The method is easily extensible to various languages and applicable to large amounts of data.
Sector INF/01 - Computer Science
16th International Conference of the Italian Association for Artificial Intelligence (AI*IA)
Bari
2017-11-14 - 2017-11-17
AI*IA 2017 Advances in Artificial Intelligence : XVIth International Conference of the Italian Association for Artificial Intelligence, Bari, Italy, November 14-17, 2017, Proceedings
Springer
978-3-319-70168-4
Artificial intelligence; semantics; social networking (online); epidemic modeling; epidemic spreading; large amounts of data; online platforms; semantic context; sentiment analysis; sentiment classification; short message; data mining
Horizon 2020
Grant number 654024
Files in this record:
aiia.pdf

Closed access

Type: Published version
License: Not public
Size: 419.68 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/114486
Citations
  • PMC: n/a
  • Scopus: 9
  • ISI: 7