Essays on International Migration using Big Data Analytics / Kim, Ji Su; supervisor: GIANNOTTI, Fosca; external supervisor: Fagiolo, Giorgio; Scuola Normale Superiore, cycle 33, 26-May-2021.

Essays on International Migration using Big Data Analytics

KIM, Ji Su
2021

Abstract

How can social big data help us understand issues related to international migration? Official data such as census, survey and administrative data have traditionally been the main sources for studying migration. However, these data have limitations. They are inconsistent across nations because countries employ different definitions of international migration and different characterisations of migrants. Moreover, collecting traditional data is costly and time-consuming, which makes it difficult to track instantaneous flows of migrants. Tracking emigrants is harder still, because citizens have little motivation to declare their departure. In recent years, however, alternative data sources for migration have become available. Social big data, such as Facebook and Twitter data, allow us to study social behaviours both at large scale and at a granular level, and to peek into real-world phenomena. Although known to suffer from other issues, such as selection bias, these data could bring complementary value to standard statistics.

In this work, we employ social big data to study international migration. We try to answer the opening question through an analysis of various phases of migration, using both traditional and novel data sources. The first phase concerns the journey: we study migration stocks on Twitter and discuss the benefits and drawbacks of using such data to study international migration. Here, we develop a generic methodology to identify migrants within the Twitter population. It defines a migrant as a person whose current residence differs from their nationality. The residence is defined as the location where a user spends most of their time in a given year. The nationality is inferred from linguistic and social connections to a migrant's country of origin. This methodology is validated first against an internal gold standard dataset and second against Italian register data and Eurostat data, and shows strong performance scores and correlation coefficients.

The second phase concerns the integration of migrants in the destination country and their attachment to the home country. We explore Twitter data to build a novel methodology to quantify and understand migrants' different integration types. We describe four integration types (assimilation, integration, marginalisation and separation) using two dimensions: the preservation of links to the home country and culture, i.e. the home attachment index, and the creation of new links and adoption of cultural traits from the new residence country, i.e. the destination attachment index. The two dimensions are validated with a null model analysis, which shows significant differences between the actual indices and the null model indices, confirming that the two indices are not produced at random.

Lastly, we examine the effect of the presence of migrants on the political choices of natives, using a German case study. Specifically, we are interested in understanding whether exposure to reception centres for asylum-seekers in Berlin affected the votes obtained by the radical right AfD in the 2019 European elections, at the electoral district level. We analyse this relationship at a very fine spatial scale, based on geo-localisation techniques and high-resolution spatial data, and under a wide range of contextual conditions, including variables such as districts' socio-economic deprivation, the share of established non-European residents, and the geographical location of the districts. Overall, the findings show that exposure to reception centres in Berlin is negatively correlated with the AfD vote share. However, the results show remarkable differences between East and West Berlin and between districts characterised by different levels of socio-economic deprivation. Exposure and AfD vote shares are more strongly correlated in Western districts and in better-off districts.

This work thus aims to provide a practical contribution to international migration studies by offering novel methods and analyses for identifying, quantifying and understanding the dynamics of migration, in order to better shape international migration policies.
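
As a rough illustration of the two definitions given in the abstract, the following Python sketch shows how a residence could be assigned from a user's observed locations and how two attachment indices could be mapped to the four integration types. It is a minimal sketch under stated assumptions, not the thesis's implementation: the function names, the use of country codes, the 0.5 threshold and the high/low mapping to the four types (taken from the usual two-dimensional acculturation typology) are all hypothetical, and the inference of nationality from linguistic and social connections is not reproduced here.

```python
from collections import Counter


def infer_residence(observed_countries):
    """Return the country in which a user was observed most often in one year.

    `observed_countries` is a hypothetical list with one country code per
    observation (for example, per active day); counting observations is used
    here as a simple proxy for "where the user spends most of their time".
    """
    if not observed_countries:
        return None
    return Counter(observed_countries).most_common(1)[0][0]


def is_migrant(residence, nationality):
    """A user is treated as a migrant when residence and nationality differ."""
    if residence is None or nationality is None:
        return False  # not enough information to decide
    return residence != nationality


def integration_type(home_attachment, destination_attachment, threshold=0.5):
    """Map the two attachment indices to one of the four integration types.

    The threshold separating "high" from "low" attachment is an assumption
    made for illustration only.
    """
    high_home = home_attachment >= threshold
    high_destination = destination_attachment >= threshold
    if high_home and high_destination:
        return "integration"
    if high_destination:
        return "assimilation"
    if high_home:
        return "separation"
    return "marginalisation"


# Example: a user mostly observed in Italy whose inferred nationality is Korean.
residence = infer_residence(["IT", "IT", "KR", "IT"])
migrant = is_migrant(residence, "KR")
label = integration_type(home_attachment=0.8, destination_attachment=0.2)
```

In this example the user is assigned Italy as residence and is therefore counted as a migrant; with a high home attachment and a low destination attachment, the sketch labels the user as "separation".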
26-May-2021
Sector SPS/06 - History of International Relations
Sector SPS/11 - Sociology of Political Phenomena
Sector INF/01 - Computer Science
Data Science
33
International migration; Big data; Integration; Voting behaviour; Network analysis
Scuola Normale Superiore
GIANNOTTI, Fosca
Fagiolo, Giorgio
Rapoport, Hiller
Sirbu, Alina
Files in this item:
KIM_Thesis_PhD.pdf
Open Access from 25/05/2022
Type: PhD Thesis
Licence: Read Only
Size: 8.08 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11384/139905