Leveraging Hate Speech Detection to Investigate Immigration-related Phenomena in Italy

Lai, Mirko;
2019-01-01

Abstract

The presence and integration of immigrants is one of the most controversial issues in our society, and given current worldwide political instabilities, it will likely become ever more prominent in the cultural and political debate. Social media play an increasingly important role in how citizens debate opinions and react to local and global events. However, several studies point out the danger of social media as a breeding ground for online hate speech (or cyberhate). We propose a novel approach to the exploratory analysis of social phenomena, based on integrating the automatic detection of cyberhate against immigrants with offline indicators. We gathered data from the Italian Twittersphere and from ISTAT, the main supplier of official statistical data in Italy. We developed a supervised classification model for hate speech detection, trained it on a corpus of Italian tweets manually annotated for hate speech against immigrants, and used it to automatically annotate a large sample of geo-tagged tweets spanning six years. We cross-referenced these annotations with the ISTAT data, exploring three macro-indicators related to employment, education and crime. We found correlations suggesting an interplay between economic and cultural factors and the expression of hate online.
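The cross-referencing step described in the abstract can be illustrated with a minimal sketch. It assumes that each geo-tagged tweet has already been labelled by some classifier and assigned to a region, and that an offline indicator is available per region. The function names, the region codes, and all values below are hypothetical and made up for illustration only. The abstract does not say which correlation measure was used, so Pearson's coefficient is an assumption here.

```python
from collections import defaultdict
from math import sqrt


def hate_rate_by_region(labelled_tweets):
    """Return the share of tweets flagged as hateful in each region.

    labelled_tweets: iterable of (region, is_hateful) pairs, where
    is_hateful is the boolean output of a classifier (not shown here).
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for region, is_hateful in labelled_tweets:
        totals[region] += 1
        if is_hateful:
            flagged[region] += 1
    return {region: flagged[region] / totals[region] for region in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlate(rates, indicator):
    """Correlate hate rates with an indicator over the regions both cover."""
    regions = sorted(set(rates) & set(indicator))
    return pearson([rates[r] for r in regions],
                   [indicator[r] for r in regions])


# Toy, made-up inputs: NOT data from the paper or from ISTAT.
toy_tweets = [("A", True), ("A", False),
              ("B", False), ("B", False),
              ("C", True), ("C", True)]
toy_indicator = {"A": 10.0, "B": 5.0, "C": 15.0}

rates = hate_rate_by_region(toy_tweets)
r = correlate(rates, toy_indicator)
```

Normalising by the number of tweets per region, rather than using raw counts, keeps more active regions from dominating the comparison. A correlation of this kind is exploratory and does not by itself indicate a causal link.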
Year: 2019
ISBN: 978-1-7281-3891-6
Use this identifier to cite or link to this document: https://hdl.handle.net/11579/196198