Leveraging Hate Speech Detection to Investigate Immigration-related Phenomena in Italy
Lai, Mirko;
2019-01-01
Abstract
The presence and integration of immigrants is one of the most controversial issues in our society, and given current worldwide political instabilities, it will likely become ever more prominent in the cultural and political debate. Social media play an increasingly important role in how citizens debate opinions and react to local and global events. However, several studies point out the danger of social media as a breeding ground for online hate speech (or cyberhate). We propose a novel approach to the exploratory analysis of social phenomena based on the integration of automatic detection of cyberhate against immigrants with offline indicators. We gathered data from the Italian Twittersphere and from the main supplier of official statistical data in Italy (ISTAT). We developed a supervised classification model for hate speech detection, trained on a corpus of Italian tweets manually annotated for hate speech against immigrants, and used it to automatically annotate a large sample of geo-tagged tweets over a span of six years. We cross-referenced these data with the ISTAT data, exploring three macro-indicators related to employment, education and crime. We found correlations suggesting an interplay between economic and cultural factors and the expression of hate online.
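As a rough illustration of the cross-referencing step described in the abstract, the sketch below aggregates classifier labels into a per-region hate rate and correlates it with an offline indicator. It is not the authors' pipeline: the function names, the region keys, the toy values, and the choice of Pearson correlation are all assumptions made for this example, and the record does not state which correlation measure or level of geographic aggregation the study used.

```python
from collections import defaultdict
from math import sqrt


def hate_rate_by_region(labelled_tweets):
    """Return the share of tweets flagged as hateful in each region.

    labelled_tweets: iterable of (region, is_hate) pairs, where is_hate is the
    (hypothetical) output of a hate speech classifier for one geo-tagged tweet.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for region, is_hate in labelled_tweets:
        totals[region] += 1
        flagged[region] += int(is_hate)
    return {region: flagged[region] / totals[region] for region in totals}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def correlate(rates, indicator):
    """Correlate hate rates with an indicator over the regions both cover."""
    regions = sorted(set(rates) & set(indicator))
    return pearson([rates[r] for r in regions], [indicator[r] for r in regions])


# Toy, made-up inputs (not data from the study): three placeholder regions.
toy_tweets = [
    ("A", True), ("A", False),
    ("B", False), ("B", False),
    ("C", True), ("C", True),
]
toy_indicator = {"A": 10.0, "B": 5.0, "C": 15.0}  # e.g. an unemployment rate

toy_rates = hate_rate_by_region(toy_tweets)
toy_r = correlate(toy_rates, toy_indicator)
```

In a real analysis the indicator values would come from the ISTAT tables, and a coefficient computed this way would indicate association only, not a causal link.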
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.