
A quick analysis of Italian 2018 general election candidates' tweets

Written by a romantic on January 10, 2018

Introduction

The following data come from termometropolitico.it and refer to the period from 18 December 2017 to 24 December 2017. Only parties with at least 5% have been included.

| Party | Initials | % | Leader | Orientation |
| --- | --- | --- | --- | --- |
| Movimento 5 Stelle | M5S | 27.7 | Luigi Di Maio† | Anti-establishment |
| Partito Democratico | PD | 24.2 | Matteo Renzi | Centre-left |
| Forza Italia | FI | 15.7 | Silvio Berlusconi†† | Centre-right |
| Lega Nord | Lega | 13.7 | Matteo Salvini | Far-right |
| Liberi e Uguali | LeU | 6.9 | Pietro Grasso | Far-left |
| Fratelli d’Italia | FDI | 5.2 | Giorgia Meloni | Far-right |

†Many would argue that the de facto leader of the party is Beppe Grillo; since Luigi Di Maio won the most recent primary election, I decided to go with him.

††In the rest of the post I will refer to these politicians simply as candidates, even though Berlusconi may or may not be his party's actual candidate for prime minister.

The Analysis

Data Collection

The data were collected using the Twitter API (via tweepy). Specifically:

  • Used the Twitter API to get all the tweets posted by the candidates from January 1, 2017 to December 24, 2017 (inclusive). Retweets were ignored
  • The code for data collection is available here
  • For the duration of the analysis the tweets were stored in a local database to avoid re-querying the Twitter API multiple times (both raw and preprocessed data are available on GitHub)
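The selection rule described above (keep original tweets inside the date window, drop retweets) can be sketched as follows. This is only an illustration and not the code linked above: it works on plain dictionaries shaped like Twitter API v1.1 tweet payloads, the function names are made up for the example, and it assumes the window boundaries are taken in UTC, which the post does not specify.

```python
from datetime import datetime, timezone

# Format of the `created_at` field in Twitter API v1.1 tweet payloads.
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"

# Assumption: the window is interpreted in UTC, both ends included.
START = datetime(2017, 1, 1, tzinfo=timezone.utc)
END = datetime(2017, 12, 24, 23, 59, 59, tzinfo=timezone.utc)


def is_retweet(tweet):
    """A v1.1 payload carries a `retweeted_status` object only for retweets."""
    return "retweeted_status" in tweet


def in_period(tweet, start=START, end=END):
    """True if the tweet was created inside the [start, end] window."""
    created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
    return start <= created <= end


def select_tweets(tweets):
    """Keep original tweets posted inside the analysis window."""
    return [t for t in tweets if not is_retweet(t) and in_period(t)]


# Hypothetical payloads, reduced to the fields used here.
sample = [
    {"id": 1, "created_at": "Sun Jan 01 00:00:00 +0000 2017", "text": "Buon anno!"},
    {"id": 2, "created_at": "Mon Dec 25 08:00:00 +0000 2017", "text": "Buon Natale!"},
    {"id": 3, "created_at": "Fri Jun 02 10:30:00 +0000 2017", "text": "RT @someone: ciao",
     "retweeted_status": {"id": 99}},
]
kept = select_tweets(sample)
print([t["id"] for t in kept])
```

Storing the selected payloads locally (the post uses TinyDB for this) means the filter can be re-run and adjusted without hitting the API again.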

Descriptive Analysis

For each candidate, a number of descriptive statistics were computed. The full code can be found here.

| | GiorgiaMeloni | PietroGrasso | berlusconi | luigidimaio | matteorenzi | matteosalvinimi |
| --- | --- | --- | --- | --- | --- | --- |
| num_tweets | 617 | 143 | 651 | 420 | 518 | 3188 |
| average_length | 149.84 | 147.16 | 171.84 | 128.77 | 126.75 | 127.00 |
| average_hashes | 1.84 | 1.03 | 1.03 | 0.63 | 1.23 | 1.28 |
| average_mentions | 0.33 | 0.54 | 0.23 | 0.36 | 0.14 | 0.11 |
| average_links | 0.99 | 0.92 | 0.33 | 0.92 | 0.58 | 0.74 |

Note: because of the way Twitter works, average_links refers to both external links and images.
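As a rough illustration of how statistics like these can be derived from raw tweet text, here is a minimal sketch. It is not the linked code: the regular expressions and the `describe` helper are assumptions made for the example, and the real analysis may well have relied on the entities returned by the API instead. Counting URLs in the text also picks up attached images, since Twitter appends a short link for media.

```python
import re

# Simple patterns, assumed for this sketch only.
HASHTAG = re.compile(r"#\w+")
MENTION = re.compile(r"@\w+")
LINK = re.compile(r"https?://\S+")


def describe(tweets):
    """Compute per-candidate descriptive statistics from a list of tweet texts."""
    n = len(tweets)
    if n == 0:
        return {"num_tweets": 0, "average_length": 0.0, "average_hashes": 0.0,
                "average_mentions": 0.0, "average_links": 0.0}
    return {
        "num_tweets": n,
        "average_length": sum(len(t) for t in tweets) / n,
        "average_hashes": sum(len(HASHTAG.findall(t)) for t in tweets) / n,
        "average_mentions": sum(len(MENTION.findall(t)) for t in tweets) / n,
        "average_links": sum(len(LINK.findall(t)) for t in tweets) / n,
    }


# Hypothetical tweets, not taken from the dataset.
stats = describe([
    "Oggi a #Roma con @amico https://t.co/abc",
    "Grazie a tutti",
])
print(stats)
```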

Sentiment Analysis

I used the Polyglot package to compute some rough sentiment scores for each candidate’s tweets. I chose Polyglot as it’s one of the few packages to offer support for Italian. Note that Polyglot only offers a polarity score (-1.0, 0.0 or +1.0) for individual words. The sentiment score of a tweet was then computed by averaging the polarity scores of its tokens.

Specifically:

  • Loaded the preprocessed data from TinyDB
  • Computed polarity for each token via Polyglot
  • For each tweet, computed the average polarity (ignoring tokens with polarity equal to zero)
  • Labeled tweets as negative if their average polarity (i.e. sentiment score) was below zero, neutral if it was equal to zero, and positive if it was above zero
  • The code is available here
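The scoring and labeling steps above can be sketched in a few lines. To keep the example self-contained, a tiny made-up lexicon stands in for Polyglot's word polarities; the lexicon entries, the function names and the choice to score a tweet with no polarized tokens as 0.0 are all assumptions of this sketch, not the linked code.

```python
from collections import Counter

# Hypothetical polarity lexicon standing in for Polyglot's word polarities.
LEXICON = {"grazie": 1, "bellissimo": 1, "vittime": -1, "ucciso": -1}


def tweet_score(tokens, lexicon=LEXICON):
    """Average polarity of a tweet, ignoring tokens with polarity zero."""
    non_zero = [p for p in (lexicon.get(tok, 0) for tok in tokens) if p != 0]
    if not non_zero:
        return 0.0  # assumption: no polarized tokens means a neutral tweet
    return sum(non_zero) / len(non_zero)


def label(score):
    """Map a sentiment score to negative / neutral / positive."""
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"


def label_shares(tokenized_tweets):
    """Fraction of tweets falling under each label."""
    counts = Counter(label(tweet_score(t)) for t in tokenized_tweets)
    total = sum(counts.values())
    if total == 0:
        return {"negative": 0.0, "neutral": 0.0, "positive": 0.0}
    return {name: counts[name] / total for name in ("negative", "neutral", "positive")}


# Hypothetical, already tokenized tweets.
shares = label_shares([
    ["grazie", "a", "tutti"],
    ["ricordo", "le", "vittime"],
    ["oggi", "a", "palermo"],
    ["bellissimo", "ricordo", "vittime"],
])
print(shares)
```

Note that a tweet mixing one positive and one negative word averages to zero and ends up neutral, which is one reason these scores should be read as rough.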

The results are summarized below. Candidates are presented from far-left to far-right, with anti-establishment party M5S in the middle.

Figure: Sentiment Scores for Candidates' Tweets (in %). Bar chart with one bar per candidate, split into Negative, Neutral and Positive. Values shown in the chart, as fractions of each candidate's tweets:

| Candidate | Negative | Neutral | Positive |
| --- | --- | --- | --- |
| PietroGrasso | 0.2098 | 0.2308 | 0.5594 |
| matteorenzi | 0.1757 | 0.2857 | 0.5386 |
| luigidimaio | 0.2167 | 0.3357 | 0.4476 |
| berlusconi | 0.2550 | 0.2273 | 0.5177 |
| GiorgiaMeloni | 0.2545 | 0.2885 | 0.4571 |
| matteosalvinimi | 0.2694 | 0.3529 | 0.3777 |


Right-wing parties seem to have a higher percentage of negative tweets. While this could be the result of a deliberate communication strategy, it’s also important to note that the government was left-wing in 2017, and thus it makes sense for right-wing parties to be more critical of the overall economic and political state of the country.

Keywords Analysis

For each candidate, a list of 25 keywords was computed by analysing their tweets and comparing them against those of the other five candidates.

Specifically:

  • Loaded the preprocessed data from TinyDB
  • For each candidate, created a single string containing all of her or his tweets
  • Computed the TF-IDF matrix
  • Selected the 25 words with the highest TF-IDF score for each candidate
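To make the keyword step concrete, here is a minimal TF-IDF sketch that treats each candidate's concatenated tweets as one document. The post does not say which implementation or weighting was used, so the smoothed IDF below (the formula scikit-learn uses by default) and the `top_keywords` helper are assumptions for illustration only. Row normalization is skipped because it does not change the ranking of words within a document.

```python
import math
from collections import Counter


def top_keywords(documents, k=25):
    """Return the k highest-scoring TF-IDF terms for each document.

    `documents` maps a candidate name to the list of tokens from all
    of her or his tweets.
    """
    n_docs = len(documents)
    counts = {name: Counter(tokens) for name, tokens in documents.items()}

    # Document frequency: in how many candidates' documents a term appears.
    df = Counter()
    for term_counts in counts.values():
        df.update(term_counts.keys())

    # Assumed weighting: smoothed IDF, ln((1 + N) / (1 + df)) + 1.
    idf = {term: math.log((1 + n_docs) / (1 + d)) + 1 for term, d in df.items()}

    result = {}
    for name, term_counts in counts.items():
        scores = {term: tf * idf[term] for term, tf in term_counts.items()}
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        result[name] = ranked[:k]
    return result


# Hypothetical toy corpus with two "candidates".
docs = {
    "a": "mafia mafia palermo italia".split(),
    "b": "lega lega italia italia".split(),
}
keywords = top_keywords(docs, k=2)
print(keywords)
```

With this weighting, a word used by every candidate (such as italia in the toy corpus) is down-weighted but not eliminated, so very frequent shared words can still make a candidate's top list.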

The results are summarized below. Candidates are presented from far-left to far-right, with anti-establishment party M5S in the middle.

| PietroGrasso | matteorenzi | luigidimaio | berlusconi | GiorgiaMeloni | matteosalvinimi |
| --- | --- | --- | --- | --- | --- |
| storiedisangueamiciefantasmi | avanti | renzi | lintervista | italia | salvini |
| grazie | lingotto | governo | elezionisicilia | governo | lega |
| vittime | trenopd | oggi | tgcom24 | oggi | italia |
| senato | lavoro | diretta | italia | amministrative2017 | stopinvasione |
| anni | oggi | sceglieteilfuturo | musumecipresidente | sindaco | italiani |
| oggi | italia | italia | italiani | intervista | primagliitaliani |
| palermo | assembleapd | stelle | stato | italiasovrana | andiamoagovernare |
| mafia | insieme | rally | portaaporta | renzi | dimartedi |
| ricordo | scuolapd | movimento | settegiorni | italiani | lintervista |
| presto | futuro | sindaci5stelle | paese | roma | live |
| esempio | millegiorni | ospite | confapi | immigrati | ottoemezzo |
| libera | portaaporta | voto | governo | piazza | amici |
| ucciso | europa | sicilia | europa | tempodipatrioti | governo |
| libreria | grazie | tour | politica | appelloaipatrioti | portaaporta |
| impegno | istat | grazie | matrix | diretta | congressolega |
| italia | perché | solo | programma | anni | anni |
| legge | finestra | paese | anni | aspetto | renzi |
| stato | scienza | prima | tasse | fratelli | gabbiaopen |
| 9maggio | politica | grande | chetempochefa | europa | casa |
| liberieuguali | democratica | italiani | molto | immigrazione | video |
| piolatorre | andiamo | legge | solo | sostenere | agorarai |
| bellissimo | cosa | sera | fatto | seguitemi | immigrati |
| voce | prima | lotti | lavoro | atreju17 | diretta |
| auguri | tempo | renziconfessa | fiscale | candidatura | matrix |
| insieme | leopolda | rispettoperzuccaro | oggi | nazionale | facciamosquadra |

A few observations, in no particular order:

  • Many keywords in Grasso’s vocabulary refer to the Mafia (e.g., victims, 9may, Palermo, mafia, killed, etc.), which makes sense given that Grasso was for many years a prosecutor at the Court of Palermo;
  • Luigi Di Maio, Giorgia Meloni and Matteo Salvini all have renzi in their vocabulary (referring to Matteo Renzi, leader of PD). Interestingly, Berlusconi doesn’t;
  • Unsurprisingly, the two far-right parties both have many immigration-related keywords (e.g., immigrants, immigration, stoptheinvasion, italiansfirst, etc.);
  • Salvini is the only candidate to have his own name as a keyword. This may simply be the result of him having named his “sub-party” Noi con Salvini (en: Us With Salvini).

It’s also interesting to note that the candidates do not seem to be discussing the same issues from different points of view (e.g., immigration is good vs. immigration is bad) so much as talking about completely different topics. In other words, nobody seems to be offering a counter-narrative to the other candidates’.