Troll and divide: the language of online polarization

Abstract

The affective animosity between the political left and right has grown steadily in many countries over the past few years, posing a threat to democratic practices and public health. There is rising concern over the role that "bad actors," or trolls, may play in the polarization of online networks. In this research, we examined the processes by which trolls may sow intergroup conflict through polarized rhetoric. We developed a dictionary to assess online polarization by measuring language associated with communications that display partisan bias in their diffusion. We validated the polarized language dictionary in four different contexts and across multiple time periods. The polarization dictionary made out-of-set predictions, generalized to a new political context (#BlackLivesMatter) and a different social media platform (Reddit), and predicted partisan differences in public opinion polls about COVID-19. We then analyzed tweets from a known Russian troll source (N = 383,510) and found that their use of polarized language has increased over time. We also compared troll tweets from three countries (N = 79,833) and found that trolls in all three use more polarized language than regular Americans (N = 1,507,300) and have increased their use of polarized rhetoric over time. We also find that polarized language is associated with greater engagement, but this association only holds for politically engaged users (both trolls and regular users). This research clarifies how trolls leverage polarized language and provides an open-source, simple tool for exploring polarized communication on social media.


Study 1: Development of a Polarization Dictionary

Dictionary Development
We built on the work of Brady et al. (2017), which studied the diffusion of controversial political content on Twitter during discussions of climate change, same-sex marriage, and gun control. Their dataset included 24,849 tweets with available information on whether each tweet was polarized (retweeted within one political community) or not (retweeted by a user from the opposing ideology). We performed a differential language analysis, a procedure in which two groups are compared in their frequency of word use (Schwartz et al., 2013), on 80% of their data (the rest was kept for validation). We compared the word use of the polarized cluster with that of the nonpolarized cluster by calculating a chi-square statistic for every word in the dataset, resulting in a shortlist of words significantly associated with polarization.
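The per-word comparison can be sketched as a hand-rolled 2×2 chi-square over word presence versus cluster membership. The toy corpora and words below are invented for illustration, not the study's data:

```python
def chi_square(word, group_a, group_b):
    """Chi-square for a 2x2 contingency table: word presence x group.

    group_a / group_b are lists of tokenized tweets (sets of words).
    """
    a = sum(word in tweet for tweet in group_a)   # group_a tweets with the word
    b = len(group_a) - a                          # group_a tweets without it
    c = sum(word in tweet for tweet in group_b)   # group_b tweets with the word
    d = len(group_b) - c                          # group_b tweets without it
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

# Toy corpora: polarized vs. nonpolarized retweet clusters (hypothetical)
polarized = [{"attack", "fight"}, {"attack"}, {"fight"}]
nonpolarized = [{"nice"}, {"nice", "day"}, {"fight"}]

# Words with a larger chi-square are more strongly tied to one cluster
scores = {w: chi_square(w, polarized, nonpolarized)
          for w in ("attack", "fight", "nice")}
```

Words whose statistic exceeds a significance cutoff would form the shortlist.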
In the second step, we manually pruned the list of words (i.e., the dictionary) by filtering out names of individuals (e.g., Bernie) and topical words (e.g., antarctic) that would be unlikely to generalize to contexts outside the original research (Brady et al., 2017). The full dictionary (N = 256) was judged independently for pruning by two of the authors (A.S. and W.J.B.), and agreement reached a Cohen's κ of 0.61, z = 9.93, p < .001, 95% CI [.51, .72].
Remaining disagreements were discussed to reach convergence. Words that were associated with depolarization were removed. The pruned version of the dictionary consisted of 57 words.
Next, we used word embeddings, a vectorized representation of words that encompasses semantic fields, to expand the pruned dictionary. This process helped the dictionary capture a greater linguistic space while staying close to the semantic space implied from the dictionary.
The GloVe algorithm (Pennington et al., 2014) utilizes word co-occurrence in large corpora to create embeddings of 200 dimensions. We used a pre-trained GloVe model by Stanford NLP, which was built on 2 billion tweets (https://github.com/stanfordnlp/GloVe), to extract the five most semantically related words for each of the "seed" words from the prior step. For example, the word threat was expanded by the words threats, attacks, terrorism, targets, and threatening.
The fully expanded dictionary contained 232 words.
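The expansion step amounts to a cosine-similarity nearest-neighbor lookup in the embedding space. The tiny 2-d vectors below are made up for illustration; the study used 200-dimensional pretrained GloVe vectors:

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def expand(seed, embeddings, k=5):
    """Return the k words most similar to `seed` (excluding the seed itself)."""
    sims = {w: cosine(embeddings[seed], vec)
            for w, vec in embeddings.items() if w != seed}
    return sorted(sims, key=sims.get, reverse=True)[:k]

# Hypothetical toy embeddings; real GloVe vectors have 200 dimensions
emb = {
    "threat":  (1.00, 0.10),
    "threats": (0.95, 0.12),
    "attacks": (0.80, 0.30),
    "peace":  (-1.00, 0.20),
    "hello":   (0.00, 1.00),
}
neighbors = expand("threat", emb, k=2)
```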
In the final step, we trimmed proper names (e.g., Obama) and nonsensical additions (e.g., prettylittleliars). This time there was perfect agreement between the raters in applying the two rules, which resulted in the removal of 27 words (the final dictionary contained 205 words; the word lists with raters agreement are found on the Online Repository).

Internal Consistency
Conducting psychometric assessments of dictionaries is a well-known issue in text analysis (Pennebaker et al., 2007). Especially in the context of social media, and even more so with Twitter data, it is important to establish what the unit of analysis is in the psychometric evaluation. To analyze internal consistency, we grouped together tweets by the same authors. Our training set originally consisted of 19,841 tweets; after grouping tweets by author, the training corpus consisted of 7,963 observations. To assess internal consistency with the binary method (Pennebaker et al., 2007), we computed a binary occurrence matrix of the dictionary elements, wherein each word in the dictionary is treated as an item in the "questionnaire" (i.e., the dictionary), and obtained a Cronbach's alpha of 0.75, 95% CI [0.75, 0.76].
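A minimal sketch of this reliability check: each dictionary word is an "item", each author a "respondent", and a cell marks whether the author ever used the word. The occurrence matrix below is invented for illustration:

```python
from statistics import pvariance

def cronbach_alpha(matrix):
    """Cronbach's alpha for rows = respondents (authors), cols = items (words).

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores).
    Population vs. sample variance cancels in the ratio, so either works.
    """
    k = len(matrix[0])                               # number of items
    items = list(zip(*matrix))                       # columns of the matrix
    item_var = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in matrix])
    return k / (k - 1) * (1 - item_var / total_var)

# Binary occurrence matrix: did author i ever use dictionary word j? (toy data)
occurrence = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
]
alpha = cronbach_alpha(occurrence)
```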

Reddit Analysis
We extracted Reddit comments from 36 politically mapped subreddits (Soliman et al., 2019). The list of subreddits and their political orientation is shown in Table S2.
Since many messages on Reddit do not contain more than a title, we combined the title and the body of each message into a unified text variable. We then removed links and emoticons and filtered out deleted or removed messages, as well as messages in languages other than English. Reddit messages were collected through the Pushshift API using the rreddit R package (Kearney, 2019).

Results
We applied the dictionary to the Reddit sample (political left, political right, and control group) and conducted a one-way between-groups ANOVA. Results show a significant effect of political group, F(2, 49227) = 610.65, p < .001, ηp² = .024, which was followed by a planned comparison reported in the main text. The second analysis included a neutral sample (NeutralPolitics) instead of control messages collected from a random sample of popular communities. Again, a one-way between-groups ANOVA showed a significant effect of political group, F(2, 42633) = 14.51, p < .001, ηp² < .001, which was followed by a planned comparison reported in the main text.
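The group comparison can be sketched as a hand-computed one-way ANOVA F statistic over per-message dictionary scores (the scores below are toy numbers, not the study's data):

```python
def one_way_f(groups):
    """F statistic for a one-way between-groups ANOVA."""
    all_vals = [x for g in groups for x in g]
    grand = sum(all_vals) / len(all_vals)
    k, n = len(groups), len(all_vals)
    # Between-groups sum of squares: group means vs. the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-groups sum of squares: observations vs. their group mean
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical polarization scores for left, right, and control messages
left, right, control = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_stat = one_way_f([left, right, control])
```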

Studies 2 and 3: Term frequency-inverse document frequency analysis
To better understand the type of language that drives differences in polarized language between trolls and American controls, we conducted a term frequency-inverse document frequency (tf-idf) analysis, a statistical procedure that quantifies a word's importance in a document by weighting its frequency against its base rate across the corpus. We selected only the words that appear in the polarization dictionary and ranked them by their tf-idf value.
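A minimal sketch of the tf-idf ranking, treating each group's pooled messages as one document. The corpora, the raw-count tf, and the log idf here are illustrative assumptions, not the study's exact weighting scheme:

```python
import math

def tf_idf(word, doc, corpus):
    """tf-idf with raw term frequency and log inverse document frequency."""
    tokens = doc.split()
    tf = tokens.count(word) / len(tokens)
    df = sum(word in d.split() for d in corpus)   # documents containing word
    return tf * math.log(len(corpus) / df) if df else 0.0

# Toy corpora: all troll tweets vs. all control tweets, pooled per group
corpus = {
    "trolls":   "attack attack fight media",
    "controls": "media weather game game",
}
docs = list(corpus.values())

# Rank dictionary words by their importance within the troll document
dictionary = ["attack", "fight", "media"]
ranked = sorted(dictionary,
                key=lambda w: tf_idf(w, corpus["trolls"], docs),
                reverse=True)
```

A word used in both corpora (here "media") gets an idf of zero, so the ranking surfaces words distinctive to one group.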

Hierarchical Clustering
To conduct a thematic clustering of the polarization dictionary, we extracted GloVe word embeddings (Pennington et al., 2014). We then lemmatized the words in the dictionary and, for words sharing a lemma, took the average embedding of that lemma, resulting in 170 words for clustering. Next, we conducted a hierarchical clustering analysis and cut the resulting tree at the highest level of division (two clusters). See the dendrogram in Figure S2.
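The clustering step can be sketched with a small agglomerative routine on toy 2-d embeddings. The vectors and the single-linkage rule here are illustrative assumptions; the study used GloVe vectors and standard hierarchical clustering:

```python
import math

def hcluster(emb, k=2):
    """Single-linkage agglomerative clustering down to k clusters."""
    def dist(c1, c2):
        # Smallest pairwise distance between the two clusters
        return min(math.dist(emb[a], emb[b]) for a in c1 for b in c2)

    clusters = [{w} for w in emb]                 # start: one cluster per word
    while len(clusters) > k:
        # Find and merge the two closest clusters
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        clusters[i] |= clusters.pop(j)
    return [frozenset(c) for c in clusters]

# Toy word embeddings forming two obvious semantic groups (hypothetical)
emb = {"hate": (0.0, 0.0), "anger": (0.1, 0.0),
       "tax": (5.0, 5.0), "budget": (5.1, 5.0)}
clusters = hcluster(emb, k=2)
```

Cutting at the highest division corresponds to stopping at k = 2.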

Results
We applied the two subsets of the polarization dictionary to the social media messages posted by trolls and by a random sample of American users across time. As in Studies 2 and 3, we calculated monthly polarization scores and conducted a weighted linear regression predicting polarized language as a function of time, dictionary subcomponent, and their interaction, with monthly observation counts as the weighting factor. We were interested in whether the slopes of the two dictionary components differ in each group. While there were no significant interactions in the Russian or Venezuelan groups, we found that in American controls, issue polarization had a positive but nonsignificant slope, b = 0.0004, SE = 0.0005, 95% CI [-0.0005, …].

In the current exploratory study, we showed that the polarization dictionary is composed of different subcomponents that map onto theoretical elements of polarization. In addition, we show that the lack of a significant polarization trend in American controls could be attributed to diverging trends in affective and issue polarization. On closer inspection, affective polarization showed a significant negative trend; however, this trend was driven by a relatively high value that was given the most weight, namely August 2017. When this month was omitted from the analysis, the negative trend was no longer significant, b = -0.001, SE = 0.0005, 95% CI [-0.0017, 0.0001].
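The weighted trend analysis can be sketched as a closed-form weighted least-squares slope, with monthly observation counts as weights. All numbers below are invented for illustration:

```python
def weighted_slope(x, y, w):
    """Closed-form slope of a weighted least-squares line y ~ x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    num = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    den = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    return num / den

# Months since start, mean monthly polarization score, monthly message counts
months = [0, 1, 2, 3, 4]
score  = [0.0, 2.0, 4.0, 6.0, 8.0]   # exactly y = 2x, so the slope must be 2
counts = [10, 50, 30, 80, 20]
b = weighted_slope(months, score, counts)
```

Down-weighting low-count months is what makes a single heavily weighted month (like August 2017 above) able to dominate the fitted trend.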
Interestingly, in August 2017 the United States experienced one of the most contentious events in its recent history: the "Unite the Right" rally in Charlottesville, Virginia, an exemplar of a hyper-polarized event, in which a white supremacist killed one person and injured 19 others (Tien et al., 2020). Therefore, while contributing to a potentially inaccurate trend, the high level of affective polarization in August 2017 does make sense given the context.

Shaded areas around the regression line denote 95% CI. The size of the dots corresponds to the monthly sample size. Note that the Y-axis is fixed to 0-5; data points exceeding this limit are not shown in the figure, but the regression lines take these observations into account.

Figure S5. Polarized language predicts retweets in political Russian trolls. The graph depicts the number of retweets predicted for a given tweet as a function of polarized language present in the tweet and type of troll. Bands reflect 95% CIs. For constant Y-axes, see Figure 4.