Entropy-based detection of Twitter echo chambers

Abstract
Echo chambers, i.e. clusters of users exposed to news and opinions in line with their previous beliefs, were observed in many online debates on social platforms. We propose a completely unbiased entropy-based method for detecting echo chambers. The method is completely agnostic to the nature of the data. In the Italian Twitter debate about the Covid-19 vaccination, we find a limited presence of users in echo chambers (about 0.35% of all users). Nevertheless, their impact on the formation of a common discourse is strong, as users in echo chambers are responsible for nearly a third of the retweets in the original dataset. Moreover, in the case study observed, echo chambers appear to be a receptacle for disinformative content.


Introduction
In the virtual world, the tendency to seek out information that confirms existing beliefs and to interact with users who share similar opinions leads to the formation of echo chambers, i.e., 'bounded, enclosed media spaces that have the potential to both amplify the messages delivered within it and insulate them from rebuttal' [1][2][3][4]. A more detailed review of the literature about echo chambers in online social networks can be found in Section 1 of the Supplementary Material. We thus have two key events in echo chamber formation: i) interaction between users with similar opinions; ii) exposure of users to the same news articles.
This paper studies the actual presence of echo chambers in social networks by detecting the overlap of the two events. The detection is done by adopting an entropy-based technique. The platform considered in this study is Twitter/X. From now on, we will refer to the platform by its former name, Twitter, since the analyses were conducted before the change in the company name to X.
Recently, entropy-based null models have been introduced in studies of complex networks as an unbiased benchmark capable of revealing non-trivial structures of real systems [5], and thus they represent the appropriate framework for our analysis. Fig. 1 shows how we intend to assess the occurrence of the two events and, consequently, the occurrence of the echo chamber. Assessing the opinions of the various accounts is not an easy task, but they can be inferred from the interactions among the accounts. In Refs. [6][7][8][9], a method to infer the presence of a discursive community, i.e. a group of accounts contributing to the formation of a common discourse, was presented. It is based on verified users, i.e. those accounts for which Twitter has a procedure to check the identity of their owners. Verified accounts mainly belong to politicians, journalists, and celebrities; usually, they are strong creators of content [6][7][8] and are among the greatest contributors to the formation of a common discourse. It is possible, then, to let similarities emerge among the content created by verified users, based on the behavior of their common audience in terms of retweets, since retweets are considered a measure of engagement [10][11][12]. In detail, for each pair of verified users, the number of common retweeters is counted. If the number is statistically significant with respect to an entropy-based benchmark, it is validated and we project a link between the pair of verified users. On the monopartite network of verified users thus obtained, we run a community detection algorithm to extract groups of similar verified users (i.e., the Verified users' DiCos in Fig. 1). Then, the various communities of verified users are labeled in terms of the users who belong to them (since the users are verified, it is possible, for example, to derive their political leanings and test a posteriori the resulting communities). At this point, the labels are extended to unverified users using a label propagation algorithm [13] on the entire retweet network, thus encompassing both verified and unverified users.
Once again, the use of the retweet network for label propagation is motivated by the fact that there is evidence that users belonging to communities in a retweet network share similar views [10][11][12]. In the following, such communities will be called discursive communities (or DiCos), and their detection is sketched in the top path of Fig. 1. Discursive communities embrace those users who contribute to the formation of a common discourse.
Regarding the exposure to the same news articles, we approach its assessment by analyzing the ties between the users and the URLs present in their tweets and retweets. The bottom path of Fig. 1 shows the approach. The idea of leveraging the bipartite network of users and URLs was already considered in Ref. [14] for Facebook: in the present case, we translate the idea therein to Twitter. Again, the procedure goes through a comparison between observations and an entropy-based benchmark: if two users tweeted (or retweeted) the same URLs significantly more than the benchmark, we conclude that the two users share the same information diet in a statistically significant way. We can thus identify groups of users sharing the same URLs. In the following, user communities that passed the validation are called news engagement communities of users, for short user NECs. User NECs contextualize the second event: exposure of users to the same news articles.
We are now able to identify groups of users exposed to the same news articles (user NECs) and groups of users who share a common discourse (DiCos). Users who share a group of the first type and a group of the second type form an echo chamber, provided they interact with each other. The interaction for us is that of retweets, since retweets are considered a form of endorsement of the content created by others [6][10][11][12]. Verifying user interactions is an important step because accounts belonging to the same user NEC may either not belong to the same DiCo or, even in the case where they are in the same discursive community, may not interact with each other. In this sense, only users who i) belong to the same user NEC, ii) belong to the same DiCo, and iii) are connected, even indirectly, through retweets (i.e., they form a weakly connected component in the retweet network) can be said to represent an echo chamber.
As a case study for evaluating the presence of echo chambers, we consider the online debate on Twitter regarding the Covid-19 vaccination campaign. Surprisingly, compared to numerous examples found in the literature, we find a limited presence of echo chambers in the analyzed dataset, mainly due to the small size of the user NECs. Although the detected echo chambers are composed of a small number of users with respect to the total number of active users, they play a significant role in terms of retweet interactions: the echo chambers that emerged in the case study of the Covid-19 vaccination debate have a significant impact on the creation of a common and cohesive discourse that is not devoid of disinformation. Furthermore, users who belong to such echo chambers show the same ideas and opinions almost two years later.

Contributions:
The main contribution of this paper is a novel unbiased method for echo chamber detection. The procedure is based on the very definition of echo chambers and involves the application of an entropy-based null model to discard signals that are compatible with noise.

Research questions:
Keeping in mind that our ultimate goal is to observe if and when discursive communities and news engagement communities of users overlap, thus forming echo chambers, we organize the structure of the paper to answer the following research questions (RQs):
• RQ.1: What are the characteristics of the discursive communities (DiCos) and of the news engagement communities of users (user NECs)? Are there users in common?
• RQ.2: What is the relation between the emergent echo chambers and the presence of disinformation, if any?

Dataset
Our dataset consists of ∼1.87M tweets in Italian and 136k users; about 220k tweets contain URLs. We relied on Twitter's streaming API, and data were collected from September 1st to September 24th, 2021. The data collection was keyword-based and related to the online debate on COVID-19 vaccination. The keywords are compatible with chronicles regarding the vaccination debate in Italy at the time of data collection. We remind the reader that Twitter's streaming API returns any tweet containing those terms in the text of the tweet, as well as in its metadata. It is worth noting that it is not always necessary to have each permutation of a specific keyword in the tracking list. For example, the keyword 'COVID' would also return tweets that contain 'COVID19' or 'COVID-19'. The keywords for the data collection are in Section 2 of the Supplementary Information.

Discursive Communities
Fig. 2 (top) describes the characteristics of the main discursive communities (DiCos) that emerge from the data. We recall that it is possible to assign labels to verified accounts, as the identity of their owner has been certified by the platform. Starting from the original dataset, we run the community detection algorithm [15] on the validated network of verified users and the label propagation algorithm [13] on the network of retweets of the different communities.

News Engagement Communities (NECs) of users
Table 1 shows that of all users who have published at least one post with a URL (∼33k), only 566 are part of a user NEC, which is less than 2%. Accounts in user NECs are proportionally much more active in publishing URLs than users not validated by our procedure (67.7 vs. 5.90 URLs per account). The left panel of Fig. 3 shows how the 566 users cluster into different user NECs, while the right panel provides a statistical view of the 566 users associated with the user NECs. On the right, the top doughnut chart illustrates the largest communities based on the number of users. Together, these prominent user NECs (IDs 0, 1, 2, 3, 4, 5, and 6) account for at least 95% of the total user population within this type of community. Furthermore, the lower doughnut chart shows that these communities have the highest frequency of tweets containing URLs. Communities 1, 2, 3, 4, and 5 collectively account for over 78% of the total URL traffic generated by all user NECs. An analogous analysis of the URL NECs, i.e. the communities detected on the validated projection on the layer of the URLs, can be found in Section 3 of the Supplementary Information.

Echo chambers
Our analysis shows that all but 1 of the 566 users in the user NECs are also part of the same discursive community, i.e. FdI-L-Media. This is the discursive community with users affiliated with political parties Fratelli D'Italia and Lega, and news outlets showing similar leanings. However, the fact that all users in the user NECs belong to the same DiCo only tells us that users with similar 'information diets' contribute to the formation of the same discourse, but not that they influence each other and reinforce the opinions of their siblings. In other words, users who refer to the same news sources may never meet on the platform. In fact, the information about who interacts with whom is not used to detect user NECs.
As mentioned in the introduction, users in an echo chamber are users who share a common discourse, are exposed to the same news sources, and are exposed to the same opinions. Being exposed to the same opinions, translated to Twitter, means that they retweet each other. In this sense, if users in the same user NEC form a (weakly) connected component in the same DiCo-induced subgraph of the retweet network (i.e., if there is a flow of influence in the retweet network that is restricted to nodes in the same discursive community), they form an echo chamber.
The analysis of the weakly connected component shows that 92 users do not belong to it. This leaves 473 users trapped in echo chambers. In particular, all users in user NECs 8, 9, and 10 did not retweet others in the same user NEC on the topic under analysis. Regarding the other user NECs, we observe that for each of them, most of the nodes form echo chambers. In the following, echo chambers inherit the ID of their user NEC. Some echo chambers are relatively large: for example, those induced by user NECs 1 and 2 contain more than 100 nodes.
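The three conditions above can be illustrated with a minimal sketch. The input formats (`retweets` as pairs, `dico` and `nec` as dictionaries) and the function name are hypothetical, and the sketch assumes one particular reading of the definition: two users of the same user NEC count as connected if an undirected retweet path joins them using only nodes of the same DiCo, even if the intermediate nodes are not in the NEC.

```python
from collections import defaultdict

def echo_chambers(retweets, dico, nec):
    """Group user-NEC members that are weakly connected inside their DiCo.

    retweets: iterable of (retweeter, author) pairs (hypothetical format)
    dico:     dict user -> discursive community label
    nec:      dict user -> user NEC id (only validated users appear here)
    Returns a dict nec_id -> list of sets of users.
    """
    # Undirected adjacency, keeping only retweets inside a single DiCo.
    adj = defaultdict(set)
    for u, v in retweets:
        if u != v and u in dico and v in dico and dico[u] == dico[v]:
            adj[u].add(v)
            adj[v].add(u)

    # Weakly connected components of the DiCo-restricted retweet network.
    component = {}
    for start in adj:
        if start in component:
            continue
        component[start] = start
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in component:
                    component[nb] = start
                    stack.append(nb)

    # NEC members sharing NEC id, DiCo label, and component.
    groups = defaultdict(set)
    for user, nec_id in nec.items():
        if user in component:
            groups[(nec_id, dico[user], component[user])].add(user)

    # A single isolated member is not counted as an echo chamber here.
    chambers = defaultdict(list)
    for (nec_id, _, _), users in groups.items():
        if len(users) >= 2:
            chambers[nec_id].append(users)
    return dict(chambers)
```

In this reading, users of a NEC who never retweet within their DiCo simply drop out, which mirrors the role of the connectivity check described in the text.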
To study how much users in echo chambers are connected, we use the undirected clustering coefficient: ignoring the direction of the edges, it captures the observed frequency of interactions between the neighbors of each node [16].
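As an illustration of the quantity used here, the following sketch computes the undirected local clustering coefficient from a list of retweet pairs, ignoring edge direction. The function names and the edge-list format are assumptions made for the example, not the implementation used in the study.

```python
from itertools import combinations

def undirected_clustering(edges):
    """Local clustering coefficient of each node, ignoring edge direction."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # self-retweets do not contribute to triangles
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    coeff = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeff[node] = 0.0  # no pair of neighbours to close a triangle
            continue
        # Number of neighbour pairs that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeff[node] = 2.0 * links / (k * (k - 1))
    return coeff

def average_clustering(coeff, users):
    """Mean coefficient over a group of users (e.g. one echo chamber)."""
    present = [u for u in users if u in coeff]
    return sum(coeff[u] for u in present) / len(present) if present else 0.0
```

Averaging the coefficients over the members of one echo chamber, and separately over the remaining nodes of the benchmark component, gives the kind of comparison discussed in this section.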
We compare the clustering coefficients of the echo chambers with the one measured on the Largest Weakly Connected Component (LWCC) of the retweet network restricted to users in the FdI-L-Media DiCo.In this way, we have a benchmark that captures the main contribution to the discourse to which the echo chambers belong.
The clustering coefficient associated with users in echo chambers is more than three times as high as that for other users within the LWCC (0.56 compared to 0.16, left panel of Fig. 4). We then examine the average clustering coefficient within each echo chamber. The right panel of Fig. 4 shows that the average clustering coefficients of echo chambers 2, 4, and 11 are greater than 0.6.
High values of the clustering coefficient imply that accounts are highly connected and frequently retweet each other. Therefore, we can conclude that their endorsement activity contributes to the reinforcement of their opinions. Such a conclusion is confirmed by a manual examination of the content shared by users in echo chambers after almost 2 years. At the time of the data collection, the opinions of the users were strongly against the Covid-19 vaccination. After 2 years, the positions of these users still adhere to conspiracy theories and have become particularly extreme.
To provide a concrete example, we will focus on the content shared in echo chamber 4. In practice, we first manually extract the main narratives from the news shared within echo chamber 4, focusing on the users with the highest number of followers at the time of data collection. Then, still focusing on the users with the highest number of followers, we analyze whether there are signals of these narratives in their most recent posts (as of June 7, 2023) and which narratives they currently support. In echo chamber 4, there are about 1.7k unique news items that have been shared about 7.3k times in total. First, we exclude the news items with connection errors at the time of this analysis (1k shares) and those that have been shared fewer than 10 times. Then, we analyze the resulting news narratives, which amount to ∼3.3k shares and 146 unique URLs from 51 different domains. By classifying only these 146 news stories, we cover about 45% of the total URL traffic within echo chamber 4. Table 2 shows the narratives' distribution and their descriptions: the main 8 narratives are all against vaccination and government regulations.
Table 3 shows the narratives supported by the users in echo chamber 4 with the most followers, almost two years after data collection (June 7, 2023). Users hold extreme views on current controversial issues such as the war in Ukraine, migrants, and LGBT issues. Remarkably, conspiracy theories about vaccines are still present in their narratives.

Echo chambers, their role in the common discourse and the plague of misinformation
Figure 5 shows the flow of retweets within an echo chamber and between different echo chambers.
Node −1 represents all nodes in the DiCo that are not part of an echo chamber, and an arrow indicates that tweets published by the source group are retweeted by a certain number of users in the target group. Self-loops represent retweet activity within the same group. The values on the edges indicate the number of retweets associated with each interaction. Although the echo chambers are composed of a small number of users (on the order of $10^2$, compared to the total number of DiCo users, on the order of $10^4$), they contribute significantly to the DiCo's retweet activity. Echo chambers are involved in generating about 288k retweets, while users not in echo chambers generate about 569k retweets. More specifically, echo chambers 2 and 3 are mainly composed of popular users (in terms of received retweets), while others (0, 1, 4) are mainly composed of retweeting users. To quantify the presence of misinformation in echo chambers, we have tagged URLs in our dataset that point to news sites. The labels are those that the NewsGuard journalistic organization has assigned to online media outlets. Use of the labels has been licensed to the authors of this article. More details about the reputability measure implemented in the present manuscript can be found in Section 4 of the Supplementary Information.
Figure 6 shows the number of URLs pointing to news from publishers that NewsGuard classifies as 'Trustworthy' (T), 'Not Trustworthy' (N), and 'Unclassified' (UNC) for the entire dataset and for each type of user community. If the same URL is shared multiple times by users in the same group, this multiplicity is taken into account in the analysis.
The first observation is that the differences between user NECs and echo chambers are negligible. Second, DiCos cover almost the entire volume of both T and N traffic. Remarkably, while the ratio between untrusted and trusted URLs is around 0.5 for the entire dataset, the ratio is almost reversed for echo chambers: the frequency of N news sources is almost twice that of T news sources.
Section 5 of the Supplementary Information shows even more alarming results regarding the spread of disinformation in echo chambers. We do not show these results in the main text due to space limitations, but i) the probability that a link shared by a user in an echo chamber refers to an untrustworthy news source is 0.377, compared to 0.129 for users outside echo chambers; ii) the probability that a link shared by a user in an echo chamber refers to a trustworthy news source is 0.232, compared to 0.379 for users outside echo chambers.
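A minimal sketch of this counting, with multiplicity, is given below. The label dictionary is purely illustrative (the actual labels are proprietary), the domains are placeholders, and the function name is hypothetical.

```python
from collections import Counter

def trust_profile(shared_domains, labels):
    """Count shares per label, keeping repeated shares of the same source.

    shared_domains: list of news domains, one entry per share
    labels:         dict domain -> 'T' or 'N'; unknown domains become 'UNC'
    Returns (counts, fractions) over the labels T, N and UNC.
    """
    counts = Counter(labels.get(domain, "UNC") for domain in shared_domains)
    total = sum(counts.values())
    fractions = {}
    if total:
        fractions = {lab: counts[lab] / total for lab in ("T", "N", "UNC")}
    return counts, fractions
```

Applied separately to the shares of each group of users, the fractions give the empirical probability that a shared link points to a source with a given label.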

Conclusion
In this paper, we propose a novel unbiased method to detect echo chambers. The method is mainly based on two observations. First, echo chambers form when users interact with others who share similar opinions and refer to the same news. Second, a proper null model should be implemented to detect a true signal. This necessity has recently been highlighted in the literature on online social networks and has been shown to be particularly important for the detection of non-trivial phenomena [6,7,9,17-19]. Our echo chamber detection method is based on the validation of observed structures by comparison with a proper maximum entropy null model; the maximization of entropy guarantees the unbiased nature of the benchmark.
We tested our procedure on a dataset containing the Italian Twitter debate on Covid-19 vaccination: we found that our procedure detects a low presence of echo chambers (just under 0.35% of all users in our dataset belong to an echo chamber). All the echo chambers we detected are part of the same discursive community, i.e. a community of users with similar political positions. Even if their size in terms of the number of users is limited, their impact on the shared discourse is remarkable: echo chambers are responsible for almost a third of the retweets in their discursive communities.
The methodology can be extended to other online social networks. In fact, it is based on i) the analysis of the activity of accounts that share URLs to news sources and ii) the detection of discursive communities. While the extension of the former to other online social networks is straightforward, the latter may be more problematic: in the present case, we used the activity of verified users, who are among the main content creators on Twitter [6], but not all social platforms have such certification. Nevertheless, when analyzing other platforms, we can still focus on users who are particularly active in creating new content, such as influential users as defined in [20].
Not unlike other studies, our study has some limitations, which we believe do not affect our final conclusions. First, it may be argued that the validation procedure is quite strict: the validation of multiple p-values leads to the validation of extreme events. While this is true, it is the only way to eliminate random noise from the system and analyze the true signal (see Section 7 in the Supplementary Information for more details). Second, the main idea of echo chambers is that users follow accounts with similar ideas, while in the present study only the retweet network is used, not the information about friendships. Still, the retweet network captures the effective interactions with interesting content as perceived by different users, whether it comes from friends or is suggested by the platform itself: focusing only on friendship will not fully capture the effect of the platform's recommendation algorithm.

Network analysis methods
Recently, De Clerck et al. stressed the importance of using proper statistical benchmarks for the analysis of Online Social Networks [18,19]: such systems are affected by strong noise, and detecting genuine signals is fundamental in order to draw the proper conclusions. Accordingly, our procedure for the detection of echo chambers is based on the statistical validation of different co-occurrence networks. Co-occurrences are implicitly based on a bipartite structure: if we count, for instance, the number of URLs that have been shared by both users of a pair, we are implicitly projecting the information contained in a bipartite network, in which layers represent users and URLs, on the layer of users. Therefore, including the bipartite information in the analysis of the observed co-occurrences provides a more accurate benchmark.
A general framework for providing unbiased benchmarks for the analysis of complex networks was recently proposed in the literature [5], inspired by the derivation of Statistical Physics from Information Theory by Jaynes [21]. The main idea is to first create an ensemble of all graphs having the same number of nodes as the real system. We can then define the Shannon entropy associated with the ensemble: in order to have a maximally random benchmark, we maximize the Shannon entropy, constraining some defining quantities of the system. In this sense, by comparing the real network with our null model, all observations that cannot be explained by the constraints can be captured. Constraints can be global, such as the total number of links, or local, such as the degree sequence, i.e. the number of connections per node.
In the following, we will first introduce the Bipartite Configuration Model (BiCM, [22]), i.e. the application of the procedure described above to bipartite networks in which the degree sequences are the constraints. Then we will describe the validation procedure for co-occurrences proposed in Ref. [23]. Both the BiCM and the validation procedure used in the present manuscript were performed using the bicm Python module included in NEMtropy; the methods used to solve the BiCM system of equations implemented in NEMtropy and bicm can be found in Ref. [24].
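To make the notion of co-occurrence concrete, the sketch below counts, for every pair of users, the number of URLs they both shared, which is the unvalidated projection of the user-URL bipartite network on the user layer. The input format and function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def co_occurrences(user_urls):
    """Count URLs shared in common by each pair of users.

    user_urls: dict user -> set of URLs shared by that user
    Returns a dict (user_a, user_b) -> count, with user_a < user_b.
    """
    # Invert the bipartite network: URL -> users who shared it.
    by_url = defaultdict(set)
    for user, urls in user_urls.items():
        for url in urls:
            by_url[url].add(user)

    # Every URL contributes one co-occurrence to each pair of its sharers.
    counts = defaultdict(int)
    for users in by_url.values():
        for a, b in combinations(sorted(users), 2):
            counts[(a, b)] += 1
    return dict(counts)
```

These raw counts are what the null model is compared against: a pair is kept only if its count is too large to be explained by the degrees of the two users and of the URLs involved.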

Formalism
In a bipartite network, nodes are divided into two sets, called layers, and links exist only between nodes belonging to different layers. Given a bipartite network $G_{\text{Bi}}$, let us call its layers $\top$ and $\bot$, and $N_\top$ and $N_\bot$ their respective sizes. A bipartite binary network is then completely described by its biadjacency matrix $\mathbf{B}$, i.e. an $N_\top \times N_\bot$ rectangular matrix whose generic entry $b_{i\alpha}$ is 1 if there exists a link connecting node $i \in \top$ and node $\alpha \in \bot$, and 0 otherwise. The degree of a generic node $i \in \top$ ($\alpha \in \bot$) is simply $k_i = \sum_{\alpha \in \bot} b_{i\alpha}$ ($h_\alpha = \sum_{i \in \top} b_{i\alpha}$). In the following, quantities related to real networks will be indicated with an asterisk $*$.

BiCM
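Let us call $\mathcal{G}_{\text{Bi}}$ the ensemble of graphs of the Bipartite Configuration Model, in which each representative graph $G_{\text{Bi}} \in \mathcal{G}_{\text{Bi}}$ is an $N^*_\top \times N^*_\bot$ bipartite network. We define the Shannon entropy associated with the system as

$$S = -\sum_{G_{\text{Bi}} \in \mathcal{G}_{\text{Bi}}} P(G_{\text{Bi}}) \ln P(G_{\text{Bi}}). \tag{1}$$

We can perform a constrained maximization of the Shannon entropy using the method of Lagrangian multipliers, the constraints being the degree sequences of both layers, i.e. $k_i, \forall i \in \top$, and $h_\alpha, \forall \alpha \in \bot$. In this way, we obtain a benchmark that is maximally random, but in which the average degree sequences are equal to the ones observed in the real system. Therefore, by observing deviations from the null model we will detect all structures of the real system that cannot be simply explained by the constraints. Such a procedure can be achieved through the maximization of the function $S'$ defined as

$$S' = S + \beta \Big[ 1 - \sum_{G_{\text{Bi}} \in \mathcal{G}_{\text{Bi}}} P(G_{\text{Bi}}) \Big] + \sum_{i \in \top} \theta_i \Big[ k_i^* - \sum_{G_{\text{Bi}} \in \mathcal{G}_{\text{Bi}}} P(G_{\text{Bi}})\, k_i(G_{\text{Bi}}) \Big] + \sum_{\alpha \in \bot} \eta_\alpha \Big[ h_\alpha^* - \sum_{G_{\text{Bi}} \in \mathcal{G}_{\text{Bi}}} P(G_{\text{Bi}})\, h_\alpha(G_{\text{Bi}}) \Big],$$

where $S$ is the Shannon entropy defined in Eq. 1 and $\beta$, $\theta_i$ and $\eta_\alpha$ are the Lagrangian multipliers associated, respectively, with the normalization of the probability, with the degree sequence on layer $\top$ and with the degree sequence on layer $\bot$. The maximization of $S'$ returns a probability per graph that can be written in terms of independent probabilities per link [25]:

$$P(G_{\text{Bi}}) = \prod_{i \in \top} \prod_{\alpha \in \bot} p_{i\alpha}^{\,b_{i\alpha}} (1 - p_{i\alpha})^{1 - b_{i\alpha}}, \qquad p_{i\alpha} = \frac{e^{-\theta_i - \eta_\alpha}}{1 + e^{-\theta_i - \eta_\alpha}}. \tag{2}$$

Eq. 2 is just formal, since we do not know the numerical values of the Lagrangian multipliers $\theta_i$ and $\eta_\alpha$. These can be obtained through the maximization of the likelihood of observing the real system [26,27]. It can be shown that maximizing the likelihood is equivalent to setting

$$\langle k_i \rangle = \sum_{\alpha \in \bot} p_{i\alpha} = k_i^*, \ \forall i \in \top; \qquad \langle h_\alpha \rangle = \sum_{i \in \top} p_{i\alpha} = h_\alpha^*, \ \forall \alpha \in \bot.$$

In Section 6 of the Supplementary Information, the interested reader can find a detailed description of how to use the Bipartite Configuration Model as a statistical benchmark to validate the co-occurrences observed in the real network.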
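For illustration only, the degree constraints of the BiCM can be solved numerically with a simple fixed-point iteration. The sketch below assumes the usual parametrization of the link probability, $p_{i\alpha} = x_i y_\alpha / (1 + x_i y_\alpha)$ with $x_i = e^{-\theta_i}$ and $y_\alpha = e^{-\eta_\alpha}$; it is a plain-Python toy, not the solver implemented in NEMtropy, and the function names, starting point, and stopping rule are choices made for this example.

```python
def solve_bicm(k, h, tol=1e-10, max_iter=100_000):
    """Fixed-point iteration for the BiCM fitnesses x_i and y_alpha.

    k, h: observed degree sequences of the two layers (same total sum).
    Returns the lists (x, y). Nodes with zero degree get fitness 0.
    Degrees equal to the size of the opposite layer are not handled,
    since the corresponding fitness would diverge.
    """
    total = float(sum(k))
    x = [ki / total ** 0.5 for ki in k]
    y = [ha / total ** 0.5 for ha in h]
    for _ in range(max_iter):
        # x_i = k_i / sum_alpha [ y_alpha / (1 + x_i y_alpha) ]
        new_x = [ki / sum(ya / (1.0 + xi * ya) for ya in y) if ki > 0 else 0.0
                 for ki, xi in zip(k, x)]
        # y_alpha = h_alpha / sum_i [ x_i / (1 + x_i y_alpha) ]
        new_y = [ha / sum(xi / (1.0 + xi * ya) for xi in new_x) if ha > 0 else 0.0
                 for ha, ya in zip(h, y)]
        change = max(abs(a - b) for a, b in zip(new_x + new_y, x + y))
        x, y = new_x, new_y
        if change < tol:
            break
    return x, y

def link_probabilities(x, y):
    """Matrix of p_{i alpha} = x_i y_alpha / (1 + x_i y_alpha)."""
    return [[xi * ya / (1.0 + xi * ya) for ya in y] for xi in x]
```

At convergence, the row and column sums of the probability matrix reproduce the observed degree sequences, which is exactly the likelihood-maximization condition. For real datasets, a dedicated solver such as the one described in Ref. [24] is the appropriate tool.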

Discursive communities
As stated above and described in detail in Section 6 of the Supplementary Information, the BiCM can be used as a statistical benchmark to highlight groups of users contributing to the formation of the same discourse. On Twitter, this translates to groups of users endorsing similar content. In Ref. [6] a procedure was proposed, later refined in Refs. [8,9,17]. The rationale is to consider who the creators of content are and how to capture similarities among them. It has been observed in several studies that verified accounts, i.e. the ones for which the Twitter platform checked the identity of their owners (at least in the pre-Musk era), are strong creators of content [6,17]. It is possible, then, to infer how similar they are perceived to be by the "general" public of unverified users by using a bipartite representation: if verified and unverified users are the two layers of a bipartite network in which the (undirected) links represent retweets, we can validate the projection on the layer of verified users. In this way, we will detect non-trivial similarities in the common audience of unverified users: otherwise stated, if a pair of verified users is retweeted by the same (non-verified) users, they are probably sharing similar positions. In the monopartite validated projection of verified users, communities were detected using an optimized version of the Louvain algorithm [15]: since Louvain is known to be node-order dependent [28], the order of the nodes is shuffled 1000 times and the configuration displaying the greatest value of the modularity is chosen. The labels of the communities found through the community detection are then propagated in the retweet network: in fact, it is an old result that Twitter users endorse content created by others much more with retweets than with mentions [10][11][12]. Since in many cases some strong creators of content are not verified (and therefore run the risk of not getting a label), we run the label propagation algorithm on the undirected version of the retweet network: the rationale is that not only the sources give an indication of the user orientation, but also the user's audience. Otherwise stated, if the majority of a user's audience has a clear orientation, it is presumable that the considered user also has the same one. For propagating labels, we implemented the procedure proposed in Ref. [13]. This algorithm assigns the unverified user the label associated with the majority of its neighbors in the retweet network. If an unverified user has no verified users as direct neighbors, it will be assigned the label associated with the majority of unverified neighbors that have already been labeled. This continues iteratively until it converges.
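The majority rule described above can be sketched as follows. This is a simplified, deterministic illustration and not the exact procedure of Ref. [13]: the update order (alphabetical), the in-place updates, and the tie-breaking rule (keep the current label when it is among the most frequent ones, otherwise take the alphabetically first) are assumptions made for the example, as are the function name and input format.

```python
from collections import Counter, defaultdict

def propagate_labels(edges, seed_labels, max_rounds=100):
    """Spread community labels from seed (verified) users to all others.

    edges:       iterable of (user_a, user_b) retweet pairs, read as undirected
    seed_labels: dict verified user -> community label (never overwritten)
    Returns a dict user -> label for every user that could be reached.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)

    labels = dict(seed_labels)
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):
            if node in seed_labels:
                continue
            # Votes come from neighbours that already carry a label.
            votes = Counter(labels[nb] for nb in adj[node] if nb in labels)
            if not votes:
                continue
            top = max(votes.values())
            winners = sorted(lab for lab, c in votes.items() if c == top)
            current = labels.get(node)
            new = current if current in winners else winners[0]
            if new != current:
                labels[node] = new  # updated in place, visible to later nodes
                changed = True
        if not changed:
            break  # converged: no label changed during a full round
    return labels
```

Users with no labeled neighbor in a given round are simply revisited in the next one, which reproduces the iterative spreading from verified to unverified users.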
The interested reader can find in Section 7 a comparison between DiCos obtained using validated or non-validated projections. In summary, it has been observed that politicians are particularly clustered in the validated network [6-9, 17, 29-31]. Therefore, detecting communities therein is particularly efficient in finding discursive communities about political subjects. On the contrary, the discursive communities calculated on the non-validated projection are much noisier.

URL manipulation
To detect similarities in users' endorsement of pieces of news, we first need to preprocess the URLs contained in the various tweets. Sharing a compact version of a URL allows for the sharing of long URLs in tweets while maintaining the maximum character limit. For our analyses, we translated all shortened links into their original long versions. This enabled us to (i) read the top-level domain of the news source to assign a nutrition label using NewsGuard and (ii) use the long links as unique identifiers for each shared news item in our network models.
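As a small illustration of step (i), the sketch below extracts the site domain from an already expanded URL using only the Python standard library. The function name and the choice of stripping a leading "www." are assumptions made for the example; resolving shortened links requires following HTTP redirects and is not shown.

```python
from urllib.parse import urlparse

def news_domain(long_url):
    """Return the lower-cased host of an expanded URL, without 'www.'."""
    host = urlparse(long_url).hostname or ""
    return host[4:] if host.startswith("www.") else host
```

The resulting domain can then be used as the key for looking up the label of the news source, while the full long URL identifies the individual news item.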

NEC communities
In order to find users sharing similar "information diets", i.e. engaging with the same URLs, we used the same approach as in Ref. [14]. We first represented users sharing (either via tweets or retweets) URLs as a bipartite network of users and URLs. Then, we projected the information contained therein on the layer of users and finally validated the projection using the procedure described above. As mentioned in Section 2.3, the fraction of validated nodes in this case is extremely limited, i.e. about 1.71%, signaling that most of the users' endorsement of URLs (and so of pieces of news) is compatible with random noise. Again, in order to find communities of users in the validated projection network, the reshuffled version of Louvain was used.
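The validation step can be sketched as follows, under assumptions that the main text does not spell out (the details are in Section 6 of the Supplementary Information). Under a null model with independent links, the number of URLs shared by both users i and j is a sum of independent Bernoulli variables with probabilities $p_{i\alpha} p_{j\alpha}$, i.e. it follows a Poisson-binomial distribution; the sketch also assumes a false-discovery-rate correction (Benjamini-Hochberg) for the multiple tests and an illustrative threshold `alpha=0.05`. All function names are hypothetical.

```python
def poisson_binomial_pvalue(probs, observed):
    """P(V >= observed) for V = sum of independent Bernoulli(probs)."""
    pmf = [1.0]  # distribution of the running sum, starting from V = 0
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for v, mass in enumerate(pmf):
            nxt[v] += mass * (1.0 - q)   # this Bernoulli variable is 0
            nxt[v + 1] += mass * q       # this Bernoulli variable is 1
        pmf = nxt
    return sum(pmf[observed:])

def fdr_validate(pvalues, alpha=0.05):
    """Benjamini-Hochberg selection of significant pairs.

    pvalues: dict pair -> p-value. Returns the set of validated pairs.
    """
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    m = len(ranked)
    cutoff = 0
    for rank, (_, p) in enumerate(ranked, start=1):
        if p <= alpha * rank / m:
            cutoff = rank  # largest rank satisfying the FDR condition
    return {pair for pair, _ in ranked[:cutoff]}
```

For a pair of users i and j, `probs` would be the list of products $p_{i\alpha} p_{j\alpha}$ over all URLs, and `observed` the number of URLs they actually share; only pairs surviving the correction become links in the validated projection.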

Funding
Work partially supported by project SERICS (PE00000014) under the NRRP MUR program funded by the EU - NGEU; by the Integrated Activity Project TOFFEe (TOols for Fighting FakEs) https://toffee.imtlucca.it/; by the IIT-CNR funded Project re-DESIRE (DissEmination of ScIentific REsults 2.0).

Author contributions statement
All the authors designed the research, wrote and reviewed the manuscript. M. Pr. performed the research and analyzed data.

Previous presentation
Some of the results were presented at NetSci 2023, Wien, on July 14th, 2023.

Data availability
The Twitter dataset is available from the corresponding author upon reasonable request. The association between the trustworthiness labels and the news sources that supports the findings of this study is proprietary NewsGuard data.

A Literature Review
The detection of echo chambers has generally been approached in the literature starting from online content whose nature is known a priori. Through the analysis of the social accounts that interact with specific content, e.g. via likes, shares, retweets, and comments, it has been shown how information relating to specific narratives attracts distinct communities. The work in [3], by Del Vicario et al., focuses on public Facebook pages divided into two groups: conspiracy theories and news about science (conspiracy theories are 'the pages that disseminate alternative, controversial information, often lacking supporting evidence' [3]). The findings are that users are divided into homogeneous clusters: the accounts that share news about science and those that share news about conspiracies are each bound by ties of friendship in the network. Quoting the authors: 'different contents generate different echo chambers, characterized by the high level of homogeneity inside them'.
Homogeneity is not only about friendship, but also about emotional approach and reaction to debunking attempts. Zollo et al., in [32], establish how users polarised on conspiracies express more negative feelings in their comments than users polarised on science news. The work in [4] confirms how the echo chamber paradigm goes hand in hand with the confirmation bias phenomenon (the users' tendency to look for, prefer and interpret information in line with their thoughts [33,34], while ignoring or downplaying evidence that contradicts their beliefs): interactions with debunking posts (i.e., posts that provide fact-checked information on specific topics) are overwhelmingly from users biased towards science or non-biased users.
The above examples show how echo chambers emerge by analysing thematic pages and noting that users divide into distinct communities according to the page topic. Going deeper, it also emerges that users who share content consecutively are connected by friendship ties in the network.
Interestingly for the purpose of this article, other studies have instead analysed the dynamics of information exposure by considering the news URLs present in the posts. This is the case, e.g., of the work by Weaver et al. [35], in which the network of densely connected news articles is constructed. It starts from the number of news URLs shared by each user, to arrive at the weighted network of news URLs in which the weight between two URLs counts how many users have re-shared the URL pair. Leveraging a state-of-the-art community detection algorithm, communities of co-shared news items are found, distinct in terms of political leaning (i.e., left-leaning and right-leaning). Guarino et al., in [14], consider public Facebook pages, without however knowing a priori the kind/quality/reputability of their content. Focusing on the activity of users sharing links to pieces of online news, the authors construct the bipartite network of users/shared URLs and apply the Bipartite Configuration Model (BiCM) introduced in [23] to project the bipartite network on the two levels, the user level and the URL level. Applying the BiCM ensures that two accounts (resp., two URLs) are connected only if the number of URLs shared by both accounts (resp., the number of accounts sharing both URLs) is so large that it cannot be explained by the degree distribution of the two layers alone.
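The weighted co-sharing network described above can be illustrated with a minimal sketch. It only counts, for each pair of URLs, how many users shared both; the function name `coshare_weights` and the toy data are hypothetical and are not taken from [35] or [14], and no statistical validation is performed here.

```python
from collections import Counter
from itertools import combinations


def coshare_weights(shares):
    """Return {(url_a, url_b): number of users who shared both URLs}.

    `shares` maps each user to the list of URLs they shared.
    """
    weights = Counter()
    for urls in shares.values():
        # Repeated shares of the same URL by one user count once;
        # each unordered pair of distinct URLs adds 1 to its weight.
        for pair in combinations(sorted(set(urls)), 2):
            weights[pair] += 1
    return weights


# Hypothetical toy data: user -> shared URLs.
shares = {
    "u1": ["a.it/1", "b.it/2", "c.it/3"],
    "u2": ["a.it/1", "b.it/2"],
    "u3": ["b.it/2", "c.it/3", "b.it/2"],
}
weights = coshare_weights(shares)
```

In a validated projection, these raw weights would then be compared against a null model rather than used directly.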

Keywords | English meaning
vax, vaccino, vaccini, vaccinarsi | Variants of the word 'vaccination'
novax | A person against vaccination
Astrazeneca, Pfizer-BioNTech, Moderna, Sputnik | Covid-19 vaccines
greenpass | The certificate of vaccination or of recovery from the disease

C News Engagement Communities of URLs
Similar to what was done in the main text to find validated communities of users, it is possible to analyze the ties between users and the URLs present in their tweets and retweets to find validated communities of URLs. Again, the procedure involves a comparison between the observations and an entropy-based benchmark: if two URLs appear in the tweets (or retweets) of the same users significantly more often than the benchmark predicts, these URLs pass the validation procedure. We can thus identify groups of URLs shared by the same users. URL communities that pass the validation are called news engagement communities of URLs (URL NECs). Table 5 summarizes the breakdown of URLs into URL NECs: only 22% of all URLs are validated by our procedure. More details can be found in Table 6, which shows some information about the different URL NECs. URL NEC 4 is the largest in terms of both size (consisting of 223 nodes) and impact on the overall dataset, as measured by the number of shares (∼ 39k shares). The remaining URL NECs can be distinguished based on the order of magnitude of the shares: we have 6 communities whose URLs were shared thousands of times, and other communities whose URLs were shared hundreds or dozens of times. Remarkably, in all but 4 of the URL NECs, the number of different sources is quite limited (where source means the online news outlet that published the news to which the URL points).
To get a finer description of URL NECs, we examine the frequency of untrustworthy news sources in them. For each URL pointing to a news article, we consider the corresponding second-level domain, which refers to the name directly to the left of .com, .net, and other top-level domains (such as nytimes.com and latimes.com). We then associate the domains with the publishers, annotating the former with the reputation labels provided for the latter by the NewsGuard site (https://www.newsguardtech.com/). In this sense, the trustworthiness of a URL is inherited from the trustworthiness of its domain/publisher, i.e. a news item is considered more or less trustworthy depending on the trustworthiness of its publisher. According to the NewsGuard classification, the labels T ('Trustworthy'), N ('Not trustworthy') and UNC ('Unclassified') stand for the level of trustworthiness of the publisher. For more details on how the information is processed by NewsGuard, see Section D.
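As an illustration of how a URL can inherit the label of its publisher, the following sketch extracts the domain from a URL and looks it up in a label table. The table `LABELS` and its domains are hypothetical placeholders (the real labels are proprietary NewsGuard data), and taking the last two dot-separated parts of the host name is a simplifying assumption that does not handle multi-part suffixes such as .co.uk.

```python
from urllib.parse import urlparse

# Hypothetical label table; actual labels come from NewsGuard.
LABELS = {"trusted-example.com": "T", "untrusted-example.it": "N"}


def second_level_domain(url):
    """Return the last two labels of the host name, e.g. 'nytimes.com'."""
    host = urlparse(url).hostname or ""
    parts = host.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else host


def url_label(url, labels=LABELS):
    """Label a URL with the tag of its domain; 'UNC' if the domain is unknown."""
    return labels.get(second_level_domain(url), "UNC")


label_t = url_label("https://www.trusted-example.com/news/article?id=1")
label_n = url_label("http://untrusted-example.it/post")
label_unc = url_label("https://blog.somewhere-else.org/")
```

Domains missing from the table fall back to UNC, mirroring the 'Unclassified' tag used in the text.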
The first observation is that URL NECs are a receptacle of untrustworthy sources, see Fig. 7. With respect to the total number of distinct URLs in our dataset, URL NECs capture less than half of the trustworthy ones, but almost all of the untrustworthy ones.
Fig. 8 pictorially shows the network of URL NECs, as it emerges from the data. The different communities show a strong homogeneity in the trustworthiness of their sources.
To investigate more deeply the level of homogeneity of each community in terms of the trustworthiness label of the URLs within it, we consider the frequency of trustworthy and untrustworthy sources of URLs therein. For the $i$-th community of URL NECs, if $R$ is the trustworthiness value (either T or N), we define $\mathrm{purity}_R(\text{URL NEC}_i)$ as the frequency of URLs from $R$ domains, i.e.
$$\mathrm{purity}_R(\text{URL NEC}_i) = \frac{|U_i^R|}{|U_i|}, \tag{3}$$
where $U_i = \{\mathrm{URL}_1, \dots, \mathrm{URL}_n\}$ is the set of all the URLs in the $i$-th community and $U_i^R \subseteq U_i$ is the subset of $U_i$ that contains only URLs with trustworthiness $R$. The purity defined in Eq. 3 can be interpreted as the probability of extracting an $R$-reputable URL in the $i$-th URL NEC. If $m$ is the number of different URL NECs, we can define $\mathrm{purity}_R(\cup_i \text{URL NEC}_i)$ as the frequency of URLs from $R$ domains in all URL NECs:
$$\mathrm{purity}_R\Big(\bigcup_i \text{URL NEC}_i\Big) = \frac{\sum_{i=1}^{m} |U_i^R|}{\sum_{i=1}^{m} |U_i|}. \tag{4}$$
To have a benchmark for the purity of URL NECs, we also consider a purity measure for URLs that do not belong to any community:
$$\mathrm{purity}_R\Big(\overline{\bigcup_i \text{URL NEC}_i}\Big) = \frac{|U_{-1}^R|}{|U_{-1}|}, \tag{5}$$
where the set of URLs that do not belong to any community is denoted as $U_{-1}$.
Fig. 9 shows the homogeneity of URL NEC communities concerning trustworthy (T, left panel) and untrustworthy (N, right panel) news sources. On the x-axis there are the URL NEC communities, denoted by their ids, while the y-axis reports the purity of each community. The blue dotted line indicates $\mathrm{purity}_R(\cup_i \text{URL NEC}_i)$, the black dotted line indicates $\mathrm{purity}_R(\overline{\cup_i \text{URL NEC}_i})$. Focusing on the $\mathrm{purity}_R(\cup_i \text{URL NEC}_i)$ lines, on average URL NECs have a higher purity with respect to N URLs (∼ 0.548) than with respect to T URLs (∼ 0.254). Such a result suggests that URLs belonging to URL NECs represent niches of misinformation sources, and it is corroborated by the observation that most URL NECs refer to a limited number of different sources (see Table 6).
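The purity measures of this section reduce to simple label frequencies, as the following sketch suggests. The toy label lists and the convention that key -1 collects URLs outside every community are assumptions made for illustration; only the definition of purity as the fraction of R-labeled URLs is taken from the text.

```python
def purity(labels, r):
    """Fraction of URLs carrying label r in a collection of URL labels."""
    labels = list(labels)
    if not labels:
        return float("nan")
    return sum(1 for x in labels if x == r) / len(labels)


# Hypothetical toy data: community id -> labels of its distinct URLs.
# Key -1 collects URLs that belong to no community.
communities = {
    0: ["N", "N", "T", "UNC"],
    1: ["T", "T"],
    -1: ["T", "UNC", "UNC", "N"],
}

# Purity of each single community.
per_community_n = {c: purity(l, "N") for c, l in communities.items() if c != -1}

# Purity over the union of all communities.
in_necs = [x for c, l in communities.items() if c != -1 for x in l]
union_n = purity(in_necs, "N")
union_t = purity(in_necs, "T")

# Benchmark: purity of URLs outside every community.
outside_t = purity(communities[-1], "T")
```

Because each URL belongs to at most one community, pooling the label lists is equivalent to summing numerators and denominators over the communities.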

D Article's reputability measure (NewsGuard)
One of the aims of the work is to characterize the variety of domains circulating within the dataset, both in terms of type (e.g., news site, marketplace, social platform, etc.) and transparency and credibility (only in the case of news sites). In this paper, we refer to domains as the 'second-level domain' names, i.e., the names directly to the left of .com, .net, and any other top-level domains. For instance, we consider domains nytimes.com, guardian.com, corriere.it.
The domains have been tagged according to their degree of credibility and transparency, as indicated by the fact-checking website NewsGuard (https://www.newsguardtech.com/). The NewsGuard initiative was born from the joint effort of journalists and software developers, aiming at evaluating news sites according to criteria concerning credibility and transparency. For evaluating the credibility level of a source of information, NewsGuard metrics consider, e.g., whether the news source regularly publishes false news, whether it distinguishes between facts and opinions, or whether it fails to correct wrongly reported news. For transparency, instead, the NewsGuard evaluation takes into account, e.g., whether owners, founders or authors of the news source are publicly known, or whether advertisements are easily recognizable. Table 7 shows the tags associated with domains. In the manuscript we are interested in quantifying the reliability of news sources that were publishing during the period of interest. Thus, we will not consider those sources corresponding to social networks (tag P). Also, we will not consider satirical news (tag S). Tags T and N in Table 7 are used only for news sites, be they newspapers, magazines, TV or radio social channels, and they stand for 'Trustworthy' and 'Not trustworthy', respectively.

E Exposure of users to misinformation in echo chambers
To provide a finer characterization of users' exposure to misinformation in echo chambers, we 'recycle' the purity definition of Section C, with one crucial difference: there, the purity measure was applied to different sets of distinct URLs; here, we apply it to all messages shared by different sets of users. Thus, in the present case, if a URL has been shared multiple times, we consider the repetitions. The rationale for this is to characterize echo chambers in terms of the extent to which links to news stories from untrustworthy news publishers circulate within them. If $|EC_i(\mathrm{URL})|$ and $|EC_i(\mathrm{URL}; R)|$ count, respectively, the number of messages containing a URL and the number of messages containing an $R$-reputable URL shared by users in echo chamber $i$, with a little abuse of notation we can define a purity for echo chamber $i$ as
$$\mathrm{purity}_R(EC_i) = \frac{|EC_i(\mathrm{URL}; R)|}{|EC_i(\mathrm{URL})|}. \tag{6}$$
Analogously to what was done in Section C, we can define $\mathrm{purity}_R(\cup_i EC_i)$ and $\mathrm{purity}_R(\overline{\cup_i EC_i})$, respectively for all users in echo chambers and for all users outside echo chambers. The results of the analysis are reported in Fig. 10: on the x-axis there are the echo chambers denoted by their ids, on the y-axis the purity values. On the left panel, purities are related to trustworthy URLs. On the right panel, purities are related to untrustworthy URLs. The blue dotted line indicates $\mathrm{purity}_R(\cup_i EC_i)$, the black dotted line indicates $\mathrm{purity}_R(\overline{\cup_i EC_i})$.
Focusing on the $\mathrm{purity}_R(\cup_i EC_i)$ lines, echo chambers on average have a higher purity with respect to untrustworthy URLs (∼ 0.377) compared to trustworthy ones (∼ 0.232). In other words, when a user posts a message containing a URL in an echo chamber, the probability that it points to an untrustworthy news source is close to 0.4; for some echo chambers, this probability is much higher. As in the case of the purity for URL NECs, if we compare the $\mathrm{purity}_R(\cup_i EC_i)$ values against $\mathrm{purity}_R(\overline{\cup_i EC_i})$, there is a trend reversal in passing from T to N: the $\mathrm{purity}_T(\overline{\cup_i EC_i})$ value is greater than its counterpart in the echo chambers, while $\mathrm{purity}_N(\overline{\cup_i EC_i})$ is lower than the value measured in echo chambers. This finding is worrisome because users in echo chambers are particularly polarized and committed, basing their beliefs on low-quality news. However, it is important to remember that the formation of echo chambers, while alarming in itself, is generally unrelated to the quality of news sources.
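The message-level purity used for echo chambers differs from the URL-level one only in that repeated shares are counted. A minimal sketch follows, assuming hypothetical inputs: a list of (user, URL) messages, a user-to-echo-chamber map, and a URL-to-label map. Users outside every echo chamber are grouped under id -1, a convention assumed here for illustration.

```python
from collections import defaultdict


def echo_chamber_purity(messages, chamber_of, label_of, r):
    """Per echo chamber, fraction of URL-bearing messages whose URL has label r.

    Every message counts, so a URL shared many times weighs many times.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for user, url in messages:
        ec = chamber_of.get(user, -1)  # -1: user outside every echo chamber
        total[ec] += 1
        if label_of.get(url, "UNC") == r:
            hits[ec] += 1
    return {ec: hits[ec] / total[ec] for ec in total}


# Hypothetical toy data.
messages = [("u1", "x"), ("u1", "x"), ("u2", "y"), ("u3", "x"), ("u3", "z")]
chamber_of = {"u1": 0, "u2": 0}
label_of = {"x": "N", "y": "T"}

purity_n = echo_chamber_purity(messages, chamber_of, label_of, "N")
purity_t = echo_chamber_purity(messages, chamber_of, label_of, "T")
```

In the toy data, user u1 shares the same untrustworthy URL twice and both messages are counted, which is exactly the difference from the purity on distinct URLs.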

F Validated projection of bipartite networks
The BiCM null model introduced in Subsection 4.1.2 of the main text can be used to validate the co-occurrence network defined from a bipartite one. Consider two nodes $i, j \in \top$: the number of co-occurrences between them is
$$V_{ij} = \sum_{\alpha \in \bot} V_{ij}^{\alpha}, \tag{7}$$
where $V_{ij}^{\alpha}$ equals 1 if both $i$ and $j$ are linked to $\alpha$ and 0 otherwise. As mentioned in the subsection above, the probability of observing a graph $G_{Bi}$ is factorised in terms of probabilities of the existence of a single link. Therefore the probability that both nodes $i, j$ link a single node $\alpha \in \bot$ is simply
$$P(V_{ij}^{\alpha} = 1) = p_{i\alpha}\, p_{j\alpha}, \tag{8}$$
where $V_{ij}^{\alpha}$ is defined in Eq. 7. In general, given node $i \in \top$, all $p_{i\alpha}$ are different, depending on the degree $h_\alpha$. In this sense, the BiCM probability distribution of $V_{ij}$ is the generalization of the binomial distribution in which each event $V_{ij}^{\alpha}$ has a different probability. Such a distribution is known in the literature as the Poisson-Binomial distribution. For each observed co-occurrence, we can then calculate its p-value [23].
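The p-value of an observed co-occurrence under a Poisson-Binomial distribution can be sketched as follows. The link probabilities are assumed to have been obtained already from the fitted null model and are passed in as plain lists; the exact dynamic-programming evaluation of the distribution is one possible choice and is not necessarily the numerical method used in [23].

```python
def poisson_binomial_pmf(probs):
    """Exact distribution of a sum of independent Bernoulli variables.

    Returns a list pmf where pmf[k] is the probability of k successes.
    """
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # this event does not occur
            nxt[k + 1] += mass * p      # this event occurs
        pmf = nxt
    return pmf


def cooccurrence_pvalue(p_i, p_j, observed):
    """P(V_ij >= observed), with success probability p_i[a] * p_j[a] per node a."""
    probs = [a * b for a, b in zip(p_i, p_j)]
    return sum(poisson_binomial_pmf(probs)[observed:])


# Hypothetical link probabilities of two nodes towards three opposite-layer nodes.
p_i = [0.5, 0.5, 0.5]
p_j = [1.0, 1.0, 1.0]
pval = cooccurrence_pvalue(p_i, p_j, observed=2)
```

With equal probabilities the distribution reduces to the ordinary binomial, which gives a quick sanity check of the recursion.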
Finally, all p-values are validated using a multiple hypothesis testing procedure. In the present work, we use the False Discovery Rate (FDR) [36], since it allows us to control the number of false positives. In a nutshell, the FDR procedure prescribes ordering all p-values from the lowest to the greatest, i.e. $\text{p-value}_1 \le \text{p-value}_2 \le \dots \le \text{p-value}_n$. Then, if $n$ is the total number of tests, the effective threshold is given by the greatest $i$ satisfying $\text{p-value}_i \le i\alpha/n$, where $\alpha$ is the statistical significance threshold. In the present analysis $\alpha = 0.05$.
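The thresholding rule just described can be written in a few lines. This is a minimal sketch of the step-up rule as stated in the text (sort the p-values, find the greatest rank i with p-value_i ≤ iα/n, validate everything up to that rank); the function name and the example p-values are hypothetical.

```python
def fdr_validate(pvalues, alpha=0.05):
    """Return a list of booleans: True where the test passes the FDR rule."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda idx: pvalues[idx])
    cutoff = 0  # greatest rank i such that p-value_i <= i * alpha / n
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / n:
            cutoff = rank
    validated = [False] * n
    for idx in order[:cutoff]:
        validated[idx] = True
    return validated


# Hypothetical p-values for four candidate links.
strict = fdr_validate([0.001, 0.04, 0.03, 0.2])
step_up = fdr_validate([0.03, 0.02, 0.035, 0.9])
```

In the second example the two smallest p-values exceed their own rank thresholds but are still validated, because the third-ranked p-value satisfies the condition and the rule keeps every test up to the greatest such rank.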

G Validated vs non-validated discursive communities
Let us summarize the procedure for inferring the presence of discursive communities (DiCo) in our dataset, as described in the main text. Our approach focuses on the bipartite network of verified vs. unverified accounts, where a link represents the presence of at least one retweet from the unverified to the verified user. The network is then projected onto the layer of verified users, resulting in a monopartite network in which the weights of the links represent the number of common (unverified) retweeters, i.e. the co-occurrences. Finally, the network is validated by comparing the empirical values with a maximum entropy null model (the BiCM [22]), including the information of the bipartite degree sequences. At first glance, the validation procedure may seem like an unnecessary complication. The goal of the analysis is to extract similarities in the creation of new content based on common audiences, and it can be argued that even without extracting the significant structure of the network, the standard algorithms for community detection can find the relevant network structure.
Before directly comparing the results in the case of our dataset, let us first provide a methodological argument in favor of using the validated projection instead of the entire projection network. As mentioned above, the output of the procedure is a monopartite network in which connections are present if the co-occurrences cannot be explained by the bipartite degree sequences. In this sense, the structure of the network is inferred by discounting the original bipartite information. If, instead, the projection network is not validated, the communities in the network are inferred using the information about the projected network, i.e., some kind of information derived from the original bipartite system. Note also that knowing the value of the co-occurrences does not allow going back to the bipartite structure of the system and causes a loss of information [37]. In this sense, the use of the original information available from the data should be preferred.
Nevertheless, the implications of such a choice could still be limited in our dataset and, therefore, we will examine the results of the different approaches. The first observation, already highlighted in many papers [6-9, 17, 29-31], is that, when the debate is political or societal (as in the case of our dataset), the accounts of politicians and political parties tend to cluster, according to their orientation, in the validated network of verified users. This is also the case for our dataset, as can be seen in the left panel of Fig. 11: the colored nodes represent the accounts of political parties and politicians, where the color is related to their political alliance. The only exceptions are some Italia Viva accounts that are merged with some center-left politicians. Such behavior is justified by the fact that Italia Viva was created by politicians who left the PD because they were not satisfied with the current leadership. In this sense, it is not surprising to find links between former party members.
The Louvain algorithm, run on the validated projection, captures such groups, see the top right panel of Fig. 11 (nodes displaying the same colors belong to the same community).
Even if running the (weighted) Louvain algorithm on the entire co-occurrence network yields, by definition, different results, they could still provide a coherent partition of the validated projection, since it represents the core of the co-occurrence network. Remarkably, discounting inferred information has a cost: the obtained partition is less coherent with the political orientations of the verified users than the former one, see the lower right panel of Fig. 11. For example, Movimento 5 Stelle and the center-left alliance are mixed. The situation is even worse for Italia Viva, which is split in two, partly joining the center-left alliance accounts and partly mixed with Forza Italia. In this sense, we can say that the community detection on the validated projection gives cleaner partitions than those calculated on the non-validated network. Finally, comparing modularities computed on different types of networks is not particularly informative, but it can still give a rule-of-thumb idea about the organization of the network: in the case of the validated network, the modularity is Q ≃ 0.66, while in the case of the non-validated network, it is Q ≃ 0.17. In this sense, the validated network has a more modular structure. In summary, in the validated projection of verified users, politicians and political parties cluster according to their political affiliation, and therefore a community detection algorithm running on the validated projection will capture these groups. Instead, a community detection algorithm running on the entire co-occurrence network of verified users, where the co-occurrences are the numbers of common unverified retweeters, adds some noise to the partition found, and the division between opposing groups is less clean.
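For readers who wish to reproduce a comparison of modularities, the following sketch computes the standard (Newman) weighted modularity Q of a given partition. It does not implement the Louvain algorithm itself, and the toy graph of two triangles joined by one edge is hypothetical; the paper's values of Q come from the actual validated and non-validated networks.

```python
from collections import defaultdict


def modularity(edges, community):
    """Weighted modularity of a partition of an undirected graph.

    `edges` is an iterable of (u, v, weight) without self-loops;
    `community` maps each node to its community id.
    """
    m = 0.0                         # total edge weight
    intra = defaultdict(float)      # weight of edges inside each community
    strength = defaultdict(float)   # sum of node strengths per community
    for u, v, w in edges:
        m += w
        strength[community[u]] += w
        strength[community[v]] += w
        if community[u] == community[v]:
            intra[community[u]] += w
    return sum(intra[c] / m - (strength[c] / (2 * m)) ** 2 for c in strength)


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
         ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
         ("c", "d", 1)]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

Placing all nodes in a single community yields Q = 0, while the partition that separates the two triangles yields a positive value, in line with the intuition that higher Q indicates a more modular structure.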

Fig. 1 Pipeline for echo-chamber detection. The upper path focuses on the detection of Discursive Communities (DiCo), while the lower one focuses on the detection of News Engagement Communities (NEC). Both procedures pass through the statistical validation of empirical data with an entropy-based null model.

Fig. 2 Characterization of the main DiCos in terms of the number of users, tweets, and retweets.Charts at the bottom only consider tweets and retweets that contain URLs.

Fig. 3 Left: Network representation of user NECs.Right, top: percentage (and number) of user NEC users belonging to each group.Right, bottom: Percentage (and number) of URLs disseminated by users belonging to the various user NECs.

Fig. 4 Left: average clustering coefficient measured on the LWCC of the retweet network restricted to users of FdI-L-Media and measured on all users belonging to echo chambers.Right: average clustering coefficient calculated on each echo chamber.Each echo chamber inherits the ID and the color from its user NEC.The number of users in the echo chamber is shown at the top of each bar.

Fig. 5 Retweet network for FdI-L-Media DiCo, aggregated with respect to echo chambers.Node −1 represents users who do not belong to an echo chamber.Edges indicate the number of retweets between different user groups; weights less than 1k have been filtered out.

Fig. 6 Number of distinct URLs pointing to news publishers tagged as 'Trustworthy' (T), 'Not trustworthy' (N), or 'Unclassified' (UNC) for the entire dataset and for each type of users' community (DiCos, user NECs, echo chambers).

Fig. 8 Network representation of URL NECs. The labels on the nodes represent the trustworthiness of the domain of the URL as labeled by NewsGuard (T for 'Trustworthy', N for 'Not trustworthy', UNC for 'Unclassified' sources). Each community shows a strong homogeneity in the trustworthiness label.

Fig. 9 Purity levels of URL NECs.On the left trustworthy URLs, on the right untrustworthy ones.While the T purities of the individual communities are particularly low (less than 0.2 in most cases), the analog N purities are greater than 0.6 for most of the URL NECs.

Fig. 10 Purity levels of echo chambers. On the left trustworthy URLs, on the right untrustworthy URLs. While the $\mathrm{purity}_T(\overline{\cup_i EC_i})$ value is greater than its counterpart in the echo chambers, $\mathrm{purity}_N(\overline{\cup_i EC_i})$ is lower than the value measured in echo chambers.

Fig. 11 Comparison between the results of different community detections on the validated network of verified users. On the left, only politicians' accounts are colored according to their political affiliation (other verified accounts are gray). The first observation is that politicians with similar orientations cluster together in the validated projection. In this sense, a community detection run on this network returns partitions that are coherent with these political clusters (top right panel; nodes with the same color belong to the same community). The same is not quite true for a community detection algorithm run on the non-validated projection: in the latter case, the partitions only partially capture the political orientations present (lower right panel; again, nodes with the same color belong to the same community).
In our case study, two main discursive communities emerge, associated with political parties and Italian newspapers. Specifically, most of the users who are part of a DiCo belong either to the ItaV-PD-Media community (∼ 34.7%; the community includes journalists and exponents of the Italian parties Italia Viva and Democratic Party) or to the FdI-L-Media community (∼ 26.6%; the community includes journalists and exponents of the Italian parties Fratelli D'Italia and Lega). About 2.1% of users belong to smaller DiCos, while ∼ 36.7% of users do not belong to any DiCo. The FdI-L-Media community posted the most new content (64.3%), although it represents about a quarter of all users in our dataset. The ItaV-PD-Media community is responsible for 19.7% of the new content, while the remaining 15.5% is posted by users who do not belong to any particular community. In terms of retweets, FdI-L-Media is by far the most active community with 77.6% of the retweets. Fig. 2 (bottom) characterizes DiCos by focusing only on posts containing URLs. In general, the observations made for the top doughnut charts still hold, with the exception that almost half of the users who post tweets with URLs belong to the FdI-L-Media community (48.7%).

Table 1
Users in user NEC.

Table 2
Narratives' descriptions. Echo chamber 4: Police forces were not vaccinated. Support for the views of no-vax doctors. VIPs and high-ups pretend to be vaccinated, but actually are not.

Table 3
Main narratives supported in recent posts (as of June 7, 2023) by users in echo chamber 4 with the most followers.Users are anonymized.
User | Followers | Main narratives
[...] | [...] | [...], no-vax, no-LGBT, against the Italian government, anti-EU
user 9 | 2355 | religious posts, no-green pass, no-vax
user 10 | 2316 | against Italian government, no-vax

Table 4
Keywords used for collecting tweets about the Twitter debate on the Covid-19 vaccination campaign.Keywords have been searched in Italian, English meanings on the right.

Table 6
Statistics for URL NECs. While community 4 is by far the largest, there are 6 other communities whose URLs were shared more than 1000 times.

Table 7
Tags for domain labelling. Tags are inherited from NewsGuard. The UNC tag indicates that NewsGuard did not tag that domain.