Welcome to the 1st installment of my "Unraveling Coordinated Behaviour" series. Here I will take a deep dive into one of the projects I have worked on at SimPPL.
Large language models can produce text that is not only more coherent and cheaper to produce, but also better able to persuade humans than human-generated content. As more people grow aware of this unprecedented ability to cheaply generate highly realistic text and multimedia content, trust in content posted to the internet is eroding. It is likely to become incredibly challenging, if not impossible, for fact-checkers to identify the differences between real and artificially generated content. This poses a great risk during volatile events like conflicts and civic crises, where there is a massive influx of information in very short intervals of time and it is impossible to verify every narrative before it achieves viral spread through social networks. Historically, websites like Wikipedia and Iffy News have maintained lists of reliable and unreliable media providers in an effort to introduce some ‘credible’ perspectives into the discourse. These reliability scores are typically determined by evaluating a media organization’s history of information sharing, though they suffer from the obvious criticism of “who fact-checks the fact-checkers”, effectively questioning any single authority as an arbiter of truth. To sidestep this debate, we take the case of two so-called ‘unreliable’ media providers and assess the nature of the information spread they participate in on a popular microblogging platform. We try to answer the question of “how” their posts are spread, which complements the question fact-checkers tend to ask: “what” information are they publishing?
The flow of information on social media and the role of influential actors in amplifying content are crucial topics for platforms and researchers to investigate, even though we do not make use of reliability scores or other credibility-focused metrics that are based on content alone. For example, our detection of bot-like behavior is essential for understanding the authenticity of online interactions pertaining to an outlet. While we emphasize that these need not be behaviors that an outlet is driving itself, we find that they are certainly useful signals, as they are correlated with what we believe to be coordinated inauthentic behavior. We provide this analysis as a general method for analyzing how information spreads on social media for any media organization, taking the case of two state-backed media providers in Russia, given the need to study the narratives relating to the invasion of Ukraine. To advance the interests of civic integrity, the development of open-access tools for analyzing coordinated activity is critical. These tools, including our platform, will aid in identifying the patterns, trends, and impact of coordinated actors, enabling informed decision-making and safeguarding the integrity of public discourse.
This analysis of networks of coordinated accounts lays the groundwork for analyzing engagement patterns on social networks.
It involves detecting coordinated activity by groups of users who share the same content in multiple instances, with the sharing occurring within very short intervals of time. For example, we consider two accounts A and B to be coordinating if A shares the same content in the same second as B. We eventually relax this definition to allow a 5-second delay between A's and B's shares, but require multiple (>10) instances where A and B shared the same content within this 5-second interval.
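To make this definition concrete, below is a minimal sketch of how such coordinating pairs could be detected. The column names (author_id, url, timestamp) and the exact thresholds are illustrative assumptions rather than our production implementation.

```python
import pandas as pd
from collections import Counter

def find_coordinated_pairs(tweets: pd.DataFrame,
                           max_delay: float = 5.0,
                           min_instances: int = 10) -> Counter:
    """Count pairs of accounts that repeatedly share the same URL
    within max_delay seconds of each other.

    Assumes columns: author_id, url, timestamp (datetime).
    """
    pair_counts = Counter()
    for _, group in tweets.groupby("url"):
        g = group.sort_values("timestamp")
        times = list(g["timestamp"])
        authors = list(g["author_id"])
        for i in range(len(times)):
            for j in range(i + 1, len(times)):
                # Tweets are time-sorted, so once the gap exceeds the
                # window we can stop comparing against tweet i.
                if (times[j] - times[i]).total_seconds() > max_delay:
                    break
                if authors[i] != authors[j]:
                    pair = tuple(sorted((authors[i], authors[j])))
                    pair_counts[pair] += 1
    # Keep only pairs with enough repeats to call the behavior coordinated.
    return Counter({p: c for p, c in pair_counts.items()
                    if c >= min_instances})
```

Setting max_delay=0 recovers the strictest same-second condition used below, while 5 seconds gives the relaxed definition.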
The coordinated activity is presented from three different perspectives:
Groups: quantifying user participation in groups, where a group is defined by proximal sharing (0 to 5 seconds delay) of the same content by multiple participants, and analysing anomalies in group engagement to identify potential causal links with on-ground events during that period.
Users individually: summing coordinated activity instances (tweets) across their groups; a minimal sketch of this tally follows the list.
Consistent users: identifying users who indiscriminately share content from select sources to amplify its reach and disseminate information that tilts public discourse in favor of their own side, as well as users who consistently share the same content to influence public opinion, possibly due to strong political alignment with, or belief in, a narrative.
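As a rough sketch of the second perspective, a user's individual participation can be tallied by summing their pair-level counts across every group they appear in, reusing the hypothetical pair_counts output from the sketch above:

```python
from collections import Counter

def user_participation(pair_counts: Counter) -> Counter:
    """Sum each user's coordinated-sharing instances across all pairs."""
    totals = Counter()
    for (user_a, user_b), count in pair_counts.items():
        totals[user_a] += count
        totals[user_b] += count
    return totals

# user_participation(pair_counts).most_common(10) lists the most active
# participants regardless of which group they coordinated in.
```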
There are 3 “response” types of engagement actions available to a user on Twitter: Retweeted, Replied To, and Quoted.
I have further split them into the following categories for easier analysis:
Deleted Source: Tweets whose origin author's account has been deactivated on Twitter.
Unknown Source: Engagements on tweets whose origin author is active on Twitter but not in our dataset.
Self-Retweeted, Self-Replied To & Self-Quoted: Tweet types where the origin author is the same as the current author.
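A minimal sketch of this categorization, assuming each engagement record carries its raw Twitter type together with flags for whether the origin author is deactivated or present in our dataset (the parameter names are illustrative):

```python
def categorize_engagement(engagement_type: str,
                          author_id: str,
                          origin_author_id: str,
                          origin_deactivated: bool,
                          origin_in_dataset: bool) -> str:
    """Map a raw engagement (Retweeted / Replied To / Quoted) onto the
    finer-grained categories used in this analysis."""
    if origin_deactivated:
        return "Deleted Source"
    if not origin_in_dataset:
        return "Unknown Source"
    if author_id == origin_author_id:
        # e.g. "Self-Retweeted", "Self-Replied To", "Self-Quoted"
        return f"Self-{engagement_type}"
    return engagement_type
```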
We can notice that ~57% of the tweets are retweets of other active accounts, followed by engagements on posts by deactivated accounts at ~31%.
Possible causes for an account being deactivated:
User-initiated deactivation: could be an attempt by the profile owner to hide or cover up their dissemination of information.
Violation of Twitter's terms of service: The account may have engaged in activities that violate Twitter's rules and policies, such as spamming, harassment, hate speech, or sharing inappropriate content.
Inactivity: If an account remains inactive for an extended period of time, Twitter may deactivate it as part of their routine cleanup process.
Security concerns: In some cases, Twitter may deactivate an account due to security concerns, such as a potential compromise or suspicious activity.
This raises questions about the reliability and trustworthiness of the content shared by the deactivated accounts.
We see a very small number of Self-Retweets (~3.5k).
We have collected tweets containing URLs from 3 media sources: Russia Today (RT), Sputnik, and the Ministry of Foreign Affairs (MID) of the Russian Federation.
Our dataset contains tweets with ~7030 unique URLs.
~61% of the engagements contain RT article URLs, followed by Sputnik URLs at ~38%.
69 URLs are from MID.
User 8224 and User 19702 are two of the three media sources' Twitter handles; let's call them origin authors, because they are the authors of the articles shared in our dataset.
First, consider the strictest condition for coordinated activity: a group of users sharing the same content (RT & Sputnik URLs) at the exact same second over multiple instances.
Let's consider User Group A1: (42190, 68965) and User Group A2: (15806, 8224).
With 42 occurrences of coordinated sharing, Group A1 leads the field, far outpacing the next-highest group, Group A2, which had only 10 instances.
Such significant counts, even under the strictest condition, possibly indicate a deliberate effort or organized campaign in which users synchronize their actions to artificially amplify the impact of the shared content, in this case certain RT and Sputnik URLs.
It is interesting to note that 848 groups include deleted users.
The exact cause for account deactivation cannot be definitively determined from the available information.
If the majority of these are user-initiated deactivations, such a high number is suggestive of efforts to cover tracks and evade detection, signaling an organised campaign to manipulate online discourse through the dissemination of information.
Otherwise, if the majority of deactivations were carried out by Twitter, it would indicate that the platform has been actively working to eliminate suspicious accounts engaging in coordinated activity.
This time series plot represents same-second coordinated tweets between pairs of users, with each dot indicating a specific instance.
The alignment of tweets from multiple user pairs indicates the potential occurrence of large-scale synchronized activity involving multiple user groups.
Note: in certain instances, line traces representing specific on-ground events, sourced from this Wikipedia page, have been included to provide further insight.
First, we examine the timeline of Russia's "special military operation" in Ukraine as a possible cause of coordinated-activity surges involving multiple user groups.
Observations reveal that at certain times, tweets from multiple user pairs converge, suggesting the potential existence of extensive synchronized campaigns orchestrated by amplification networks.
These endeavors suggest an aim to shape public narratives and influence discourse surrounding particular events and news cycles, aligning with their own perspectives and objectives.
The tweets during these times reflect the main events happening at the time, with the potential to affect the on-ground mobilisation of troops and both sides' preparedness for conflict.
Significant political events during periods of heightened activity include:
On 7th December, 2021, President Biden warned President Putin of "strong economic and other measures" if Russia attacked Ukraine.
On 17th January, 2022, Russian troops began arriving in Russia's ally Belarus, ostensibly for "military exercises".
On 17th February, 2022, fighting escalated in the separatist regions of eastern Ukraine.
On 25th March, 2022, Russian forces launched an airstrike against a Ukrainian Air Force National Military Command Centre located in Vinnytsia, Ukraine.
Exploring the second direction: considering participation in coordinated activity [tweets of the same content (RT and Sputnik URLs) at the same time] by aligned users across multiple groups.
Non-origin users (users who are not the source of these URLs) from Group A1, i.e., 68965 and 42190, consistently lead in this respect.
Additionally, this analysis sheds light on the discovery of new users who actively engage in coordinated activity with multiple groups.
Notably, User 27148 stands out as a compelling case, displaying a significant level of participation in same-second coordinated activity across multiple groups, surpassing even User 8224, despite not being the original author of the articles.
However, it is noteworthy that User 27148 did not demonstrate significant same-second engagement as part of a single user group.
This suggests User 27148 may be part of multiple interconnected networks that collaborate to amplify messages and possibly orchestrate campaigns.
Orange: Top 5 User Groups (A1-A5) that engaged in same-second coordinated sharing in multiple instances.
Blue: Top User Groups (B1-B3) exhibiting coordinated sharing within a 5-second interval, indicating emerging patterns of engagement.
Exploring the third direction: expanding the timespan that defines coordinated activity, here considering a 5-second window.
The analysis reveals that User Groups A1 and A2 display a significant increase in coordination within a 5-second interval, with 71 and 55 instances (tweets) of sharing the same content (RT & Sputnik URLs), respectively. This is a substantial rise compared with the previous observations of 42 and 10 instances of same-second sharing.
These findings suggest a strong level of synchronization and alignment between these user groups in engaging with specific content.
Deleted user group B1 features in the top 5, which raises suspicions about both the nature and intent of their actions and the credibility and potential bias of the shared articles.
If we observe carefully, User 8224 appears in 5 of the Top 8 groups and 7 of the Top 25 groups. This indicates that the content shared by User 8224 possibly contributes more to coordinated activity than anticipated under the strictest condition, which is consistent with it being one of the origins of these URLs.
Increasing the timespan to 5 seconds has helped, because under the strictest condition, i.e., coordinated sharing within the same second:
Deleted user groups didn't feature in the top 5 group-wise contributors.
None of the deleted user groups participated in same-second coordinated sharing more than once.
Across multiple groups, deleted users didn't feature in the top 10 aligned contributors either.
This time series plot represents 5-second coordinated tweets between pairs of users, with each dot indicating a specific instance.
The alignment of tweets from multiple user pairs indicates the potential occurrence of large-scale synchronized activity involving multiple user groups.
Note: in certain instances, line traces representing specific on-ground events, sourced from this Wikipedia page, have been included to provide further insight.
Again, we examine the timeline of the Russian invasion of Ukraine as a possible cause of coordinated-activity surges, but with the slightly relaxed 5-second interval condition.
A noteworthy observation is that Group A1 consistently engages in coordinated sharing, rarely skipping a day, throughout the period from January 25th to April 1st.
Similarly, Group A2 consistently engages in coordinated sharing, skipping only 2 days, throughout the period from November 26 to December 13.
On average, Group A2 shared around 3-4 coordinated tweets with the same content (URLs) each day.
Notably, there was a peak in Group A2’s sharing with 6 instances on December 8, which coincided with a significant event where President Biden issued a strong warning to President Putin regarding potential “strong economic and other measures” if Russia attacks Ukraine.
Notice that for some groups, coordinated activity is limited to a few days; e.g., a significant share, 35%, of the coordinated activity by Group B1 occurred around March 25, 2022 [date of the airstrike on the UAF command centre].
Interestingly, these instances of coordinated sharing were within a small timespan of 5 minutes on March 24 and 25.
The line graph indicates the number of followers, on a scale of 5k per unit.
To optimize computational efficiency, a time-block approach is employed to identify potential coordinated tweets among users within a 5-second interval, while acknowledging the possibility of some tweet data loss.
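A rough sketch of this time-block idea, assuming tweets carry Unix timestamps: tweets are hashed into fixed 5-second buckets keyed by URL, so candidate pairs only need to be compared within a bucket.

```python
from collections import defaultdict

def bucket_tweets(tweets, block_size: int = 5):
    """Hash tweets into fixed (url, time-block) buckets so coordination
    checks only compare tweets that fall inside the same block.

    Assumes tweets is an iterable of (author_id, url, unix_ts) tuples.
    """
    buckets = defaultdict(list)
    for author_id, url, unix_ts in tweets:
        block = int(unix_ts) // block_size  # seconds 0-4 -> block 0, etc.
        buckets[(url, block)].append((author_id, unix_ts))
    return buckets

# Trade-off: tweets at seconds 4 and 6 are within 5 seconds of each other
# but land in different blocks, so that pair is missed; this is the tweet
# data loss accepted in exchange for speed.
```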
Listed count refers to the number of public lists on which a user's account has been included by other users.
Having a high listed count can indicate that the user has a strong presence, expertise, or influence in a particular field or topic within the social media community.
Interestingly, User 15806 was not only part of the 2nd-highest contributing pair, A2, with User 8224, but also has the most followers among non-origin users (i.e., users other than the three media sources of the articles: RT, Sputnik, and MID).
Another observation we can make is that User 27148 jumped up 1 place to secure the 2nd position for coordinated sharing after the timespan window was expanded to 5 seconds.
Upon initial examination, User 27148 may appear to have a relatively insignificant impact on the social network compared to notable users such as User 15806 or User 8224, considering their respective follower counts (~3.5k followers, ~31k followers, ~356k followers).
However, it is worth noting that User 27148 possesses the highest listed count (exceeding 1200) among all users in our dataset.
The plot clearly reveals the crucial role of the content (articles) shared by User 8224, which was involved in coordinated sharing about 111 times, even within a small timespan of 5 seconds.
This is significant as User 8224 is a state-backed media outlet's handle, which suggests an intentional effort by users sharing the same content as 8224 to manipulate the information being disseminated to the public.
However, it can be argued that since User 8224 is often the source of the content it shares, it is natural for at least one other user to retweet its content within the 5-second interval, and 111 such occurrences are a small percentage of its overall tweets.
But the following points also need to be considered:
The previous graph clearly showed a pattern of coordinated sharing of User 8224's content by only a few select users:
50% of it came from just 5 users,
of which 37% came from User 15806 alone.
Out of the 111 times, 64 occur in groups with at least 2 instances of coordinated sharing within a 5-second window.
What's more interesting is that out of these 64, there are 17 instances where the first tweet with the common URL was posted not by SputnikInt or RT, but by other general users.
Of these 17 instances, 13 tweets involved articles authored by User 8224, and the other 4 involved User 19702's articles.
This finding indicates 17 instances in which a non-origin user posted news of an article it did not author, and within 5 seconds a state-backed media outlet, which was the author of the article in 13 of these instances, reposted the same content.
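One way such instances could be surfaced is to check, for each coordinated pair of tweets sharing a URL, whether the earlier tweet came from a non-origin account. A minimal sketch, with illustrative tuple shapes:

```python
ORIGIN_AUTHORS = {8224, 19702}  # the outlets' own anonymized handles

def non_origin_posted_first(tweet_a, tweet_b) -> bool:
    """Given two coordinated tweets (author_id, unix_ts) sharing a URL,
    return True when the earlier tweet came from a non-origin account
    and the later one from an origin (outlet) account."""
    first, second = sorted((tweet_a, tweet_b), key=lambda t: t[1])
    return first[0] not in ORIGIN_AUTHORS and second[0] in ORIGIN_AUTHORS
```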
That's not all: apart from the URL, the content (tweet text) shared is also more or less the same, with minor tweaks.
Following are a few examples where User 8224 posted within 5 seconds of another user with similar text and same article link:
Example Instance 1
User 15806: UK to Deploy More Tanks to Germany Amid Reported Worries Over Russian 'Activities' on Ukraine Border
User 8224: #UK to Deploy More #Tanks to Germany Amid Reported Worries Over Russian 'Activities' on Ukraine Border
Example Instance 2
User 62687: Russian Ambassador to London Says Severing Relations With UK Is Possible
User 8224: #BREAKING | Russian ambassador to London says severing relations with UK is possible
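The near-duplicate texts above differ mainly in hashtags, casing, and prefixes like "#BREAKING |". Below is a rough sketch of how such minor tweaks could be stripped away before comparing texts; the normalization rules are assumptions inferred from these examples, not an exhaustive list:

```python
import re

def normalize_tweet(text: str) -> str:
    """Strip the cosmetic differences seen in the examples above so that
    near-identical tweets compare as equal."""
    text = re.sub(r"#\w+\s*\|\s*", "", text)  # drop prefixes like '#BREAKING | '
    text = text.replace("#", "")              # '#UK' -> 'UK'
    text = re.sub(r"\s+", " ", text).strip()  # collapse extra whitespace
    return text.lower()

# Both example pairs above normalize to identical strings, e.g.
# normalize_tweet("#BREAKING | Russian ambassador to London says ...")
#   == normalize_tweet("Russian Ambassador to London Says ...")
```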
The purpose of streamlining is to narrow the analysis down to specific user groups:
Groups A1-A4, which exhibit multiple instances of same-second sharing, and
Groups B1-B3, characterized by a minimum of 12 instances of 5-second interval sharing.
By focusing on these 7 user groups, we can increase our level of certainty that they have indeed engaged in coordinated sharing.
Subsequently, our objective shifts to exploring the extent of their coordinated sharing within a 10-second interval.
Upon expanding the timespan to 10 seconds, our analysis reveals a notable increase in the detection of coordinated activity within User Groups B1, B2, and B3.
We see an unprecedented rise in the number of coordinated tweets shared by User Group B3 when expanding the time interval to 60 seconds.
Nearly 200 URLs, or ~3% of the URLs, received more than 300 engagements.
Conducting an in-depth analysis of coordinated activity for articles with the lowest levels of engagement would also provide valuable insights into user behavior and URL sharing patterns across the entire engagement spectrum.
If organised social campaigns are present, there will also be campaigns that didn't gain much traction because the content shared didn't resonate with wider audiences.
The goal is to find users who indiscriminately share articles from Russian state-owned media outlets.
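One simple way to operationalize "indiscriminate" sharing is the fraction of an outlet's distinct URLs that a user shared, irrespective of how each article performed. A minimal pandas sketch, with the same illustrative column names as before:

```python
import pandas as pd

def outlet_coverage(tweets: pd.DataFrame, outlet_urls: set) -> pd.Series:
    """For each user, the fraction of an outlet's distinct URLs they
    shared. Values near 1.0 mean the user shared nearly everything the
    outlet published, regardless of any single article's reach.

    Assumes columns: author_id, url.
    """
    outlet_tweets = tweets[tweets["url"].isin(outlet_urls)]
    shared = outlet_tweets.groupby("author_id")["url"].nunique()
    return (shared / len(outlet_urls)).sort_values(ascending=False)
```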
Among the Top 5 users, the presence of our Top 2 user groups, A1 & A2, reaffirms our finding that these 3 non-origin users (15806, 68965 & 42190) are highly aligned to a common narrative.
Also, User 15806's behavior of indiscriminately sharing all of User 8224's content, irrespective of its potential for widespread impact, may suggest:
a deliberate strategy to amplify User 8224's content, or
automated or bot-like behavior by User 15806.
A few users repeatedly share the same article over long periods of time.
For instance, engagement analysis for the article "second us civil war" reveals that approximately one-fifth of the total engagements it received can be attributed to User 9532. This consistent involvement persisted for an impressive 18 months after the article was first shared on Twitter, potentially indicating a deep interest in or connection with its content.
The analysis reveals an interesting scenario where the article "us unpredictability fuels conflicts" was shared at most twice by a single user, but the engagement with the article persisted for a remarkable duration of 18 months.
This suggests that the impact and resonance of the article were significant enough to keep users engaged and interested over an extended period of time, even with limited initial sharing.
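Both observations reduce to two per-article statistics: each user's share of the article's total engagements, and the span between the article's first and last engagement. A rough sketch under the same column-name assumptions as earlier:

```python
import pandas as pd

def article_engagement_profile(tweets: pd.DataFrame, article_url: str):
    """Per-user share of an article's engagements, plus how long the
    article stayed active (first to last engagement).

    Assumes columns: author_id, url, timestamp (datetime).
    """
    article = tweets[tweets["url"] == article_url]
    # Fraction of all engagements attributable to each user.
    share = article["author_id"].value_counts(normalize=True)
    # Time between the first and the last engagement with the article.
    lifespan = article["timestamp"].max() - article["timestamp"].min()
    return share, lifespan
```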
The above plot analyzes the tweet activity for the article titled “Ukraine’s nuclear threats undermine European security, experts tell RT”.
The engagement from the general audience significantly declines, reaching a mere two tweets on the third day after the initial post and no tweets on the fourth day.
An intriguing observation is that Deleted User 34552 appears to be the primary driver of interactions for the article from February 24th, 2022, to February 28th, 2022, accounting for 44% of the total interactions for the article.
Notably, the repetitive sharing activity coincides with a significant event, i.e., the start of Russia's “special military operation”, suggesting a possible correlation between the event and the increased activity surrounding the article.
A similar pattern is observed for the article titled: West Turned Blind Eye to War Crimes by the Kiev Regime, 'Genocide' in Ukraine, Lavrov Says
Deleted User 48757 emerges as the sole driver of 50% of the interactions recorded for the article after the first day.
Observations from the above 2 plots suggest a possible dependence on these 2 deleted users for facilitating the engagement and amplification of these articles beyond the initial days of engagement.
The deactivation of these accounts, assuming it was initiated by the platform rather than the users themselves, signifies steps taken by Twitter to address and mitigate the presence of potentially inauthentic accounts on its platform.
The opposite trend is observed here, where User 10952 drives the initial engagement, accounting for nearly 62% of the total engagement received by the article titled “Russian troop buildup ‘largest since cold war’ – NATO”.
But this article fails to sustain interest and engagement in the subsequent days, with insignificant activity from the general audience by the third day after the original post.
The analysis presented in this research was conducted by Jhagrut Lalwani during his affiliation with SimPPL from October 2022 to May 2023, and subsequently reviewed by the team.
This report presents the preliminary insights from our analysis of coordinated behavior in the sharing of articles from two state-backed media outlets with the intention of highlighting the sharing patterns we observed that boost certain narratives, and the nuanced behavior of groups of accounts that participated in promoting similar articles.
We originally planned to perform a similar analysis on articles from an expanded set of media sources to provide a broader perspective on coordinated sharing. However, due to the challenges in expanding this research on Twitter given rate limits and ultimately removal of academic API access from the platform, our ability to collect further data on coordinated networks sharing articles from other media outlets was severely restricted.
On behalf of SimPPL, I extend my gratitude to Ippen Digital (DE) and The Times and The Sunday Times (UK) for their partnership in creating Parrot Report.
I would like to express my thanks to the FOCUS Data Project for generously providing us with a comprehensive list of URLs of articles from two Russian state-owned media outlets, namely Russia Today (RT) and Sputnik, as well as the Ministry of Foreign Affairs of the Russian Federation (MID).
I would also like to thank Swapneel Mehta for his support throughout the research process, and the SimPPL team for their feedback during the writing process.