Active & Deleted User Analysis along with Network Graphs for dissemination of information
Welcome to the second installment of my "Unraveling Coordinated Behaviour" series. Here I take a deep dive into one of the projects I worked on at SimPPL.
As the second installment of the series, this article continues to explore coordinated activities.
It begins with a recap of the previous analysis, which focused on specific user groups exhibiting coordinated sharing behavior, with the goal of identifying and understanding coordinated activities within these groups.
We narrowed the focus to seven key user groups to increase confidence in detecting coordinated behavior.
New insights include active user analysis, revealing the number of active users in different tweet ranges.
A small cohort of highly engaged users stands out, showcasing a unique pattern of behavior.
Some users display anomalous tweeting activity throughout the day, indicating potential bot-like behavior.
We explore these patterns in detail, providing examples from various user groups.
Deleted user analysis provides valuable signals for identifying common narratives and networks, even with data constraints.
We investigate the tweet-sharing behavior of deleted accounts, offering insights into the promotion of specific agendas and information dissemination.
Network analysis reveals the dynamics of interactions and influence within coordinated activities.
We examine the networks around shared URLs and identify key influencers, contributors, and trends.
The refined network analysis focuses on prominent influencers, egos of communities, and users with high toxicity scores, shedding light on influential and toxic nodes within these networks.
This deep dive uncovers intriguing patterns and connections, providing valuable insights into coordinated behavior on social media platforms.
The analysis aims to contribute to a comprehensive understanding of how information spreads and influences discourse within specific user groups.
The purpose of streamlining is to narrow the analysis down to specific user groups, namely:
Groups A1-A4, which exhibit multiple instances of same-second sharing, and
Groups B1-B3, characterized by a minimum of 12 instances of 5-second interval sharing.
By focusing on these 7 user groups, we can increase our level of certainty that they have indeed engaged in coordinated sharing.
Subsequently, our objective shifts to exploring the extent of their coordinated sharing within a 10-second interval.
Upon expanding the timespan to 10 seconds, our analysis reveals a notable increase in the detection of coordinated activity within User Groups B1, B2, and B3.
We see a sharp rise in the number of coordinated tweets shared by User Group B3 when the time interval is expanded to 60 seconds.
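To make the effect of widening the time window concrete, here is a minimal sketch of how coordinated shares could be counted for a given interval. It is not the pipeline used in this analysis: the `(user_id, url, timestamp)` record layout, the function name `count_coordinated_shares`, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def count_coordinated_shares(shares, window_seconds):
    """Count pairs of shares of the same URL, posted by different users,
    whose timestamps are at most `window_seconds` apart.

    `shares` is an iterable of (user_id, url, timestamp) tuples; this
    layout is an assumption for illustration.
    """
    by_url = defaultdict(list)
    for user_id, url, ts in shares:
        by_url[url].append((ts, user_id))

    window = timedelta(seconds=window_seconds)
    pairs = 0
    for posts in by_url.values():
        posts.sort()  # chronological order within each URL
        for i, (ts_i, user_i) in enumerate(posts):
            for ts_j, user_j in posts[i + 1:]:
                if ts_j - ts_i > window:
                    break  # every later post is even further away
                if user_i != user_j:
                    pairs += 1
    return pairs


# Toy data (made up): three users sharing the same hypothetical URL.
t0 = datetime(2022, 3, 1, 12, 0, 0)
toy_shares = [
    ("u1", "https://example.com/a", t0),
    ("u2", "https://example.com/a", t0 + timedelta(seconds=4)),
    ("u3", "https://example.com/a", t0 + timedelta(seconds=9)),
]

for window in (5, 10, 60):
    print(window, count_coordinated_shares(toy_shares, window))
```

On the toy data, the 5-second window catches two of the three possible user pairs, while the 10-second and 60-second windows catch all three, which is the same qualitative effect described above.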
Above are 2 examples of users who display anomalous tweeting activity spanning almost all hours of the day.
Note: User 4484 is part of User Group B1, which engaged in 5-second coordinated activity. Its Tweets per Hour distribution indicates bot-like behaviour, with significant activity spanning all hours of the day.
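A Tweets per Hour distribution of this kind can be sketched as follows. This is a minimal illustration, assuming each user's tweets are available as a list of `datetime` objects in a single time zone; the helper names `tweets_per_hour` and `active_hours` are hypothetical.

```python
from collections import Counter
from datetime import datetime


def tweets_per_hour(timestamps):
    """Return a 24-element list: tweets posted in each hour of the day (0-23)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]


def active_hours(timestamps, min_tweets=1):
    """Number of distinct hours of the day with at least `min_tweets` tweets."""
    return sum(1 for n in tweets_per_hour(timestamps) if n >= min_tweets)


# Toy data (made up): one account posting in every hour, one in two hours only.
round_the_clock = [datetime(2022, 3, 1, hour, 0) for hour in range(24)]
daytime_only = [
    datetime(2022, 3, 1, 12, 5),
    datetime(2022, 3, 2, 12, 30),
    datetime(2022, 3, 1, 13, 0),
]

print(active_hours(round_the_clock), active_hours(daytime_only))
```

An account whose `active_hours` value stays close to 24 even with a higher `min_tweets` threshold would be a candidate for the round-the-clock pattern described above; the threshold itself is a judgment call.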
Some users, e.g. User 66550, display higher-than-average sharing during select hours of the day. Note that User 66550 is part of User Group A4.
It is interesting to note that during Hours 12-14 this user has exceptionally high engagement, unmatched by any other user in Groups A1-A4 and B1-B3: the other users in these groups have cumulatively shared fewer than 300 Tweets during any particular hour of the day.
The 2 users above, both belonging to Group A1, exhibit nearly the same cumulative tweets per hour over the entire timespan of data collection.
For some users, Tweet activity is limited to certain time periods.
Interestingly, Users 42190 & 68965, which are part of User Group A1, show nearly the same tweet-sharing activity as early as 13 April 2021, when they shared the article titled “US should stay clear of Crimea & Black Sea ‘for its own good,’ says Moscow's deputy FM, as American warships move closer to Russia”.
User 837, which is part of User Group B1, also displays this trend: significant tweet sharing only begins on February 27 and continues till April 2.
User 8799 has only 3 tweets in the 11 months from March 17, 2021 till February 2022. Interestingly, it then starts sharing more than 10 Tweets on average, consistently, from February 24, 2022 (which coincides with a major event, i.e. the start of “special military operations” by Russia), and this lasts till March 31, 2022.
This suggests a deliberate effort to shape public narratives and influence discourse surrounding particular events and news cycles, aligning with its perspectives and objectives.
Peak sharing by User 8799, at 33 tweets, occurs on March 5, when Russian armed forces announced a ceasefire to allow around 200,000 civilians to evacuate Mariupol, which lacked water and electricity. (Source: Wikipedia)
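Daily counts and the peak sharing day for a single account, as discussed above, could be derived along these lines. This is only a sketch: it assumes a list of `datetime` timestamps per user, and the names `daily_counts` and `peak_day` are hypothetical.

```python
from collections import Counter
from datetime import date, datetime


def daily_counts(timestamps):
    """Map each calendar date to the number of tweets posted on it."""
    return Counter(ts.date() for ts in timestamps)


def peak_day(timestamps):
    """Return (date, tweet_count) for the busiest day, or None if there are no tweets."""
    counts = daily_counts(timestamps)
    if not counts:
        return None
    return max(counts.items(), key=lambda item: item[1])


# Toy data (made up): two tweets on one day, one on the next.
toy_timestamps = [
    datetime(2022, 3, 5, 10, 0),
    datetime(2022, 3, 5, 11, 0),
    datetime(2022, 3, 6, 9, 0),
]

print(peak_day(toy_timestamps))
```

Comparing the peak dates with a timeline of news events is then a manual step, as in the reading of User 8799 above.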
Note: While we were constrained in our ability to collect data from deleted accounts due to rate limits, we present a preliminary analysis of the data that we did collect, finding that they yield important signals to identify common narratives promoted by such networks. We collect and spotlight some of the patterns we found among these accounts as a roadmap towards building a more comprehensive analytics pipeline for the reader's benefit with the explicit disclaimer that the data on account-level interactions are only a fraction of their potential total interactions on Twitter.
The goal is to analyse the top 0.1% of the Deleted Users in our dataset, ranked by number of tweets, and their engagement patterns throughout their lifespan on Twitter.
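Selecting the most prolific accounts can be sketched as below. This is an illustration under stated assumptions: tweet counts are held in a plain dictionary keyed by user ID, and the function name `top_fraction_by_tweets` is hypothetical.

```python
def top_fraction_by_tweets(tweet_counts, fraction=0.001):
    """Return the (user_id, tweet_count) pairs in the top `fraction` of users,
    ranked by tweet count in descending order.

    `tweet_counts` maps user_id -> number of tweets (assumed layout).
    At least one user is returned whenever the mapping is non-empty.
    """
    k = max(1, round(len(tweet_counts) * fraction))
    ranked = sorted(tweet_counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Toy data (made up): 2000 users whose tweet count equals their numeric ID.
toy_counts = {user_id: user_id for user_id in range(1, 2001)}

print(top_fraction_by_tweets(toy_counts))
```

With 2000 toy users, the top 0.1% is the two accounts with the most tweets.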
Activity rises from slightly elevated levels in December 2021, when news of troop deployment started to circulate, to unusually high levels from February to March 2022, when a full-fledged 'special military operation' was under way.
The last period of high activity for all of the above accounts is seen around March 31.
After this, 2 of these accounts are deleted or deactivated by Twitter at exactly the end of March, while the other 2 accounts lie low until April with nominal activity.
Only 2 of the 10 URLs had more tweets from Deleted Users alone than the mean. This matters because there were around 20.6k Deleted Users in our dataset.
All these urls had more than 100 overall interactions.
However, only a few URLs had overall interactions above the 97th percentile, possibly indicating the presence of unsuccessful campaigns.
While this graph is based on all URLs tweeted by deleted users, it would be interesting to contrast it with the URLs involved in the 5-second coordinated activity by groups that include Deleted Users.
Interestingly, all of these URLs fall in the top 3 percent. This further supports the claim that these URLs were in fact involved in coordinated activity.
Only 1 article had deleted user interactions more than the mean of overall interactions (for all urls, by both active and deleted users).
For the other URLs, deleted user interactions alone were more than half of the mean overall interactions.
A clear pattern can be seen: a Deleted User first posts an article URL, and that post then receives further engagement from a set of users.
Then, after a certain period, the account is deleted. If we are certain that the account was deleted by the profile owner, then, as mentioned earlier, this potentially hints at an attempt to hide or cover up the dissemination of information.
The encircled users have a high number of posts; the maximum is by Deleted User 73572, with over 99 tweets throughout its lifespan.
Of the 5 users with the most retweets of content posted by User 19702, 3 are deleted users, and we recognise User 27148 from earlier plots.
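The comparisons above, against the mean of overall interactions and against a percentile threshold, could be computed along these lines. This is a minimal sketch, assuming interaction counts are stored in dictionaries keyed by URL; `percentile_rank`, `summarize_urls`, and the toy numbers are illustrative and not taken from the dataset.

```python
from bisect import bisect_right
from statistics import mean


def percentile_rank(ordered, x):
    """Percentage of values in the pre-sorted list `ordered` that are <= x."""
    return 100.0 * bisect_right(ordered, x) / len(ordered)


def summarize_urls(overall, deleted_only, percentile_cutoff=97.0):
    """For each URL, report whether its overall interactions exceed the
    percentile cutoff and whether deleted-user interactions alone exceed
    the mean of overall interactions.

    `overall` and `deleted_only` map url -> interaction count (assumed layout).
    """
    overall_mean = mean(overall.values())
    ordered = sorted(overall.values())
    summary = {}
    for url, count in overall.items():
        summary[url] = {
            "above_cutoff": percentile_rank(ordered, count) > percentile_cutoff,
            "deleted_above_mean": deleted_only.get(url, 0) > overall_mean,
        }
    return summary


# Toy data (made up): 100 hypothetical URLs with 1..100 overall interactions.
toy_overall = {f"url{i}": i for i in range(1, 101)}
toy_deleted = {"url100": 60, "url99": 30}

toy_summary = summarize_urls(toy_overall, toy_deleted)
print(toy_summary["url100"], toy_summary["url99"])
```

Note that several definitions of a percentile exist; the "share of values less than or equal to x" definition used here is one simple choice, not necessarily the one behind the figures in this article.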
New ego influencers, each with a dedicated network of its own, can be seen.
A similar pattern appears here: Deleted Users have a set of alters (users) who engage with their posts.
We calculated toxicity scores for posts using the Perspective API, which provides a score ranging from 0 to 1 quantifying the level of toxicity in the tweet text.
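The ego-and-alters structure described above can be sketched with plain dictionaries. This is an illustration only: it assumes retweets are available as `(retweeter_id, original_author_id)` pairs and that the set of deleted user IDs is known; the function names and toy IDs are hypothetical.

```python
from collections import Counter, defaultdict


def build_ego_networks(retweets):
    """Map each original author (ego) to a Counter of the users (alters)
    who retweeted them.

    `retweets` is an iterable of (retweeter_id, original_author_id) pairs.
    """
    networks = defaultdict(Counter)
    for retweeter, author in retweets:
        networks[author][retweeter] += 1
    return networks


def top_alters(networks, ego, k=5):
    """Return the k alters who retweeted `ego` most often, with their counts."""
    return networks.get(ego, Counter()).most_common(k)


def deleted_among_top(networks, ego, deleted_users, k=5):
    """Count how many of the ego's top-k alters are deleted users."""
    return sum(1 for user, _ in top_alters(networks, ego, k) if user in deleted_users)


# Toy data (made up): ego "e1" retweeted by three alters.
toy_retweets = [
    ("a", "e1"), ("a", "e1"), ("a", "e1"),
    ("b", "e1"), ("b", "e1"),
    ("c", "e1"),
]
toy_deleted_users = {"a", "c"}

toy_networks = build_ego_networks(toy_retweets)
print(top_alters(toy_networks, "e1", k=2))
print(deleted_among_top(toy_networks, "e1", toy_deleted_users, k=2))
```

A graph library would be a natural choice for drawing the network plots; the dictionaries here only capture the counting behind them.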
Toxicity of a url is calculated by taking the mean of toxicity scores of all the tweets that contain the url.
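The per-URL aggregation can be sketched as follows. The sketch assumes tweet-level toxicity scores have already been obtained and are available as `(url, score)` pairs; it does not call the Perspective API, and the name `url_toxicity` and the toy scores are illustrative.

```python
from collections import defaultdict
from statistics import mean


def url_toxicity(scored_tweets):
    """Return url -> mean toxicity over all tweets containing that URL.

    `scored_tweets` is an iterable of (url, toxicity_score) pairs, with each
    score in [0, 1] and computed beforehand (assumed layout).
    """
    scores = defaultdict(list)
    for url, score in scored_tweets:
        scores[url].append(score)
    return {url: mean(values) for url, values in scores.items()}


# Toy data (made up): two hypothetical URLs with precomputed tweet scores.
toy_scored = [
    ("https://example.com/a", 0.25),
    ("https://example.com/a", 0.75),
    ("https://example.com/b", 0.5),
]

print(url_toxicity(toy_scored))
```

A tweet that contains several URLs would contribute its score to each of them under this layout.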
The analysis presented in this research was conducted by Jhagrut Lalwani during his affiliation with SimPPL from October 2022 to May 2023, and subsequently reviewed by the team.
This report presents the preliminary insights from our analysis of coordinated behavior in the sharing of articles from two state-backed media outlets with the intention of highlighting the sharing patterns we observed that boost certain narratives, and the nuanced behavior of groups of accounts that participated in promoting similar articles.
We originally planned to perform a similar analysis on articles from an expanded set of media sources to provide a broader perspective on coordinated sharing. However, given the rate limits on Twitter and, ultimately, the removal of academic API access from the platform, our ability to collect further data on coordinated networks sharing articles from other media outlets was severely restricted.
On behalf of SimPPL, I extend my gratitude to Ippen Digital (DE) and The Times and The Sunday Times (UK) for their partnership in creating Parrot Report.
I would like to express my thanks to FOCUS Data Project for generously providing us with a comprehensive list of urls of articles.