Analyzing Article Sharing by User Groups and Individuals

Jul 8

Welcome to the first installment of my "Unraveling Coordinated Behaviour" series. Here I take a deep dive into one of the projects I have worked on at SimPPL.

Introduction

Large language models can produce text that is more coherent and cheaper to produce, and that can persuade humans better than human-generated content. As more people become aware of how cheaply highly realistic text and multimedia content can now be generated, trust in content posted to the internet is declining. Fact-checkers, in turn, face the incredibly challenging, if not impossible, task of telling real content apart from artificially generated content. This poses a great risk during volatile events such as conflicts and civic crises, when a massive influx of information arrives in very short intervals of time and it is impossible to verify every narrative before it spreads virally through social networks.

Historically, websites like Wikipedia and Iffy News have maintained lists of reliable and unreliable media providers in an effort to introduce some ‘credible’ perspectives into the discourse. These reliability scores are typically determined by evaluating a media organization's history of information sharing, which invites the obvious criticism of “who fact-checks the fact-checkers” and effectively questions any single authority as an arbiter of truth. To sidestep this debate, we take the case of two so-called ‘unreliable’ media providers and assess the nature of the information spread they participate in on a popular microblogging platform. We try to answer the question of “how” their posts are spread, which complements the question fact-checkers tend to ask: “what” information are they publishing?

The flow of information on social media and the role of influential actors in amplifying content are crucial topics for platforms and researchers to investigate. We study them without relying on reliability scores or other credibility-focused metrics that are based on content alone. For example, detecting bot-like behavior is essential for understanding the authenticity of online interactions around an outlet. We emphasize that an outlet need not be driving these behaviors itself, but they are useful signals because they correlate with what we believe to be coordinated inauthentic behavior. We offer this analysis as a general method for studying how information from any media organization spreads on social media, taking the case of two state-backed media providers in Russia given the need to study the narratives relating to the invasion of Ukraine. Advancing civic integrity requires open-access tools for analyzing coordinated activity. Such tools, including our platform, help identify the patterns, trends, and impact of coordinated actors, enabling informed decision-making and safeguarding the integrity of public discourse.

Executive Summary

About our Dataset

Types of Tweets

Media Sources of URLs

Coordinated Activity: Engagement at the same second (strictest condition)

Natural Next Directions to Explore

There are three things we need to look into:

  1. Identify the periods of heightened coordinated activity and, where possible, their causes.
  2. Further analyse how aligned users participate in coordinated activity across multiple groups.
  3. Widen the time window that defines coordinated activity.

Timeline of Coordinated Activity by Top 5 Pairs with On-Ground Events

  • This time series plot represents same-second coordinated tweets between pairs of users, with each dot indicating a specific instance.
  • The alignment of tweets from multiple user pairs indicates the potential occurrence of large-scale synchronized activity involving multiple user groups.
  • Note: In certain instances, line traces representing specific on-ground events, sourced from this Wikipedia page, have been included to provide further context.
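
To make the same-second condition concrete, here is a minimal sketch of how such pairs could be counted. It is not the project's actual pipeline: the record layout `(user, url, timestamp)`, the sample data, and the function name `same_second_pairs` are all hypothetical, and it assumes that "coordinated" means two users sharing the same URL within the same second.

```python
from collections import Counter, defaultdict
from datetime import datetime
from itertools import combinations

# Hypothetical sample records: (user, url, ISO-8601 timestamp).
tweets = [
    ("alice", "https://example.org/a", "2022-03-01T10:00:00"),
    ("bob",   "https://example.org/a", "2022-03-01T10:00:00"),
    ("carol", "https://example.org/a", "2022-03-01T10:00:03"),
    ("alice", "https://example.org/b", "2022-03-02T08:30:15"),
    ("bob",   "https://example.org/b", "2022-03-02T08:30:15"),
]

def same_second_pairs(records):
    """Count how often each pair of users shared the same URL in the same second."""
    # Assumption: a "coordinated instance" is keyed by (url, second).
    buckets = defaultdict(set)
    for user, url, ts in records:
        second = datetime.fromisoformat(ts).replace(microsecond=0)
        buckets[(url, second)].add(user)

    pair_counts = Counter()
    for users in buckets.values():
        # Every unordered pair of distinct users in a bucket is one instance.
        for pair in combinations(sorted(users), 2):
            pair_counts[pair] += 1
    return pair_counts

pair_counts = same_second_pairs(tweets)
top_pairs = pair_counts.most_common(5)  # the five most frequently aligned pairs
print(top_pairs)
```

Each entry in `top_pairs` corresponds to one pair of users, and each counted instance would be one dot for that pair on a timeline like the one described above.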

Coordinated Activity (engagements at the same second) by Users Across Multiple Groups

Coordinated Activity: Engagement within a 5-second interval

  • Orange: Top 5 User Groups (A1-A5) that engaged in same-second coordinated sharing on multiple occasions.
  • Blue: Top User Groups (B1-B3) exhibiting coordinated sharing within a 5-second interval, indicating emerging patterns of engagement.

Timeline of Coordinated Activity by User Groups with On-Ground Events

  • This time series plot represents 5-second coordinated tweets between pairs of users, with each dot indicating a specific instance.
  • The alignment of tweets from multiple user pairs indicates the potential occurrence of large-scale synchronized activity involving multiple user groups.
  • Note: In certain instances, line traces representing specific on-ground events, sourced from this Wikipedia page, have been included to provide further context.

Coordinated Activity (retweets within a 5-second interval) by Users across Multiple Groups

  • The line graph shows the number of followers, scaled at 5k per unit.
  • To reduce computational cost, tweets are grouped into time blocks to identify potentially coordinated tweets among users within a 5-second interval, at the cost of possibly missing some tweets.
  • Listed count refers to the number of public lists on which a user's account has been included by other users.
  • Having a high listed count can indicate that the user has a strong presence, expertise, or influence in a particular field or topic within the social media community.
Example Instance 1 Example Instance 2
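
The time block approach mentioned above can be read in more than one way. The sketch below shows one plausible interpretation, not necessarily the one used in this project: timestamps are floored into fixed 5-second blocks, and only users that fall into the same block for the same URL are compared. The record layout, the sample data, and the name `block_pairs` are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations

BLOCK_SECONDS = 5  # width of each time block, matching the 5-second interval

def block_pairs(records, block_seconds=BLOCK_SECONDS):
    """Return (user_a, user_b, url, block) for users sharing a URL in the same block.

    records: iterable of (user, url, unix_timestamp_in_seconds).
    """
    blocks = defaultdict(set)
    for user, url, ts in records:
        # Floor the timestamp into a fixed block, e.g. 100..104 -> block 20.
        blocks[(url, int(ts) // block_seconds)].add(user)

    pairs = set()
    for (url, block), users in blocks.items():
        for a, b in combinations(sorted(users), 2):
            pairs.add((a, b, url, block))
    return pairs

# Hypothetical sample data.
records = [
    ("u1", "https://example.org/a", 100),
    ("u2", "https://example.org/a", 103),  # same block as u1
    ("u3", "https://example.org/a", 104),  # same block as u1 and u2
    ("u4", "https://example.org/a", 106),  # next block, although only 2 s after u3
]

pairs = block_pairs(records)
for pair in sorted(pairs):
    print(pair)
```

Bucketing needs a single pass over the tweets instead of comparing every tweet with every other, which is where the saving comes from. The trade-off is visible in the sample: `u3` and `u4` are only two seconds apart but land in adjacent blocks, so that pair goes undetected. This is one way a fixed-block scheme can lose data.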

Comparative Analysis of User Groups for different Time Intervals

Article Engagement Analysis

Coordinated Activity (retweets within a 5-second interval) by Users for URLs with the Fewest Total Interactions (tweets)

Anomalous pattern of sharing the same URLs over large timespans

A small cohort of users propels the engagement of articles with comparatively shorter durations

Disclaimer

Acknowledgements

References

Thanks for reading! Keep exploring. Stay tuned.