Jemma Ward

Investigating Disinformation Online using OSINT

Disinformation and ‘fake news’ have infiltrated our vocabulary in a big way over the last several years. In this blog, we’ll continue to investigate the concepts of information validation and verification – this time, with a focus on investigating inauthentic content.


We’ll look at OSINT tools and tradecraft to identify and investigate information operations and campaigns in the online space, including:

  • Resources for understanding the online operating environment

  • Collective tools for keyword and hashtag analysis

  • Pivoting from usernames and images

  • Investigating disinformation websites and domain metadata

We will also consider some of the challenges facing analysts and researchers who seek to identify and analyse inauthentic content online.

Disinformation in History


In or around 1903, a notorious antisemitic hoax known as The Protocols of the Elders of Zion first appeared in a Russian newspaper called Znamya (The Banner). Over the next few decades, the text was circulated in Russia and the West. In 1921, the London Times debunked the document, showing that it was plagiarised from other texts, including a French political satire. Nonetheless, the Protocols were used in Nazi propaganda and schools to inculcate an antisemitic ideology. Since then, the Protocols have been leveraged by antisemitic groups to continue to incite fear and hatred.


The very fact that inauthentic content published over a hundred years ago is still used to promote ideological narratives and manipulate audiences reflects how pervasive disinformation campaigns can be. Even after ‘fake news’ is soundly debunked (as the Times demonstrated way back in 1921), the content can still be used to sway and influence, and to permanently muddy the waters of the information environment.


THE TIMES, AUGUST 17, 1921

The Origins of Disinformation


The English term ‘disinformation’ originates from the Russian term Дезинформация, transliterated as dezinformatsiya, which was used as early as the 1920s to describe official dissemination of false information to influence public opinion. Disinformation isn’t a new phenomenon – ‘fake news’ and inauthentic content have been used to influence and manipulate for as long as we can remember.


Investigating Disinformation Campaigns and Inauthentic Content


So how do we begin to identify inauthentic content in the online space? Where do we look? And what are we looking for? Inauthentic content spans a wide range of topics – from macabre urban legends to bizarre conspiracy theories, to perhaps well-meaning but misinformed health advice. For researchers and investigators looking at disinformation campaigns, identifying key terms, narratives, and methods for disseminating and amplifying inauthentic content can not only help to debunk and counter disinformation, but can also help identify the actors that seek to influence the information environment.


Definitions

  • Disinformation is false information that is deliberately spread to mislead others.

      • It is constructed and spread to serve a particular purpose as part of information operations and warfare.

      • The intent of disinformation is to manipulate – the agent of disinformation knows that it is untrue.

  • Misinformation is inaccurate information that is unintentionally shared.

      • It is often a strategic output of disinformation – directed disinformation leads to the spread of misinformation.

      • The original intent or agenda is unknown to the disseminator.

  • Propaganda is the use of communications and information to promote a political or ideological agenda.

      • Propaganda is often biased, but it may not necessarily be untrue. Facts may be misrepresented to support or promote narratives about the world.

  • Malinformation is when mainly factual or true information is used to cause harm to a person, country, or organisation.

      • Malinformation may present true information in a disingenuous way to achieve a divisive agenda.

Understanding the Online Operating Environment


Disinformation campaigns might not be new, but the internet has drastically altered the scale and effect of information operations. Having a thorough understanding of the online operating environment is essential for researchers and investigators seeking to understand how, where and why inauthentic content is disseminated online.


When planning an information operations investigation, we should ask: which platforms are being used to disseminate and amplify inauthentic content?


Social media penetration varies significantly by country. DataReportal publishes reports on digital trends and social media usage, which can help answer questions about platform usage and reach for specific demographics: https://datareportal.com/


Collective tools for keyword and hashtag analysis


Understanding key terms, phrases, symbols, and imagery that are used to promote and spread disinformation lets us identify specific platforms and accounts of interest and gives us insight into the (often vulnerable) audiences that are being targeted.


Google Trends (https://trends.google.com/) can be used to highlight changing narratives and key topics within the disinformation ecosystem over time. In the below search, we compared the use of three key searches related to COVID-19 vaccine disinformation and conspiracy theories. We can see when topics or messages likely entered the conversation, and when particular terms declined in popularity – this can help us refine our keyword searches across social media platforms to identify new inauthentic content, and test our assumptions about prevailing narratives and conspiracy theories.
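For repeatable trend queries, this kind of comparison can also be scripted. Below is a minimal sketch using pytrends, an unofficial Python client for Google Trends (pip install pytrends) – the search terms are illustrative placeholders rather than the exact queries from our comparison.

```python
# Minimal sketch: compare interest over time for several search terms
# using pytrends, an unofficial Google Trends client.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-US", tz=0)

# Illustrative placeholder terms - substitute the narratives you are tracking.
keywords = ["vaccine microchip", "5g covid", "plandemic"]
pytrends.build_payload(keywords, timeframe="2020-01-01 2022-12-31")

interest = pytrends.interest_over_time()  # pandas DataFrame, one column per term
print(interest.drop(columns=["isPartial"]).idxmax())  # date each term peaked
```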

Collective tools like the Social Media Analysis Toolkit (https://www.smat-app.com/) can be a way to quickly gauge whether unique phrases, keywords, or hashtags are being used to spread or amplify disinformation across different platforms. While public queries are now limited to data more than six months old, the Social Media Analysis Toolkit can provide useful insights into the timelines of disinformation narratives and campaigns.


In the image below, we looked at the rise of the hashtag #diedsuddenly on Telegram over a nine-month period from February to October 2022 – prior to the online release of the so-named anti-vaccine “documentary” in November 2022. This kind of high-level information can help us either narrow or broaden our search parameters on social media platforms.
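For analysts who want to pull this kind of timeline programmatically, SMAT also exposes a public API. The endpoint and parameter names in the sketch below are assumptions based on the interactive documentation at https://api.smat-app.com/docs, so verify them against the docs before relying on the output.

```python
# Hedged sketch: request a per-day timeline of posts containing a hashtag
# from SMAT's public API (endpoint and parameters assumed - check the docs).
import requests

resp = requests.get(
    "https://api.smat-app.com/timeseries",
    params={
        "term": "#diedsuddenly",  # hashtag or phrase of interest
        "site": "telegram",       # platform to query
        "since": "2022-02-01",
        "until": "2022-10-31",
        "interval": "day",
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())  # per-interval counts for charting the narrative's rise
```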

Hashtag analysis can also alert us to the use of specific techniques by disinformation actors, such as hashtag hijacking. Hashtag hijacking occurs when popular or trending hashtags are leveraged to spread unrelated information (which may be propaganda, disinformation, or just regular old spam). In the example below, neutral hashtags with broad reach are used alongside well-known COVID-19 disinformation hashtags.

COVID-19 conspiracy content posted on Twitter.
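Spotting hashtag hijacking at scale is largely a counting exercise: tally which hashtags co-occur with a known disinformation hashtag across your collected posts. A minimal sketch, using invented sample posts in place of real collected data:

```python
# Count hashtags that co-occur with a known disinformation hashtag.
# The sample posts are invented; in practice they come from your collection.
import re
from collections import Counter

posts = [
    "Big match tonight! #WorldCup #football #plandemic",
    "Amazing goal #WorldCup #plandemic #wakeup",
    "Fan zone photos #WorldCup #football",
]

target = "#plandemic"  # known disinformation hashtag (example)
co_occurring = Counter()
for post in posts:
    tags = set(re.findall(r"#\w+", post.lower()))
    if target in tags:
        co_occurring.update(tags - {target})

# Popular, neutral tags appearing alongside the target suggest hijacking.
print(co_occurring.most_common(5))
```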

Pivoting from usernames and images


We often hear about the importance of pivoting in OSINT investigations – pivoting is a great way of identifying the provenance of inauthentic content, or identifying other accounts and platforms used to spread a message. Once we identify one source of inauthentic content – such as a domain, image, or username – we can use OSINT tools to detect other platforms and accounts of interest.


Whatsmyname.app (https://whatsmyname.app/) is a username enumeration tool that queries hundreds of platforms. In the example below, the handle from a Telegram channel used to share (mainly) COVID-19 related conspiracy theories returns both current and archived account details for multiple platforms. As a result, we can quickly gain an understanding of platforms of interest and identify further data points for our investigation.
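Under the hood, WhatsMyName works from a community-maintained JSON file of site definitions, which we can query directly. The file location and field names in the sketch below reflect the WhatsMyName GitHub repository at the time of writing and may change; the handle is a hypothetical example.

```python
# Sketch: check a handle against a slice of the WhatsMyName site list.
# File path and schema are assumptions based on the public repository.
import requests

WMN_DATA = "https://raw.githubusercontent.com/WebBreacher/WhatsMyName/main/wmn-data.json"
username = "example_handle"  # hypothetical handle of interest

sites = requests.get(WMN_DATA, timeout=30).json()["sites"]
for site in sites[:25]:  # limit the sweep while experimenting
    url = site["uri_check"].format(account=username)
    try:
        r = requests.get(url, timeout=10)
    except requests.RequestException:
        continue
    # A hit requires both the expected status code and the expected string.
    if r.status_code == site["e_code"] and site["e_string"] in r.text:
        print(f"possible account on {site['name']}: {url}")
```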


Reverse image searching, covered in our last OSINT Combine blog on Image Analysis and Verification, is a valuable method for identifying where disinformation imagery has been shared or posted online. This may also shed light on instances of misinformation, where inauthentic content in image form is reposted by users who are not aware of its origin or agenda.


Investigating disinformation websites and domain metadata


In the Verifying Information Online blog, Kylie reviewed content from the pro-Russian ‘think tank’ Katehon and tested whether the information in its articles about Ukrainian leader Volodymyr Zelenskyy could be corroborated by other sources. Undertaking in-depth evaluation of suspected inauthentic content using a framework like R2C2 (Relevance, Reliability, Credibility, and Corroboration) is an excellent method for detecting and analysing disinformation and propaganda.


Once we’ve identified an online entity as a source of inauthentic content, what comes next? In a disinformation investigation, we may want to glean further information about the online presence of the actor we’ve found, to identify:

  • Locations

  • Related organisations or institutions

  • Social media accounts

  • Connected domains and subdomains

  • Historical activity

Website investigation tools allow us to quickly identify connected domains, social media platforms, and historic information for websites used to disseminate inauthentic content.


Urlscan.io (https://urlscan.io/) is a free scanning service that browses to submitted URLs and returns information about domains and IP addresses, website resources and cookies, and outgoing links. We might use urlscan.io to identify the hosting country (based on information like IP address and ASN) of a website, as well as connected or linked domains. In the image below, we can see outgoing links from the scanned URL https://katehon[.]com/en/ to a Telegram account and a Tsargrad media group domain – if needed, we can pivot to investigate content on each of these links.
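Urlscan.io also has a documented REST API, which helps when there is more than a handful of URLs to triage. A minimal sketch of the submit-then-poll workflow (you will need a free API key; the key below is a placeholder):

```python
# Sketch: submit a URL to urlscan.io, poll for the result, and pull out
# hosting details and contacted domains. Requires a free API key.
import time
import requests

API_KEY = "YOUR_URLSCAN_API_KEY"  # placeholder
headers = {"API-Key": API_KEY, "Content-Type": "application/json"}

# 1. Submit the scan ("public" visibility shares results with the community).
scan = requests.post(
    "https://urlscan.io/api/v1/scan/",
    headers=headers,
    json={"url": "https://katehon.com/en/", "visibility": "public"},
    timeout=30,
).json()

# 2. Poll for the result; the API returns 404 until the scan has finished.
result_url = f"https://urlscan.io/api/v1/result/{scan['uuid']}/"
for _ in range(12):  # poll for up to roughly one minute
    r = requests.get(result_url, timeout=30)
    if r.status_code == 200:
        result = r.json()
        print(result["page"].get("ip"), result["page"].get("asn"))  # hosting
        print(result["lists"].get("domains", [])[:10])  # contacted domains
        break
    time.sleep(5)
```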

DNSDumpster (https://dnsdumpster.com/) is a domain research tool that identifies hosts related to a submitted domain. Understanding the underlying infrastructure of an online entity can help us gauge its size and reach and might alert us to pivot points (for example, the Google Analytics ID revealed in the address record below may reveal other sites for the same analytics account, and the .ru mail provider may indicate the domain’s origin).
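DNSDumpster itself is a web tool, but the core records it reports can also be gathered directly with a library like dnspython (pip install dnspython); the domain below is a placeholder.

```python
# Sketch: enumerate common DNS records for a domain with dnspython.
import dns.resolver

domain = "example.com"  # substitute the domain under investigation

for rtype in ("A", "MX", "NS", "TXT"):
    try:
        answers = dns.resolver.resolve(domain, rtype)
    except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
        continue
    for rdata in answers:
        # MX records can hint at the operator (e.g. a .ru mail provider).
        print(f"{rtype:4} {rdata.to_text()}")
```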


Builtwith.com (https://builtwith.com/) also returns details about domains, including analytic tag relationships and visualisations of IP address history. From our query on https://katehon[.]com/en/, we can see a list of domains using the same Google Analytics account, as well as websites that have shared an IP address with Katehon in the past.


As always, we want to verify our findings – virtual hosting allows multiple domains to share one IP address, website information can be outdated or incorrect, and DDoS protection services like Cloudflare can obscure real IP addresses for domains. Look at website content, archives, and related reputable media reporting to validate your findings about disinformation sources.


Challenges for OSINT analysts undertaking Disinformation Investigations


A nuanced understanding of the online environment, along with knowledge of OSINT tools and tradecraft, makes OSINT investigators uniquely equipped to identify and analyse inauthentic content online – but evolving technologies make recognising, researching, and countering online disinformation challenging.


Artificial Intelligence: the growing sophistication of generative AI models means that convincing inauthentic content can be generated quickly and easily. Photorealistic images created with image generators like Midjourney add a further layer of complexity to the information environment – especially when we remember that many people view online content on handheld devices, making it tricky to pick up on inconsistencies or visual flaws.

Images posted on Twitter by user @WebCrooner on 8 Feb 2023 – anything seem amiss here?

Scale and pace of messaging: as internet users, we are confronted with huge amounts of information every time we venture online – and some of it is likely to be false. Both state and non-state actors use inauthentic content to divide, manipulate, persuade, and influence, and the integration of bots, trolls and spam networks into their information operations means that they can disseminate and amplify messages at a scale never seen before.


As OSINT analysts and investigators, the sheer amount of data that might be linked to any one disinformation campaign or narrative can be, at best, intimidating, and at worst, overwhelming. We don’t have the time (or the word count) in this blog to introduce the topic of bulk data analysis and visualization methods, but understanding how to collect, clean, filter, and visualize massive data sets can help analysts to make sense of vast amounts of content.


And finally, one of the most daunting challenges for those who seek to investigate and expose disinformation…


Our brains. The continued circulation of a notorious antisemitic text in conspiracy theory and extremist circles, a century after its exposure as a fabrication, is a grim reminder that fact-checks and debunking aren’t a panacea for arresting the spread of disinformation.


Cognitive laziness and confirmation bias both play a role in our readiness to accept inauthentic content as true or correct. When confronted with new information, people are liable to accept and trust first impressions, and may even double down on their beliefs over time when later confronted with more truthful data. Audiences are also more likely to accept falsehoods when they can see accompanying ‘evidence’ – even if that evidence is non-existent, manufactured, or manipulated. As OSINT analysts, ensuring that we always validate and verify our information, and leverage source evaluation frameworks like R2C2, lets us dive into the world of disinformation with our eyes wide open.


Key Takeaways

  • Disinformation isn’t new, but the internet has made the information environment far more complex and volatile.

  • Identifying social media platforms of interest, and understanding their reach and penetration in specific locations, lets us more effectively choose which sources to focus on.

  • Using keyword and hashtag searches can help to identify platforms and accounts of interest, along with tactics and techniques used by actors who spread or amplify disinformation.

  • Using OSINT tools and techniques to pivot on information of interest can assist with disinformation investigations.

  • Website investigation tools can help us profile the online presence of entities spreading and amplifying disinformation – but always remember to validate your findings!

  • The rapidly evolving online environment adds to the existing challenges of countering disinformation and inauthentic content.

To support OSINT collection and analytical capability uplift, please look at our in-person training courses, or our online, self-paced options, including our Detecting, Collecting and Analysing Disinformation course. Alternatively, contact us at training@osintcombine.com to learn about our bespoke training offerings.


