Open source techniques and how to use them

CIR

What are the initial steps in an open source investigation? Which open source techniques are used by our analysts?

Imagine you hear of an explosion on the outskirts of Kabul on the news. As an analyst, your immediate questions might be: What happened and how? Are there any victims? Where and when did the incident(s) occur? Who are the perpetrators?

Close analysis of any videos or photographs taken of the explosion can give analysts and investigators a wealth of information, such as landmarks, the surrounding environment, the terrain or weather, people in the frame, or any dialogue exchanged. By collecting such information, analysts are able not only to confirm that an incident occurred in the first place, but also to begin building an understanding of associated events.

Uncovering executions in Panjshir

In October 2022, Afghan Witness (AW) conducted a major investigation into evidence of summary executions in Panjshir. Investigators traced the exact site where Taliban fighters were filmed executing five men to a mountaintop in Panjshir’s Dara District. AW’s findings later conclusively linked the same group of Taliban to the executions of another five men.

The investigation was covered by international media and cited in a report by the UN Special Rapporteur, who stated that the findings aligned with multiple sources, confirming a pattern of extrajudicial killings of individuals affiliated with the National Resistance Front by the Taliban. The investigation demonstrates the power of open source investigation to complement human testimony – providing a detailed timeline of the associated events, and even going as far as pinpointing the group responsible.

 

Figure: Footage of the site where Taliban fighters were filmed executing five men (left), Geolocation of site using Google Earth [35.338321, 69.697990] (right)

Step 1: Collecting information

Open source investigations usually begin with a search, most often on social media platforms, where investigators can identify relevant user-generated content (UGC) such as photographs and videos. To make information collection more efficient, analysts use ‘advanced search operators’ – commands that enable them to refine searches for more specific results. These include:

  • Grouping search terms into long search strings using brackets.
  • Adding Boolean operators such as ‘AND’, ‘OR’, or ‘-’, which broaden or limit searches.
  • Adding a timeframe to filter out any unnecessary information.
  • Using platforms such as TweetDeck to monitor search results for events over a period of time.
  • Conducting a reverse search on an image to find where it originated on the internet. Checking whether content is old or new is a crucial step before investigating a claim further, as this flags old footage that has been shared out of context or framed inaccurately. Content that appears to be new – or has not been recorded previously – can be analysed and investigated further at a later stage in the process. Our analysts usually use Google Lens, InVID, Yandex, Bing reverse image search, or TinEye.

 

When collecting information, you should also consider factors such as spelling variations, typos, relevant languages, transliterations, slang or informal language, and the type of search engine you are using, as these can all influence search results.
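The operators and spelling considerations above can be combined into a single search string. The sketch below is illustrative only: the helper names `or_group` and `build_query` are hypothetical, the search terms are made up for the Kabul example, and the `since:`/`until:` date filters are modelled on the syntax used by X (Twitter) search. Other platforms and search engines use different operators, so the output would need adapting.

```python
from datetime import date


def or_group(variants):
    """Join spelling variants into one bracketed OR group, quoting phrases."""
    quoted = [f'"{v}"' if " " in v else v for v in variants]
    return "(" + " OR ".join(quoted) + ")"


def build_query(groups, exclude=(), since=None, until=None):
    """Combine OR groups with AND, then append exclusions and a date window."""
    parts = [" AND ".join(or_group(g) for g in groups)]
    # A leading '-' excludes a term from the results.
    parts += [f"-{term}" for term in exclude]
    # Assumption: since:/until: date filters as used by X (Twitter) search.
    if since:
        parts.append(f"since:{since.isoformat()}")
    if until:
        parts.append(f"until:{until.isoformat()}")
    return " ".join(parts)


# Hypothetical example: spelling variants and a transliteration of one
# place name, combined with two alternative words for the event.
query = build_query(
    groups=[["Kabul", "Kabol", "کابل"], ["explosion", "blast"]],
    exclude=["football"],
    since=date(2022, 10, 1),
    until=date(2022, 10, 31),
)
print(query)
```

Keeping each set of variants in its own bracketed group makes it easy to add a new spelling or language without rewriting the rest of the string.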

 

Step 2: Protecting information

Online content can quickly be deleted by creators or platforms, so securely archiving the data you find in its original state means it can be used if it is ever needed to hold perpetrators to account. AW archives and ‘hashes’ the data it collects: anything that enters the AW database is archived upon entry by an auto-archiver, which stores the media on a secure server and automatically assigns each item a fixed-length string of letters and numbers known as a ‘hash’, computed by an algorithm called a ‘hash function’. If the data were later altered, even slightly, recomputing the hash would produce a different value from the one recorded when the data entered the database. Comparing the two values therefore makes it possible to show that archived material has not been tampered with if it is ever needed as evidence.
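To make the idea concrete, here is a minimal sketch of hashing a file and re-checking it later. It is not AW's auto-archiver: the choice of SHA-256, the function names, and the temporary stand-in file are all assumptions made for illustration.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_hash):
    """Recompute the hash and compare it with the value recorded at archiving."""
    return sha256_of_file(path) == recorded_hash


# Demonstration with a temporary stand-in for an archived media file.
with tempfile.TemporaryDirectory() as tmp:
    media = Path(tmp) / "clip.bin"
    media.write_bytes(b"original footage bytes")
    recorded = sha256_of_file(media)          # stored alongside the archive
    intact = is_unchanged(media, recorded)    # True: file is as archived

    media.write_bytes(b"original footage bytes.")  # a one-byte change
    altered_detected = not is_unchanged(media, recorded)

print(len(recorded), intact, altered_detected)
```

Reading the file in chunks keeps memory use low for large videos, and the recorded hash should be stored separately from the media so that both cannot be changed together.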

 

Steps 3 & 4: Analysing and Verifying information

Once initial steps to collect and preserve the data have been taken, analysts will analyse the content for clues on what is happening, and use open source techniques to verify as many details as possible. The process of analysing available information can be complex and time-consuming, sometimes involving going through a video frame by frame in order to gather information about the location, timing, and people involved in an incident.

When AW describes a piece of content as “verified”, it means that investigators have been able to confirm, with a high degree of confidence, the location and date of a piece of footage or a photograph. Occasionally, analysts are also able to verify other details, such as perpetrators or victims.

Visual verification can be done by geolocation and chronolocation. Geolocation is the process of matching any visible buildings, landmarks or identifiable geographical features to Google Street View or satellite imagery, allowing you to pinpoint the precise location where an image or video was captured and check whether this aligns with the claimed location. Going one step further, chronolocation is the process of analysing shadow and sun placement to narrow down the timeframe in which an incident took place. AW analysts use tools such as SunCalc for chronolocation.
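Two small calculations often sit behind these checks, sketched below under stated assumptions. The first uses the haversine formula to measure how far a geolocated point lies from a claimed location; the second uses basic shadow geometry to estimate the sun's elevation and compass direction, which can then be compared against a tool such as SunCalc. The function names are hypothetical, the claimed coordinates and shadow measurements are invented for illustration, and the shadow calculation assumes a vertical object on flat ground.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def sun_elevation_deg(object_height, shadow_length):
    """Sun elevation angle implied by a vertical object's shadow (same units)."""
    return math.degrees(math.atan2(object_height, shadow_length))


def sun_azimuth_deg(shadow_bearing_deg):
    """The sun sits opposite the compass direction in which the shadow points."""
    return (shadow_bearing_deg + 180.0) % 360.0


# Geolocated point from the Panjshir figure vs. a hypothetical claimed point.
geolocated = (35.338321, 69.697990)
claimed = (35.30, 69.70)  # illustrative only
print(round(haversine_km(*claimed, *geolocated), 2), "km apart")

# Hypothetical measurements: a 2 m post casting a 2 m shadow towards 45 degrees.
print(sun_elevation_deg(2.0, 2.0), sun_azimuth_deg(45.0))
```

The resulting elevation and azimuth only narrow down a time window; on any given day the sun usually reaches the same elevation twice, so the azimuth and other clues are needed to choose between morning and afternoon.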

Geolocating and chronolocating images and videos can provide important information about where and when an incident occurred, and potentially who was involved. However, it’s important to cross-check and triangulate these findings with other sources to confirm events with a higher degree of confidence. AW analysts strengthen and corroborate visual evidence using various methods:

  • Assessing the credibility of the source: Is the event being covered by credible news agencies, solely by individuals online, or by government propaganda accounts? Assess what this means for the veracity of the claims being made.
  • Comparing news reports against statements from other sources, such as the Taliban or resistance groups, as well as online statements by hospitals (usually regarding the number of victims) and eye witness interviews.

This practice allows us to determine how confident we are in our assessment of what happened, when, how and why it happened, and who (if anyone) was impacted.

For more detailed information on the processes of analysis and verification, read AW's investigation into how analysts used open source information to narrow down the exact location of the burial site of Taliban founder and first emir Mullah Muhammad Omar. There are also plenty of useful online guides and YouTube tutorials aimed at beginners, which explain not only the technical aspects of geolocation but also how to analyse an image or video for clues.

 

Challenges for OSINT investigations

It is important to note that not every claim found online will be verifiable. Sometimes information may not surface because of self-censorship, poor internet access, or risks to personal safety. AW has encountered this recently in the case of sharia punishments under Taliban rule in Afghanistan. While the Taliban-led Supreme Court has announced the punishments on its X page, visual, verifiable evidence has been limited – likely a result of the group’s previous warnings against photographing or filming the events.

As interest in open source investigation grows, there is also an ongoing discussion among practitioners and researchers about best practices to help reduce its misuse. Everyone working with data and information has a responsibility to act ethically by following guidelines on consent, control, and transparency.

AW follows the Berkeley Protocol on Digital Open Source Investigations, published jointly by the UN Human Rights Office (OHCHR) and the Human Rights Center at the University of California, Berkeley. The Berkeley Protocol “identifies international standards and provides guidance on methodologies and procedures for gathering, analysing, and preserving digital information in a professional, legal, and ethical manner”. These guidelines ensure that open source investigators always trace and attribute online content to its original source, comply with legal requirements and ethical norms, minimise the risk of harm to themselves and their sources, and evaluate the credibility and reliability of those sources.

 

To conclude:

Open source methods have changed the way research and investigations work. Practices of data collection and analysis previously only used by intelligence agencies or law enforcement authorities are now accessible to journalists, activists, and analysts, allowing them to conduct investigations, debunk misinformation, and reveal human rights abuses.

Though verifiable open source data on human rights violations and security incidents may only be the tip of the iceberg of events occurring in Afghanistan, open source methods, when used in conjunction with the work of journalists and organisations on the ground, can help expand the monitoring of events in the country and strengthen accountability mechanisms.
