The Internet’s Geography

Making sense of the internet as a physical entity using sensor data from IODA, GreyNoise, and Shodan

Jonathan Peyster
Tales From Decrypt

--

The internet became a less ethereal thing for me in 2006, when an earthquake off the coast of Taiwan severed an undersea internet cable that connected users in Mainland China to the web. The disruption lasted about a month and affected American students living in Beijing like me. The Chinese internet of that era was slow and wasn’t widely used, but it was a lifeline for foreigners seeking to stay connected with people and information back home. Having my digital world disrupted so dramatically by an event in the physical world helped me realize that the internet is shaped by the footprint of its physical infrastructure: something real that can be studied as it changes over time, like a living creature.

I started out trying to understand the internet through a geopolitical lens before working my way backwards to how it works on a technical level, and then to how this can be measured empirically with internet scanning data. Using primary sources to get closer to the ground truth of what is online has been an eye-opening experience for me, and I would like to keep using this blog to share some of what I’ve learned along the way, starting with a few key questions about the internet and some of my go-to sources for getting answers.

The Internet’s Health via IODA

KEY QUESTIONS: Which hosting companies, regions, or countries are experiencing disruptions in connectivity and how has this changed over time?

IODA, or Internet Outage Detection and Analysis, is a system developed at UC San Diego’s Center for Applied Internet Data Analysis (CAIDA) and now managed by Georgia Tech that fuses four types of data to detect and plot internet outages:

  • Border Gateway Protocol (BGP): identifies disruptions in the routing tables that direct how traffic flows between two endpoints, which amounts to evidence that the internet’s road map was altered
  • Active Probing: regularly pings IPs and records the responses to detect deviations from the mean, which essentially automates the process of manually checking whether websites in a specific place are reachable
  • Network Telescope: detects a decline in internet background noise (i.e., internet scanning by bots) originating from specific blocks of IPs, with unexpected silence indicating a disruption in connectivity
  • Google Search: looks for sudden, unexplained drops in Google usage from a specific area as a means of detecting or adding context to outages

By combining four distinct types of data and looking for instances where more than one indicates an outage, you get not only a more accurate indicator but also more context on the nature of the event. Doug Madory from Kentik is a great resource for research using IODA and other compelling outage and censorship detection data sources like the Open Observatory of Network Interference (OONI).
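The idea of corroborating signals can be sketched in a few lines of Python. This is only an illustration of the general principle, not IODA's actual detection logic: the readings, the 0.75 threshold, and the function names `dropped` and `corroborated_outage` are all assumptions made up for the example.

```python
from statistics import mean

# Hypothetical hourly readings for one region, normalized so 1.0 is typical.
# The keys mirror the four signal types listed above; the numbers are invented.
signals = {
    "bgp": [1.00, 0.99, 1.01, 1.00, 0.40],
    "active_probing": [1.00, 1.02, 0.98, 1.00, 0.35],
    "telescope": [1.00, 0.97, 1.03, 1.00, 0.95],
    "google_search": [1.00, 1.01, 0.99, 1.00, 0.50],
}


def dropped(series, threshold=0.75):
    """True when the latest reading is below threshold * mean of earlier readings."""
    baseline = mean(series[:-1])
    return series[-1] < threshold * baseline


def corroborated_outage(signals, min_sources=2):
    """Return (is_outage, sorted names of the signals that dropped)."""
    flagged = sorted(name for name, series in signals.items() if dropped(series))
    return len(flagged) >= min_sources, flagged


is_outage, flagged = corroborated_outage(signals)
print(is_outage, flagged)
```

Requiring agreement from at least two sources trades a little sensitivity for fewer false alarms, and the list of flagged signals hints at the nature of the event: a routing change that shows up in BGP looks different from one that only shows up as reduced user activity.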

IODA detecting an outage in Sudan and pinpointing a specific Autonomous System at fault

The Internet’s Pulse via GreyNoise

KEY QUESTIONS: What vulnerabilities are threat actors seeking to exploit? How are they attempting to do it? Which IP addresses and countries are these attempts coming from? How active have they been and when did they start?

  • GreyNoise deploys a collection of sensors around the world that lure and record malicious traffic by impersonating attractive targets: essentially, honeypots on steroids
  • Through its network, you get not only macro-level indicators of the threat landscape, such as when a vulnerability is being mass exploited, but also micro-level intelligence on malicious IPs to block or investigate
  • GreyNoise’s observed activity data is helpful because it lets you get beyond what an IP is doing and closer to the tactics, techniques, and procedures employed by the threat actor it belongs to

GreyNoise can be used to identify exploit trends but also provides extensive info on each malicious IP it detects
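To make the macro versus micro distinction concrete, here is a minimal sketch that rolls up a handful of honeypot-style events both ways. The record layout, tag names, and the helpers `macro_view` and `micro_view` are hypothetical and do not reflect GreyNoise's real data model or API; the IPs come from ranges reserved for documentation.

```python
from collections import Counter

# Invented sensor events for illustration only.
events = [
    {"ip": "198.51.100.7", "tag": "ssh-bruteforce", "day": "2024-01-03"},
    {"ip": "198.51.100.7", "tag": "example-cve-exploit", "day": "2024-01-01"},
    {"ip": "203.0.113.9", "tag": "example-cve-exploit", "day": "2024-01-02"},
    {"ip": "192.0.2.44", "tag": "example-cve-exploit", "day": "2024-01-02"},
]


def macro_view(events):
    """Count events per tag: which behaviors are trending across all sensors."""
    return Counter(e["tag"] for e in events)


def micro_view(events):
    """Build a per-IP profile: what it did, how often, and when it was seen."""
    profiles = {}
    for e in events:
        p = profiles.setdefault(
            e["ip"],
            {"tags": set(), "first_seen": e["day"], "last_seen": e["day"], "hits": 0},
        )
        p["tags"].add(e["tag"])
        # ISO-8601 date strings sort chronologically, so min/max work directly.
        p["first_seen"] = min(p["first_seen"], e["day"])
        p["last_seen"] = max(p["last_seen"], e["day"])
        p["hits"] += 1
    return profiles


trends = macro_view(events)
profiles = micro_view(events)
print(trends.most_common(1))
print(profiles["198.51.100.7"])
```

The same raw events answer both kinds of key question: the tag counts show what is being mass exploited, while each per-IP profile shows how active a source has been and when it started.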

The Internet’s Shape via Shodan

KEY QUESTIONS: Where are things online? What are they? Who do they belong to? How are they configured? How are they vulnerable to attack?

Shodan conducts an ongoing census of the internet by collecting the data that internet-facing IP addresses provide when a connection attempt is initiated. On a macro level, this makes it a great tool for understanding the internet’s shape, but on a micro level it can serve as a useful proxy for what a threat actor might be able to discover about a network through open-source intelligence gathering. Learning to understand how different types of devices and servers appear to an internet crawler is also an excellent way of experiencing the good, bad, and ugly of how networks are built and secured.

My last blog post goes in depth on using Shodan to analyze the deployment of Chinese surveillance systems across the U.S., so be sure to check that out if you are looking for an example of how this tool can be used.

Facet Analysis is a highly useful tool for quickly querying Shodan’s database for insights
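Conceptually, a facet is a "top values" count over one field of the matching records. The sketch below imitates that offline on a few invented banner records. It is an assumption-laden illustration of the idea rather than Shodan's implementation or API: the `facet` helper, the field names, and the organizations are all made up, and the real service computes facets on its own servers.

```python
from collections import Counter

# Invented banner records; IPs are from ranges reserved for documentation.
banners = [
    {"ip": "192.0.2.10", "port": 443, "product": "nginx", "org": "Example Hosting"},
    {"ip": "192.0.2.11", "port": 443, "product": "nginx", "org": "Example Hosting"},
    {"ip": "192.0.2.12", "port": 80, "product": "Apache httpd", "org": "Example Telecom"},
    {"ip": "198.51.100.20", "port": 443, "product": "nginx", "org": "Example Telecom"},
    {"ip": "198.51.100.21", "port": 554, "org": "Example Telecom"},  # no product identified
    {"ip": "203.0.113.5", "port": 80, "product": "Apache httpd", "org": "Example Telecom"},
]


def facet(records, field, top=3, **filters):
    """Return the `top` most common values of `field` among records matching `filters`."""
    matching = (
        r for r in records if all(r.get(k) == v for k, v in filters.items())
    )
    # Records that lack the field are skipped rather than counted as None.
    return Counter(r[field] for r in matching if field in r).most_common(top)


print(facet(banners, "product"))
print(facet(banners, "org", port=443))
```

Changing the filter or the faceted field turns one dataset into many quick breakdowns (by organization, port, or product), which is what makes this style of query handy for a first look at the internet's shape.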

--

I am a China SME by training and a cybersecurity obsessive by calling who led development of Dataminr’s cyber capabilities for over 5 years in my last role.