How to track a killer virus with your phone

A living database of smartphone locations—designed and shared to support ongoing pandemic research

August 4, 2020

Jake MacDonald/Minneapolis Fed

Article Highlights

  • To aid pandemic research, scholars have built an ongoing database on human movement and social contact
  • Data are from time-stamped, geolocated smartphone pings, indexed for location and for exposure to other devices
  • Indexes show sharp declines in movement and contact in March 2020, with significant regional and demographic variation

“Shelter in place.” “Social distance.”

These simple phrases express something profound: human behavior to defend against a deadly infection. Their inverses, “human movement” and “social contact,” convey equally weighty concepts: the likely route and speed of viral transmission.

Mapping that route and measuring its speed are the objectives of an ongoing project by Institute visiting scholars Jonathan Dingel of the University of Chicago and Kevin Williams from Yale, along with three colleagues. In a recent Institute working paper, they describe a rich data set they’ve created expressly for measuring human movement and social contact in the United States. And they make their data and analytical tools publicly available so that other researchers can readily use them for pandemic-related research.

The data are pinpointed, time-stamped pings emitted by smartphones, the highly personal devices that most Americans carry, almost always and everywhere. By geolocating and clocking each ping, the researchers determine each phone’s whereabouts: Where is it? What time is it? Is it in a different location than when it last pinged? (A rigorous research protocol protects phone user privacy.)

A phone that doesn’t move for days on end suggests an owner sheltering in place—whether by chance or intention. But pings that leave a trail show, like breadcrumbs, that its owner was on the move through time and space. The scholars also gauge each phone’s proximity to other phones, yielding evidence of potential interactions among people.

Movement and proximity are summarized by separate indexes, and the paper traces the paths of each index to paint a portrait of the nation’s population during the first months of the pandemic. Where and when did we move, and were we close to others?

In brief: Both indexes show major declines in travel and personal visits in March and April 2020, but regions varied significantly. Travel from New York County to other counties collapsed in March, but travel from Houston (Harris County) to elsewhere in the South and Southwest did not. Phone owners from areas with more highly educated residents reduced their travel and social contact disproportionately.

Preliminary findings discussed in the paper are intriguing in themselves, but even more so as indicators of the database’s power. For epidemiologists, economists, other researchers, and policymakers who seek information about how people are moving in relation to one another, and therefore how the virus may spread, the new database and indexes have major potential as real-time roadmaps of the American pandemic.

Do the data represent the U.S.?

The paper begins by describing database details: data sources, how the researchers structure them, criteria for selecting and rejecting devices and locations to build their sample, and how they create the indexes. They also compare their database to more conventional sources, such as the U.S. Census, TSA records, and IRS filings.

The phone data come from a private firm that collects and analyzes digital device data. GPS info is gathered from pings emitted whenever a smartphone application requests location information. The firm locates those data on a U.S. map that the researchers divide into “venues”—buildings like businesses and homes, and outdoor locations like parks. When a phone pings near a venue, it constitutes a “visit.” Each visit becomes an entry in the database.
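The ping-to-visit step can be illustrated with a minimal sketch. It assumes each venue is a single latitude/longitude point and that "near" means within a fixed radius; the function names, the 50-meter radius, and the point-based venue model are all hypothetical and are not taken from the paper or the data provider.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def pings_to_visits(pings, venues, radius_m=50.0):
    """Turn raw pings into visits.

    pings:  iterable of (device_id, lat, lon, day)
    venues: dict of venue_id -> (lat, lon)  (hypothetical point venues)
    Returns a set of (device_id, venue_id, day) tuples, one per distinct visit.
    The radius is an illustrative assumption, not the paper's rule.
    """
    visits = set()
    for device, lat, lon, day in pings:
        # Find the closest venue, then keep the ping only if it is near enough.
        nearest = min(
            venues,
            key=lambda v: haversine_m(lat, lon, *venues[v]),
            default=None,
        )
        if nearest is not None and haversine_m(lat, lon, *venues[nearest]) <= radius_m:
            visits.add((device, nearest, day))
    return visits
```

A ping far from every venue simply produces no visit, which mirrors the idea that only pings near a venue enter the database.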

The researchers are meticulous in excluding extraneous or unreliable data, and in reporting limitations and selection criteria, all in an effort to maintain transparency for other researchers. One full appendix, for example, is devoted to describing the algorithm used to assign a smartphone owner’s “home.” (In brief, home locations are “where devices repeatedly spend time at night.”) They also go to great lengths to protect user privacy, for instance by excluding certain venues and by reporting data at larger geographic or time units so that phone owners remain anonymous.
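The idea behind the home assignment, "where devices repeatedly spend time at night," can be sketched roughly as follows. This is not the algorithm from the paper's appendix: the night window (10 p.m. to 6 a.m.), the three-night minimum, and the function name are hypothetical choices made only for illustration.

```python
from collections import defaultdict
from datetime import timedelta

NIGHT_START, NIGHT_END = 22, 6  # hypothetical night window, local hours
MIN_NIGHTS = 3                  # hypothetical threshold for "repeatedly"


def assign_homes(pings):
    """Guess a home location per device.

    pings: iterable of (device_id, location_id, timestamp) with datetime timestamps.
    Returns dict device_id -> location_id for devices seen at one location
    on at least MIN_NIGHTS distinct nights; other devices get no home.
    """
    nights = defaultdict(set)  # (device, location) -> distinct nights seen there
    for device, location, ts in pings:
        if ts.hour >= NIGHT_START:
            night = ts.date()
        elif ts.hour < NIGHT_END:
            # Early-morning pings belong to the night that began the day before.
            night = (ts - timedelta(days=1)).date()
        else:
            continue  # daytime ping, ignored for home assignment
        nights[(device, location)].add(night)

    best = {}  # device -> (location, number of nights)
    for (device, location), dates in nights.items():
        n = len(dates)
        if n >= MIN_NIGHTS and n > best.get(device, (None, 0))[1]:
            best[device] = (location, n)
    return {device: location for device, (location, _) in best.items()}
```

Counting distinct nights rather than raw pings keeps one ping-heavy evening from outweighing several quiet ones.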

Because phones are not people, there’s reasonable concern that their pings don’t accurately represent where their owners really are and with whom they share space. Moreover, not all Americans own a smartphone, and some demographic groups are more likely than others to have one. About 81 percent of American adults own a smartphone, according to the Pew Research Center, but while almost all 18- to 29-year-olds have one, only 53 percent of those over 65 do.

The scholars document that, despite these limitations, their database is broadly representative of the American population in terms of residential characteristics and movement patterns. Where we live and when we travel are, for the most part, well-captured by the collective pings of our phones. One example: There’s a high correlation between the share of devices that moved into a new state between 2017 and 2018 and the share of new residents who filed IRS tax returns from that state. “Overall,” they conclude, “the patterns documented suggest the potential of broadly representative smartphone data for use in economic research.”

Introducing the indexes

The researchers then create two indexes. The “location exposure index” (LEX) maps phone location over time: where a phone is, county by county, state by state. The “device exposure index” (DEX) tracks proximity to other phones: is a phone in the same commercial or public venue as another phone?

Both LEX and DEX are defined with the pandemic in mind. LEX describes the share of phones in a given location that pinged from elsewhere during the prior 14 days, the virus incubation period. In short, it’s the fraction of potentially infectious people who have moved between counties (or states). And DEX captures overlapping visits to venues on the same day. (Not the same hour, since the virus can remain viable in the air and on surfaces for a considerable period.)
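To make the two definitions concrete, here is a minimal sketch of how indexes of this kind could be computed from toy records. It follows the description above rather than the paper's exact formulas, which may differ. The record layouts, the function names `lex` and `dex`, and the choice to average DEX over all devices seen that day are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import timedelta


def lex(pings, day, window_days=14):
    """Location exposure sketch.

    pings: iterable of (device_id, county, date).
    Returns {destination: {origin: share}}: among devices that pinged in
    `destination` on `day`, the share that pinged in a different county
    `origin` at least once during the prior `window_days` days.
    """
    start = day - timedelta(days=window_days)
    today = defaultdict(set)    # county -> devices seen there on `day`
    history = defaultdict(set)  # device -> counties visited in the window
    for device, county, when in pings:
        if when == day:
            today[county].add(device)
        elif start <= when < day:
            history[device].add(county)

    shares = {}
    for dest, devices in today.items():
        counts = defaultdict(int)
        for device in devices:
            for origin in history.get(device, ()):
                if origin != dest:  # "elsewhere": skip the destination itself
                    counts[origin] += 1
        shares[dest] = {origin: n / len(devices) for origin, n in counts.items()}
    return shares


def dex(visits, day):
    """Device exposure sketch.

    visits: iterable of (device_id, venue_id, date).
    Returns the average, over devices with a visit on `day`, of the number
    of distinct other devices that visited any of the same venues that day.
    """
    by_venue = defaultdict(set)   # venue -> devices that visited it on `day`
    by_device = defaultdict(set)  # device -> venues it visited on `day`
    for device, venue, when in visits:
        if when == day:
            by_venue[venue].add(device)
            by_device[device].add(venue)
    if not by_device:
        return 0.0
    total = 0
    for device, venues in by_device.items():
        others = set().union(*(by_venue[v] for v in venues)) - {device}
        total += len(others)
    return total / len(by_device)
```

Matching visits on the calendar day, not the hour, reflects the point that two phones need not be in a venue at the same moment to count as exposed to each other.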

They then describe how these indexes evolved during the first months of the pandemic. It’s a fascinating picture: the evolution of social response to ongoing biological threat.

For instance, on four national maps, dated at the end of February, March, April, and May, the scholars plot the fraction of phones that had pinged during the previous 14 days in Manhattan, an early COVID-19 epicenter. The February 29, 2020, map documents substantial nationwide exposure to incoming New York County visitors. By the end of April, the map reveals a dramatic decline in travel from Manhattan. A month later, travel from the city had increased slightly. Analogous maps of exposure to Houston travelers show a similar decline in travel to the East Coast, but little-to-no decline from Houston to elsewhere in the South or Southwest.

A chart of state-level LEX values in the first half of the year, sorted by the distance between states, shows sharp overall drops during March in interstate travel, but particularly in long-distance travel (presumably by airplane). Travel between states more than 1,500 miles apart, and especially to Alaska or Hawaii, plummeted in early March and had barely begun to recover by May 30.

Social contact, or lack thereof

The DEX maps tell a similar story. By late March, overlapping visits in U.S. counties declined across the nation to just one-third the levels seen in early February. By late April, visits had increased somewhat across the country, but even through late May, they remained lower than in early February, particularly in New York City, California, and Washington. A few spots, notably vacation destinations like North Carolina’s Outer Banks and Panama City in Florida, saw strong upswings in device exposure by late May.

The paper also analyzes trends in phone exposure by average educational attainment, and by race of residents, in different U.S. Census neighborhoods. Prior to the pandemic’s onset, phones from areas with more highly educated residents were more exposed to other phones than the U.S. average, but while exposure for all groups fell during March, it fell proportionately more for residents from neighborhoods with more college grads.

The economists examine phone exposure by race and ethnicity as well. Before the pandemic, phones residing in census blocks with more Black, Hispanic, and White residents had similar levels of exposure to one another, while phones from blocks with more Asian residents had higher DEX levels. Exposure dropped for all groups during March, converging at low levels. This limited variation after mid-March “may imply a limited role for heterogeneous exposure rates” in explaining demographic differences in infection and death rates, note the authors.

A living database

Ultimately, the paper serves as an introduction to a powerful living database that reveals how we’re responding to the threat of contagious disease and death, whether by limiting our travel and visits with others or returning to life as we once knew it. By building and sharing this database, maintained with daily updates, the five scholars have provided a valuable tool for others to adapt for their own research and policy aims.

Douglas Clement
Managing Editor

Douglas Clement is a managing editor at the Minneapolis Fed, where he writes about research conducted by economists and other scholars associated with the Minneapolis Fed and interviews prominent economists.