Through a new $2 million National Science Foundation grant, scientists at the Cary Institute of Ecosystem Studies, the University of Georgia and North Carolina A&T State University are harnessing the power of machine learning to forecast outbreaks of zoonotic disease.
Each year, more than 1 billion people become sick from Ebola, Zika, SARS and other pathogens acquired from wildlife, livestock and other animals, the announcement said. Prevention relies on the ability to predict when and where pathogens are likely to make the leap from animals to people.
Barbara Han, a disease ecologist at the Cary Institute, is leading the five-year study. She explained, “We want to help shift society from a reactive to a proactive approach to managing zoonotic disease. Instead of responding to outbreaks, let’s try to stop them from happening in the first place. Using big data as a potential surveillance tool is an exciting new step toward prevention.”
Funding will enable the team to bring together information on pathogens, potential animal hosts and environmental factors known to facilitate disease transmission, with the goal of developing innovative methods for mapping when and where the next major zoonotic disease outbreak might occur.
“We are creating models which draw ‘boundaries’ around which species can host which pathogens, which pathogens can pass from animals to humans and what combination of environmental factors facilitate spillover and human-to-human transmission,” John Drake of the University of Georgia said. “On the basis of these biological properties, we hope to pinpoint where new diseases will emerge in the future.”
Phase one of the study involves building predictive statistical models that will help the researchers identify traits common among animals that carry disease and the pathogens and parasites that cross the species barrier.
“We are looking at data that describe hosts, pathogens and their environments to determine which combinations of these features presage disease being realized on a global landscape,” Han said.
Models are built using extensive data sets on the physical and life history traits of host species and known pathogens. Host/pathogen pairings are then linked to the geographical locations with suitable environmental conditions. The conditions surrounding documented outbreaks are also examined to determine which factors were at play when each disease broke out.
Suzanne O’Regan of North Carolina A&T State University explained, “By using data that are global in scale, we are seeking to reveal generalizable features of ‘good’ disease carriers. Over 50 life history features are being incorporated into models for most mammal groups.” These include data on animals’ physical characteristics, metabolic and reproductive rates, range of diet and timing of daily activity (whether the animal is active primarily during the day, at night or at dawn and dusk).
On the pathogen side, the team is interested in whether a pathogen can survive in a given host and environment, the mechanism by which it is transmitted between hosts, and whether it exhibits sustained transmission between people, as opposed to a single “dead-end” transmission from animal host to human.
Environmental features include temperature, precipitation, seasonality and biome. The study will also encompass country-specific socioecological factors such as gross domestic product, public health infrastructure and investment in research and healthcare, all of which have important implications for how effectively a country can manage disease prevalence and respond to an outbreak.
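To make the idea of a trait-based model concrete, here is a minimal sketch of how host traits might feed a classifier that scores species as likely or unlikely disease carriers. The announcement does not say which algorithm the team uses; plain logistic regression is swapped in here as a simple stand-in. The trait columns, the numbers and the labels are all made up for illustration and are not measurements from the study.

```python
import math

# Hypothetical, made-up trait table for illustration only (not real data).
# Columns: log10 body mass in grams, litters per year, nocturnal (1) or not (0).
TRAITS = [
    (1.3, 5.0, 1), (1.5, 4.0, 1), (1.8, 3.5, 1), (2.0, 3.0, 0),
    (3.5, 1.0, 0), (4.0, 1.0, 0), (4.5, 0.5, 1), (5.0, 0.5, 0),
]
# Hypothetical labels: 1 = treated as a known carrier, 0 = not.
IS_HOST = [1, 1, 1, 1, 0, 0, 0, 0]


def standardize(rows):
    """Rescale each column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c))
           for c, m in zip(cols, means)]
    scaled = [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]
    return scaled, means, sds


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.5, steps=2000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(steps):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            grad_w = [g + err * v for g, v in zip(grad_w, x)]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def host_probability(traits, weights, bias, means, sds):
    """Score an unseen species using the training means and deviations."""
    x = [(v - m) / s for v, m, s in zip(traits, means, sds)]
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


scaled, means, sds = standardize(TRAITS)
weights, bias = fit_logistic(scaled, IS_HOST)

# Two hypothetical unseen species.
small_prolific = host_probability((1.4, 4.5, 1), weights, bias, means, sds)
large_slow = host_probability((4.5, 0.75, 0), weights, bias, means, sds)
print(f"small, prolific species: {small_prolific:.2f}")
print(f"large, slow-breeding species: {large_slow:.2f}")
```

A real analysis would use many more traits and species, hold out data to check predictions, and likely rely on a more flexible model; the sketch only shows the shape of the workflow, from trait table to a per-species score.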
Phase two of the study will investigate how diseases move dynamically within a system. Once the traits of hosts, pathogens and their environments, and the relationships among them, are known, the team will incorporate these into mathematical models to reveal how disease dynamics might play out in animal populations over time. This approach accounts for traits such as life span and reproduction rate, which directly affect how fast a pathogen can spread via a particular host.
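One way to picture how host life span enters such a model is the classic susceptible-infected-recovered (SIR) framework with births and deaths. The sketch below is a textbook toy, not the team's model: it assumes a constant-size host population in which births balance deaths at rate mu (one over average life span), and every parameter value is a hypothetical chosen for illustration. In this setup the basic reproduction number is beta / (gamma + mu), so host turnover changes how readily an infection spreads.

```python
def basic_reproduction_number(beta, gamma, mu):
    """R0 for an SIR model with births and deaths at equal per-capita rate mu."""
    return beta / (gamma + mu)


def simulate_sir(beta, gamma, mu, days=3650, dt=0.1, infected0=0.001):
    """Integrate the model with forward Euler steps; states are population fractions."""
    s, i, r = 1.0 - infected0, infected0, 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i
        ds = mu - new_infections - mu * s        # births enter the susceptible class
        di = new_infections - (gamma + mu) * i   # infected hosts recover or die
        dr = gamma * i - mu * r
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        peak = max(peak, i)
    return s, i, r, peak


# Hypothetical parameters, per day: a slow, long-lasting infection.
BETA, GAMMA = 0.02, 0.01
MU_SHORT_LIVED = 1 / 180    # hosts living about half a year on average
MU_LONG_LIVED = 1 / 3650    # hosts living about ten years on average

r0_short = basic_reproduction_number(BETA, GAMMA, MU_SHORT_LIVED)
r0_long = basic_reproduction_number(BETA, GAMMA, MU_LONG_LIVED)
s, i, r, peak_short = simulate_sir(BETA, GAMMA, MU_SHORT_LIVED)
_, _, _, peak_long = simulate_sir(BETA, GAMMA, MU_LONG_LIVED)

print(f"R0, short-lived host: {r0_short:.2f}; peak infected fraction: {peak_short:.3f}")
print(f"R0, long-lived host:  {r0_long:.2f}; peak infected fraction: {peak_long:.3f}")
```

The two runs differ only in host life span, which is the point of the exercise: a trait measured for a species becomes a parameter in the dynamical model.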
Han explained, “The novelty of this work is in bringing biological realism via machine learning into a classic body of theory, leveraging large sets of biological data available to us. These tools merge data mining and machine learning with established methods of studying disease dynamics to help us think carefully about what’s distinguishing animal groups from each other in terms of zoonotic disease and, eventually, for risk of human spillover and epidemics.”
The team also plans to use the models and techniques developed in this project to respond to zoonotic disease outbreaks that might occur during the course of the study.