Science Inventory

Pandemic Data Science Hack: Lessons to learn from an eco-epidemiological model of global COVID-19 factors analysis

Citation:

Goldsmith, M. AND R. Tornero-Velez. Pandemic Data Science Hack: Lessons to learn from an eco-epidemiological model of global COVID-19 factors analysis. International Society for Exposure Science (ISES) 2021 Virtual Meeting, Virtual, NC, August 30 - September 03, 2021.

Impact/Purpose:

This abstract describes a talk developed for an invited International Society for Exposure Science 2021 conference session titled “ISES Symposium: Harnessing the power of artificial intelligence and non-traditional modeling and data to advance exposure science.”

Description:

Many epidemiological models are developed for extrapolative and predictive purposes: to describe the geographical breadth and temporal change in the spread of disease, and ultimately to anticipate locations of disease burden and inform public health mitigation strategies. Traditional ecological epidemiological models inform the “where” and “when” of disease burden, but often focus less on factor analysis that informs “how” a disease may propagate. The objective of this modeling exercise was to use disparate data sources within a machine learning context (Genetic Algorithm-Multiple Linear Regression, or GA-MLR) to model COVID-19 disease prevalence. Specifically, we used global COVID-19 case counts by country at an earlier and a later stage of the epidemic (May 2, 2020 and May 4, 2021), in addition to data from disparate sources and types (genetic, geographical, sociological, cultural, lifestyle, and dietary). Using this strategy, we built 15 early-pandemic and 15 late-pandemic ecological epidemiological multivariable regression models to predict the number of COVID-19 cases, and carefully evaluated the sign and magnitude of the variables to better understand and elucidate disease associations and factors. A key finding was that, in both the early and the more recent part of the pandemic, several factors were positively correlated with the endpoint (median BMI, median age, median population density, and per capita daily “spice” intake), whereas others were negatively correlated (minimum interpersonal distance, mean cigarette consumption, R1B haplotype prevalence). Additionally, all factors other than median BMI appear to contribute more in the later than in the earlier stage of the pandemic and, surprisingly, cigarette consumption also became a key contributor.
Another key finding is that at both time points the USA was an outlier, with far more actual cases than predicted using these approaches; India became the second-largest outlier in the later stage of the pandemic, with significantly more cases than would have been predicted. These outliers are suggestive of societies unable to socially distance, one due to personal and political choice and the other due to overcrowding and overpopulation. The lessons learned from this eco-epi model suggest that, even in the absence of medical intervention (i.e., vaccines, antibody therapies, or small-molecule therapies), there are still rational viral spread mitigation strategies that this model can inform, which could extend to the “individual” level. Although some relationships appear to be counterintuitive, obscure, confounding, and non-actionable (R1B haplotype, cigarette smoking, median age), the correlations between number of cases and mean BMI (positive), population density (positive), and mean interpersonal distance (negative) suggest that reducing co-morbidities, managing weight, avoiding crowds, and staying socially distant have a large and controllable association with the global progression of COVID-19. The views expressed in this presentation are those of the authors and do not necessarily represent the views or policies of the U.S. Environmental Protection Agency.
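
For readers unfamiliar with GA-MLR, the general idea can be sketched as follows: a genetic algorithm searches over subsets of candidate variables, and each subset is scored by fitting an ordinary multiple linear regression. This is a minimal illustration only, not the authors' implementation. The data are synthetic, every function name is hypothetical, and the bitmask encoding, the adjusted R^2 fitness, and the GA settings are all assumptions chosen for brevity.

```python
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_mlr(X, y, cols):
    """Least-squares fit on the chosen columns plus an intercept; returns (beta, R^2)."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - p) ** 2 for yi, p in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

def fitness(X, y, mask):
    """Adjusted R^2 of the model using the variables switched on in mask (assumed fitness)."""
    cols = [i for i, bit in enumerate(mask) if bit]
    n = len(y)
    if not cols or n - len(cols) - 1 <= 0:
        return float("-inf")
    try:
        _, r2 = fit_mlr(X, y, cols)
    except ValueError:
        return float("-inf")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - len(cols) - 1)

def ga_mlr(X, y, pop_size=30, generations=40, mutation_rate=0.1, seed=0):
    """Evolve variable subsets: keep the fitter half, refill by crossover and mutation."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(X, y, m), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_feat)  # single-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda m: fitness(X, y, m))
    cols = [i for i, bit in enumerate(best) if bit]
    beta, r2 = fit_mlr(X, y, cols)
    return cols, beta, r2

def make_synthetic(n=60, n_feat=6, seed=1):
    """Synthetic stand-in data: only columns 0 (positive) and 3 (negative) drive y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_feat)] for _ in range(n)]
    y = [2.0 * r[0] - 1.5 * r[3] + rng.gauss(0, 0.1) for r in X]
    return X, y

X, y = make_synthetic()
cols, beta, r2 = ga_mlr(X, y)
print("selected columns:", cols)
print("coefficients (intercept first):", [round(b, 2) for b in beta])
print("R^2:", round(r2, 3))
```

In a sketch like this, the sign and relative size of each fitted coefficient is what one would inspect to judge whether a selected variable is positively or negatively associated with the endpoint, which mirrors the kind of evaluation the abstract describes. Comparing magnitudes across variables would additionally require putting the variables on a common scale.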

Record Details:

Record Type: DOCUMENT (PRESENTATION/ABSTRACT)
Product Published Date: 09/03/2021
Record Last Revised: 02/14/2022
OMB Category: Other
Record ID: 354113