2009 Progress Report: Research Project B: Healthy Pregnancy, Healthy Baby: Studying Racial Disparities in Birth Outcomes

EPA Grant Number: R833293C002
Subproject: this is subproject number 002, established and managed by the Center Director under grant R833293
(EPA does not fund or establish subprojects; EPA awards and manages the overall grant for this center).

Center: Southern Center on Environmentally Driven Disparities in Birth Outcomes
Center Director: Miranda, Marie Lynn
Title: Research Project B: Healthy Pregnancy, Healthy Baby: Studying Racial Disparities in Birth Outcomes
Investigators: Williams, Redford; Ashley-Koch, Allison; Auten, Richard; Maxson, Pamela; Miranda, Marie Lynn; Reiter, Jerome; Swamy, Geeta
Current Investigators: Williams, Redford; Ashley-Koch, Allison; Gibson-Davis, Christina; Maxson, Pamela; Miranda, Marie Lynn; Reiter, Jerome; Swamy, Geeta
Institution: Duke University
EPA Project Officer: Callan, Richard
Project Period: May 1, 2007 through April 30, 2012 (Extended to April 30, 2014)
Project Period Covered by this Report: May 1, 2009 through April 30, 2010
RFA: Centers for Children’s Environmental Health and Disease Prevention Research (2005)
Research Category: Children's Health, Health Effects, Health


The central objective of the Healthy Pregnancy, Healthy Baby Study is to determine how the interaction of environmental, social, and host factors contributes to disparities in birth outcomes between African-American and white women in the American South. There are four specific aims:

  1. Conduct a cohort study of pregnant women in Durham, NC designed to correlate birth weight, gestation, and birth weight x gestation with environmental, social, and host factors;
  2. Develop community-level measures of environmental and social factors by inventorying neighborhood quality and the built environment in partnership with local community groups;
  3. Create a comprehensive data architecture, spatially resolved at the tax parcel level, of environmental, social, and host factors affecting pregnant women by linking data from the cohort study and neighborhood assessments with additional environmental and socioeconomic data; and
  4. Determine whether and to what extent differential exposures explain health disparities in birth outcomes by applying innovative spatial and genetic statistical methods to:
    1. Identify environmental, social, and host factors that cluster to predict birth outcomes in the entire sample,
    2. Determine whether these clusters are more or less present in African-American versus white populations and quantify the proportion of health disparities explained by differences in cluster frequency, and
    3. Identify environmental, social, and host factors that cluster to predict birth outcomes within the African-American and white sub-samples and compare these clusters across racial groups.

Progress Summary:

As of 4/30/10, 1,738 women have been enrolled in the study. Women are recruited from Duke University Medical Center (DUMC) and the Durham County Health Department’s prenatal clinic at Lincoln Community Health Center. Demographic data indicate that we are successfully recruiting women who are most at risk for adverse pregnancy outcomes, particularly non-Hispanic black women and women with low income or low educational attainment.

The following information is collected from participants in the Healthy Pregnancy, Healthy Baby Study:

  • Psychosocial measures include: CES-D, perceived stress, self-efficacy, interpersonal support, paternal support, perceived racism, perceived community standing, pregnancy intention, John Henryism Active Coping Scale, NEO Five Factor Inventory of personality.
  • Environmental exposure survey measures include: short survey on fish consumption, smoking pattern and exposure to second-hand smoke, and drinking water source.
  • Maternal and neonatal medical record abstraction includes: detailed pre-pregnancy medical and social history, antepartum complications, birth outcomes, and neonatal complications.
  • Blood samples for genetic and environmental analysis to assess candidate genes related to environmental contaminant (nicotine, cotinine, cadmium, lead, mercury, arsenic, and manganese) metabolism, inflammation, vascular dysfunction, and stress response.
  • Cord blood and placental samples are currently being stored for future genetic analysis and evaluation of activity at the maternal-fetal interface.
We have been highly successful in collecting participant-level data as well as biological samples, with greater than 90% attainment of maternal blood samples for genetic and environmental analyses. Collection of cord blood and placental samples, which began in June 2007, has also been successful, with approximately 763 delivery samples collected.

All maternal data are georeferenced (i.e., linked to the physical address of the mother) using Geographic Information System (GIS) software. The Healthy Pregnancy, Healthy Baby Study also draws on an in-depth neighborhood assessment designed to capture the built environment, community-level social stressors, and community resources. The cohort study and neighborhood assessment data are spatially linked to extensive environmental and demographic data at a highly resolved spatial scale.

To date, we have generated genotypes on 1,243 blood samples from pregnant women for 405 Single Nucleotide Polymorphisms (SNPs) in 51 genes. Candidate genes include those involving human environmental contaminant clearance (heavy metals and environmental tobacco smoke), infection and inflammation (cytokines, chemokines, and bacterial pathogen recognition), maternal stress response (serotonin), and other pathways that have been implicated as potential drivers of health disparities (vascular responsivity). At this point in the study, we have genotyped nearly every candidate gene we proposed in the application.

In addition to our candidate gene analyses, this past year we gave considerable thought to the issue of population stratification. Our baseline approach has always been to analyze the non-Hispanic black (NHB) and non-Hispanic white (NHW) women separately. The bulk of our sample is composed of NHB women, and genetic make-up is expected to vary even within that group. To address this, we have run the Illumina African American Admixture Chip on samples from 824 NHB women. This admixture chip contains 1,509 SNPs that were specifically selected because their frequencies differ markedly between the Yoruban (African) and Caucasian HapMap samples, which represent the two primary ancestral populations of NHB women. These data are currently being used in two ways. The first approach is to cluster the NHB women into sub-populations. Sub-population membership will be used as a covariate in our future candidate gene studies as a means to further protect against population stratification. The second approach is to identify regions of the genome in which one of the ancestral populations is over-represented and which are also associated with our fetal and maternal outcomes. This admixture mapping approach has been used successfully to map genes for obesity, renal disease, and multiple sclerosis, among others. We view admixture mapping as a hypothesis-generating exercise that can point us to regions of the genome we might not otherwise have examined. In this regard, it complements the candidate gene approach we have been taking, in which we form hypotheses about specific candidate genes a priori, given what we know about the biology of our fetal and maternal outcomes. It is important to note that we have been judicious with our budget and resources in order to afford the Admixture Chips. The chip was not part of the original proposal, but we felt that it would add significantly to the quality of our analyses and the flexibility of the hypotheses examined.

In the coming year, we expect to continue our genotyping efforts. We already are scheduled to run approximately 180 more samples for the Admixture Chip. Our candidate gene genotyping will include both generation of genotypes in recently collected samples for the existing candidate genes, and also prioritizing new candidate genes involved in contaminant metabolism and stress-response.

Statistical analysis regarding candidate gene polymorphisms began in June 2008 and is ongoing. Preliminary genetic analyses are described below.

In our progress report last year, we detailed results from our analyses of the vitamin D receptor (VDR) gene and infant birth weight. We are pleased to report that this work has been accepted for publication in the American Journal of Medical Genetics.

We also have examined polymorphisms in the G-protein coupled receptor kinase 5 (GRK5) gene. GRK5 is associated with a pharmacogenomic interaction among African Americans in the setting of cardiovascular disease and response to β-adrenergic receptor (βAR) blockade, which is standard therapy for cardiac failure and ischemia. Because of the association with cardiovascular disease, we hypothesized that GRK5 genetic variation was associated with hypertensive disorders in pregnancy. We defined hypertensive disorders as chronic hypertension (CHTN = BP > 140/90 before 20 wks), preeclampsia (BP > 140/90 and proteinuria), and CHTN + superimposed preeclampsia (CHTN with new onset or worsening proteinuria). Haplotype-tagging SNPs were genotyped for GRK5 via TaqMan assays. Logistic regression was used to examine the relationship between maternal genotype and each hypertensive disorder among the NHB women, adjusting for age, education, insurance, tobacco use, and pre-pregnancy BMI. CHTN was included as a covariate in the model for preeclampsia.

In our NHB data set, 125 out of 587 participants (21%) were diagnosed with preeclampsia. Of the 17 SNPs examined, 3 were nominally associated with preeclampsia. For the most significant association, with rs10886445 (global p = 0.0009), the odds of preeclampsia for NHB women with the CC genotype were 0.28 times those for NHB women with the TT genotype (CI: 0.1429, 0.552). For NHB women with the CT genotype, the odds of developing preeclampsia were 0.33 times those for NHB women with the TT genotype (CI: 0.1682, 0.656). In addition, rs12416565 (global p = 0.003) and rs11198925 (global p = 0.02) were nominally associated. For CHTN, only one marker (rs2420620, global p = 0.02) demonstrated nominal association. Similarly, for CHTN + superimposed preeclampsia, only one marker (rs10510055, global p = 0.02) demonstrated nominal association.

Based on these results, we concluded that the GRK5 gene may play a role in hypertensive disorders of pregnancy, particularly the development of preeclampsia. Future analyses will examine the effects of GRK5 on blood pressure regulation (see our work below on defining blood pressure trajectories in our data set) and potential pharmacogenomic interactions during pregnancy. These data were presented earlier this year at the Society for Maternal and Fetal Medicine and are currently being written up as a manuscript to be submitted in the next few months.
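
The odds ratios and intervals reported above are tied together by the logistic regression coefficient: the odds ratio is exp(beta), and a Wald interval is exp(beta ± z·SE). The sketch below back-calculates the implied coefficient and standard error for rs10886445. It assumes the reported intervals are symmetric 95% Wald intervals on the log-odds scale, which the report does not state; the function name is illustrative only.

```python
import math

def log_odds_from_ci(lower, upper, z=1.96):
    """Recover a log-odds coefficient and its standard error from a Wald CI.

    Assumes the interval was computed as exp(beta - z*se), exp(beta + z*se).
    """
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Intervals reported for rs10886445, each relative to the TT genotype.
reported = {"CC": (0.1429, 0.552), "CT": (0.1682, 0.656)}

for genotype, (lower, upper) in reported.items():
    beta, se = log_odds_from_ci(lower, upper)
    print(f"{genotype}: OR = {math.exp(beta):.2f}, beta = {beta:.3f}, SE = {se:.3f}")
```

Under that assumption, the geometric mean of each interval's bounds reproduces the reported odds ratios of 0.28 and 0.33.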

We also have begun to examine gene by environment interactions in our data set. We have findings of several gene*environment interactions (G*E) involving exposure to environmental tobacco smoke (ETS) as measured by cadmium and cotinine in the blood of our mothers. In addition, we have begun to focus more acutely on air pollution in these gene*environment interactions using proximity to roadways and road density data.

In our NHB women, we have examined G*E between genes in the inflammatory pathway and ETS as they relate to infant birth weight and identified several nominal associations, the most significant being rs2069771 in the interleukin-2 gene with cadmium exposure (global p = 0.005) and rs9005 in the interleukin-1 receptor antagonist gene with cadmium exposure (global p = 0.006). Also among the NHB women, we have identified G*E interactions between the N-acetyltransferase genes and cadmium exposure that predict maternal preeclampsia and infant outcomes. In particular, rs8190845 in NAT1 interacted with cadmium exposure to predict occurrence of preeclampsia in the mother (global p = 0.009). Additionally, rs17126345, also in NAT1, interacted with cotinine exposure to predict the occurrence of preterm birth, defined as delivery prior to 37 weeks gestation (global p = 0.006). The analyses of G*E with the inflammatory genes and G*E with the N-acetyltransferase genes are both currently being written up for publication. These results will also be submitted in abstract form to the annual meeting of the American Society of Human Genetics.

Statistical Methods Development. The project team continued to develop new ways of handling missing data in large epidemiological studies in which interaction effects are suspected. The main approach is to adapt regression trees to perform multiple imputation. This approach is being used to handle the missing data in the prospective study of Project R833293002. This methodology has the potential to be utilized in a wide range of settings, including outside of epidemiological contexts. An article describing this work has been accepted for publication in the American Journal of Epidemiology.
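
The idea of using regression trees for multiple imputation can be illustrated with a deliberately small sketch. It is not the published algorithm: it uses a single predictor, a single split, and plain random sampling of donors, whereas the approach described above applies trees across many variables. All function names and the toy data are hypothetical.

```python
import random

def best_split(xs, ys):
    """Return the threshold on x that minimizes within-leaf squared error."""
    def sse(vals):
        if not vals:
            return 0.0
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    candidates = sorted(set(xs))
    best_thr, best_cost = None, float("inf")
    for lo, hi in zip(candidates, candidates[1:]):
        thr = (lo + hi) / 2.0
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_thr, best_cost = thr, cost
    return best_thr

def impute_once(rows, rng):
    """Fill missing outcomes by sampling an observed donor from the same leaf."""
    observed = [(x, y) for x, y in rows if y is not None]
    xs, ys = zip(*observed)
    thr = best_split(xs, ys)  # assumes at least two distinct observed x values
    leaves = {
        True: [y for x, y in observed if x <= thr],
        False: [y for x, y in observed if x > thr],
    }
    return [(x, y if y is not None else rng.choice(leaves[x <= thr]))
            for x, y in rows]

def multiply_impute(rows, m=5, seed=0):
    """Create m completed copies of the data, each with its own random draws."""
    rng = random.Random(seed)
    return [impute_once(rows, rng) for _ in range(m)]

# Toy data: (predictor, outcome); None marks a missing outcome.
rows = [(1, 10.0), (2, 11.0), (3, 9.5), (8, 20.0), (9, 21.5), (10, 19.0),
        (2, None), (9, None)]
completed = multiply_impute(rows, m=3)
```

Because imputed values are drawn from observed outcomes in the matching leaf, the imputations respect any interaction or nonlinearity the tree has picked up, which is the property that makes trees attractive when interaction effects are suspected.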

The team examined approaches to performing Bayesian analysis after multiple imputation is used for missing data. This work is motivated by the use of the tree methodology for multiple imputation, because we are estimating Bayesian models with the completed datasets (see the paragraph on Bayesian quantile regression with latent factors below). An article describing this research was published in The American Statistician.

The team developed methods for exploring sets of important predictors in large epidemiological studies when quantile regression will be used for the outcome variable. These methods adapt penalties from ordinary least squares lasso regression and elastic net regression so that they enable quantile regression. The team is using this methodology to explore the most important predictors of adverse birth outcomes in the prospective study of Project R833293002. A manuscript describing this work is in preparation and will be submitted in fall 2010.
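
As a rough illustration of what adapting lasso and elastic net penalties to quantile regression means, the sketch below writes out a generic penalized objective: the quantile (check) loss replaces squared error, and the usual penalty terms are added. The report does not give the team's exact formulation, so the parameterization here (`lam`, `alpha`, an unpenalized intercept) is an assumption.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss for one residual at quantile level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))

def penalized_quantile_objective(beta, X, y, tau, lam, alpha=1.0):
    """Average check loss plus an elastic-net penalty on the slopes.

    beta[0] is the intercept and is not penalized. alpha=1 gives the lasso
    penalty; alpha=0 gives a pure ridge penalty.
    """
    loss = 0.0
    for row, target in zip(X, y):
        fitted = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        loss += check_loss(target - fitted, tau)
    loss /= len(y)
    l1 = sum(abs(b) for b in beta[1:])
    l2 = sum(b * b for b in beta[1:])
    return loss + lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)

# At tau = 0.1, under-prediction by 2 units costs far less than over-prediction.
print(check_loss(2.0, 0.1), check_loss(-2.0, 0.1))
```

Minimizing this objective over `beta` for a low value of `tau` targets the lower tail of the outcome distribution, which is where adverse outcomes such as low birth weight sit.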

The team developed an approach for performing Bayesian quantile regression with latent factors. The motivation for this development is as follows. Many of the predictors of adverse birth outcomes do not strongly predict adverse birth outcomes, likely because of the modest sample size for the strength of associations seen in the data. However, many of the predictors can be conceptualized as indicators of underlying factors that could be strong predictors; for example, several of the psychosocial variables can be grouped as a factor indicating the amount of social support available to the mother. We developed and are applying methodology for estimating the effects of these factors on birth outcomes using the prospective study of Project R833293002. A manuscript describing this work is in preparation and will be submitted in fall 2010.

Finally, the team developed an approach for assessing sensitivity to unmeasured confounding when using principal stratification. This work was motivated by the presence of several intermediate variables in the prospective study of Project R833293002, e.g., hypertension as an intermediate variable for gestation age. At this point, this work is at a theoretical stage; we have not yet applied it on Project R833293002 data. A manuscript on the theory has been submitted to a peer-reviewed journal.

We wish to examine whether and to what extent environmental exposures are associated differentially with medically indicated and not-medically indicated (spontaneous) preterm births. However, since information on the clinical subtypes of preterm birth is not available from the North Carolina Detailed Birth Record database, we will first develop a prediction model for spontaneous preterm births using data from the prospective study of Research Project R833293002. By treating the indicator of whether a preterm birth was spontaneous as missing, we plan to generate multiple completed datasets for geocoded preterm births in Research Project R833293001. Statistical analyses will then be conducted under a multiple imputation framework. Moreover, because the recruited participants are not representative of the state-wide birth cohort due to different spatial and temporal domains, we will also explore inference for multiply imputed datasets when the records used for imputation are not used for analysis.
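
For readers unfamiliar with analysis under a multiple imputation framework, the sketch below shows the standard combining rules (Rubin's rules) for pooling an estimate across completed datasets. This is the textbook procedure, not necessarily what the team will use: as noted above, the setting where imputation records are excluded from the analysis may call for different rules. The function name and the numbers are illustrative.

```python
import math

def pool_rubin(estimates, variances):
    """Pool point estimates and variances from m completed datasets.

    Returns the pooled estimate and its total variance, where the total is
    the within-imputation variance plus (1 + 1/m) times the
    between-imputation variance.
    """
    m = len(estimates)
    if m < 2:
        raise ValueError("need at least two completed datasets")
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, total

# Hypothetical log-odds estimates and squared standard errors from 3 datasets.
estimate, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(estimate, math.sqrt(total_var))
```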

Psychosocial Indicators. Analyses have been completed on psychosocial influences on birth outcomes. The relationships among pregnancy intention, psychosocial health, and pregnancy outcomes have been examined, with a draft paper ready to submit early in year 4. This work was presented at the American Public Health Association annual meeting. In addition, we are examining pregnancy intention, behavioral choice, and environmental exposures. This work will be presented at the Pediatric Academic Society meeting in May 2010 (year 4). The influences of psychosocial health and smoking status have been studied, and a draft paper will be ready to submit early in year 4. In order to reduce the number of psychosocial variables, cluster analysis has been performed, resulting in three distinct clusters of women. These clusters are being examined in relation to other domains, such as genetics, personality, and pregnancy outcomes. This work will be presented at the Society for Epidemiologic Research in June 2010 (year 4).

Maternal Medical Complications. Fetal health is not only individually determined, but is also influenced by maternal health and well-being. This past year, we have begun to examine maternal outcomes as well, focusing in particular on maternal hypertensive conditions. As a first step, we are trying to identify factors that affect maternal blood pressure during pregnancy. In order to make use of the entirety of blood pressure readings collected across the pregnancy, we are considering a variety of statistical approaches. To this end, we developed a Bayesian finite mixture model to jointly examine the associations between longitudinal blood pressure trajectories, preterm birth (PTB), and low birth weight (LBW). The model partitions women into distinct groups characterized by an average mean arterial pressure (MAP) trajectory, a probability of PTB, and a probability of LBW. Our approach also introduces a correlated probit model within each cluster to capture residual correlation between PTB and LBW. We recently completed the data analysis and plan to submit the results to a statistical journal during Year 4. Our ultimate goal is to use environmental, social, and genetic data (such as GRK5 polymorphisms) to predict these blood pressure trajectories. We hope these predicted trajectories will aid us in predicting birth outcomes; for example, women with monotonically increasing blood pressure trajectories may exhibit poorer birth outcomes than women with U-shaped curves.

Environmental Sampling. Using the maternal environmental blood samples collected on all participants in Project R833293002, we have been working to characterize maternal exposures to toxics. In addition to documenting the blood lead burdens among a cohort of pregnant women in Durham County, NC, we have been able to characterize current maternal exposures to lead by linking each participant to the tax parcel at which she resided during her pregnancy. We found that neither year built nor modeled lead exposure risk at the participant’s residence during pregnancy was predictive of maternal blood lead levels. Taken in combination with results showing that maternal blood lead levels increased with age and parity, these findings indicate that maternal blood lead levels are much more likely the result of lead remobilization from historic exposures than of contemporaneous exposures. A manuscript on this work has been published in the International Journal of Environmental Research and Public Health.

Residential Mobility. With our access to the North Carolina Detailed Birth Record (DBR) in Project R833293001, we have been able to link participants in Project R833293002 with their birth certificate data. Using maternal and infant identifying information, including name, place, and date of birth, we have been able to link 991 (99.9%) of participants who completed the study and had a live birth by December 31, 2008 and 59 (76.6%) of participants who were lost-to-follow-up but with an expected delivery date on or before December 31, 2008. This linkage will allow us to determine who is moving during pregnancy (by comparing the address at enrollment and the DBR address at delivery) and the nature of those moves, including the quality of the new location compared to the previous location (and thus changes in environment or exposure).
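
A linkage of this kind, followed by an address comparison to flag movers, might be sketched as below. This is a simplified deterministic match on a normalized name and date-of-birth key; the report does not describe the actual matching procedure, and the field names (`last`, `first`, `dob`, `address`, `id`) and records are hypothetical.

```python
import unicodedata

def link_key(last_name, first_name, birth_date):
    """Build a normalized matching key from names and an ISO date string."""
    def norm(text):
        decomposed = unicodedata.normalize("NFKD", text)
        return "".join(ch for ch in decomposed if ch.isalpha()).upper()
    return (norm(last_name), norm(first_name), birth_date.strip())

def link_records(participants, birth_records):
    """Link each participant to exactly one birth record and flag movers."""
    index = {}
    for rec in birth_records:
        key = link_key(rec["last"], rec["first"], rec["dob"])
        index.setdefault(key, []).append(rec)

    linked, unlinked = [], []
    for person in participants:
        key = link_key(person["last"], person["first"], person["dob"])
        matches = index.get(key, [])
        if len(matches) == 1:  # ambiguous or missing matches stay unlinked
            enrolled = person["address"].strip().upper()
            delivered = matches[0]["address"].strip().upper()
            linked.append({"id": person["id"], "moved": enrolled != delivered})
        else:
            unlinked.append(person["id"])
    return linked, unlinked

# Hypothetical records for illustration only.
participants = [
    {"id": 1, "last": "O'Neil", "first": "Ann", "dob": "1985-02-03", "address": "12 Elm St"},
    {"id": 2, "last": "Smith", "first": "Bea", "dob": "1990-07-21", "address": "9 Oak Ave"},
    {"id": 3, "last": "Jones", "first": "Cam", "dob": "1988-11-30", "address": "4 Pine Rd"},
]
birth_records = [
    {"last": "ONEIL", "first": "ann ", "dob": "1985-02-03", "address": "12 elm st"},
    {"last": "Smith", "first": "Bea", "dob": "1990-07-21", "address": "77 Main St"},
]
linked, unlinked = link_records(participants, birth_records)
```

Requiring exactly one match keeps false links out at the cost of leaving ambiguous cases for manual review.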

Roadways. In parallel to the Project R833293001 work with road proximity metrics, we geocoded Project R833293002 participants to the tax parcel level and then calculated the distance to the nearest roadways. In the coming year, we are planning to run analysis that extends the road proximity work in Project R833293001 by incorporating the rich set of variables available in Project R833293002, including analysis looking at how psychosocial health and gene-by-environment interactions may influence the impact of traffic-related air pollution on birth outcomes.
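
The distance-to-nearest-roadway calculation can be sketched with basic planar geometry. The example assumes parcel locations and road centerlines are already in a projected coordinate system with units such as meters (not latitude and longitude); in practice this step is typically done in GIS software, and the function names here are illustrative.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment from a to b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, then clamp to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def nearest_road_distance(point, roads):
    """Minimum distance from a point to any segment of any road polyline."""
    return min(
        point_segment_distance(point, a, b)
        for line in roads
        for a, b in zip(line, line[1:])
    )

# Hypothetical road polylines and parcel centroids in projected units.
roads = [[(-5.0, 0.0), (5.0, 0.0)], [(20.0, -10.0), (20.0, 10.0), (30.0, 10.0)]]
print(nearest_road_distance((0.0, 3.0), roads))
print(nearest_road_distance((8.0, 4.0), roads))
```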

Community Assessment Project/Built Environment. The Community Assessment Project (CAP) assessed built environment variables for over 17,000 tax parcels, including the home addresses of over 40% of the participants in the Healthy Pregnancy, Healthy Baby Study (SCEDDBO Project R833293002). Seven scales (housing damage, property disorder, security measures, tenure, vacancy, violent crime, and nuisances) have been constructed at five levels of geography (census block, primary adjacency neighborhood, census block group, census tract, and city-defined neighborhood). Analyses assessing the relationship between the built environment and maternal psychosocial status have begun. A paper assessing the built environment and maternal pregnancy intention among the clinical obstetrics cohort (Project R833293002) is well underway.

Collaborations with other SCEDDBO Components

The collaborative efforts this year have increased significantly. The entire SCEDDBO team has prioritized air pollution as one of the primary environmental contaminants to be examined across projects. This has involved significant discussions between members of Project R833293002 and members of Project R833293001 to construct viable markers of air pollution, including proximity to primary and secondary roadways and NATA data. Project R833293002 also prioritized the interleukin/inflammatory genes for analysis after consultation with Project R833293003 so that we could support more biological synergies across the two projects. Similarly, Project R833293003 introduced a nest-deprivation model into the ongoing animal experiments in an attempt to better replicate the more complex psychosocial stressors experienced by the mothers in Project R833293002. Linking the North Carolina Detailed Birth Record (DBR) in Project R833293001 with participants in Project R833293002 will enable us to pursue multiple collaborations between Projects R833293001 and R833293002, such as work on residential mobility and maternal medical complications. The Community Assessment Project/Built Environment data reach across all three projects: the built environment indices will be used across Projects R833293001 and R833293002 and will inform Project R833293003 regarding its deprivation model. Finally, the statistical team for the Geographic Information System and Statistical Analysis (GISSA) Core has worked hard to develop more innovative statistical approaches to disentangling the complex web of interactions that drive birth outcomes. These innovations have been motivated by specific questions across all three projects.

Future Activities:

In the upcoming year, we will continue to enroll study participants toward our new target sample size of 1,800 pregnant women. We will continue analyses on approximately 1,250 participants with complete pregnancy data, genetic results, and environmental results already in hand. Analyses will look at the joint impact of environmental, social, and host factors on birth outcomes, especially as they differ by and within race. Identification of such co-exposures could lead to development and implementation of strategies to prevent adverse birth outcomes, ultimately decreasing or eliminating the racial disparity.

Statistical Methods Development. In the upcoming year, we plan to refine the Bayesian quantile regression with latent factors model and extend it to include more data on host factors, in particular genetics data. We also will develop other flexible models for predicting birth outcomes besides quantile regression, including Bayesian density regressions. Finally, we will apply techniques that we have already developed for handling missing data and performing exploratory quantile regression to the augmented dataset (i.e., including the additional births that enter the study), and develop new techniques as necessitated by future data collection.

Psychosocial Indicators. In year 4, we plan to incorporate the clusters derived from the psychosocial health variables into genetic and environmental analyses. In addition, we plan to examine the relationship between psychosocial health and the built environment.

Community Assessment Project/Built Environment. Imminent analyses will consider the broader suite of maternal psychosocial indicators, including stress, depression, and anxiety. These analyses will be extended to include the psychosocial clusters previously developed and to examine how the intersection of built environment features and psychosocial status is associated with maternal health behaviors.

Environmental Sampling. In Year 4, we plan to pursue work similar to the lead analysis to characterize exposures to mercury and cotinine, including how maternal education and diet affect mercury levels and how cotinine levels are related to self-reported tobacco use and exposure to environmental tobacco smoke.

Roadways. As indicated above, we are planning to run analysis that extends the road proximity work in Project R833293001 by incorporating the rich set of variables available in Project R833293002, including analysis looking at how psychosocial health and gene-by-environment interactions may influence the impact of traffic-related air pollution on birth outcomes.

Journal Articles on this Report:

  • Burgette LF, Reiter JP. Multiple imputation for missing data via sequential regression trees. American Journal of Epidemiology 2010;172(9):1070-1076.
  • Miranda ML, Edwards SE, Swamy GK, Paul CJ, Neelon B. Blood lead levels among pregnant women: historical versus contemporaneous exposures. International Journal of Environmental Research and Public Health 2010;7(4):1508-1519.
  • Zhou X, Reiter JP. A note on Bayesian inference after multiple imputation. The American Statistician 2010;64(2):159-163.

Supplemental Keywords:

Pregnancy, preterm birth, low birth weight, racial disparity, African American, environmental stressors, gene-environment interactions, psychosocial stressors, genes, single nucleotide polymorphisms, genetic admixture

Main Center Abstract and Reports:

R833293 Southern Center on Environmentally Driven Disparities in Birth Outcomes

Subprojects under this Center:

  • R833293C001 Research Project A: Mapping Disparities in Birth Outcomes
  • R833293C002 Research Project B: Healthy Pregnancy, Healthy Baby: Studying Racial Disparities in Birth Outcomes
  • R833293C003 Research Project C: Perinatal Environmental Exposure Disparity and Neonatal Respiratory Health
  • R833293C004 Community Outreach and Translation Core
  • R833293C005 Geographic Information System and Statistical Analysis Core