Application and validation of a line-source dispersion model to estimate small scale traffic-related particulate matter concentrations across the conterminous US

Application and validation of a line-source dispersion model to estimate small scale... Numerous studies document increased health risks from exposure to traffic and traffic-related particulate matter (PM). However, many studies use simple exposure metrics to represent traffic-related PM, and/or are limited to small geographic areas over relatively short (e.g., 1 year) time periods. We developed a modeling approach for the conterminous US from 1999 to 2011 that applies a line-source Gaussian plume dispersion model using several spatially and/or temporally varying inputs (including daily meteorology) to produce high spatial resolution estimates of primary near-road traffic-related PM levels. We compared two methods of spatially averaging traffic counts: spatial smoothing generalized additive models and kernel density. Also, we evaluated and validated the output from the line-source dispersion modeling approach in a spatio-temporal model of 24-h average PM < 2.5 μm(PM ) elemental carbon (EC) levels. We found that spatial smoothing of traffic count point data performed better 2.5 than a kernel density approach. Predictive accuracy of the spatio-temporal model of PM EC levels was moderate for 24-h 2.5 2 2 averages (cross-validation (CV) R = 0.532) and higher for longer averaging times (CV R = 0.707 and 0.795 for monthly and annual averages, respectively). PM EC levels increased monotonically with line-source dispersion model output. Predictive 2.5 accuracy was higher when the spatio-temporal model of PM EC included line-source dispersion model output compared to 2.5 distance to road terms. Our approach provides estimates of primary traffic-related PM levels with high spatial resolution across the conterminous US from 1999 to 2011. Spatio-temporal model predictions describe 24-h average PM EC levels at unmea- 2.5 sured locations well, especially over longer averaging times. . . . . Keywords Air pollution Spatial smoothing Highway proximity Traffic counts Dispersion models Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11869-018-0580-6) contains supplementary material, which is available to authorized users. * Jeff D. Yanosky Robin C. Puett jyanosky@phs.psu.edu rpuett@umd.edu Jared Fisher Department of Public Health Sciences, The Pennsylvania State jafisher@umd.edu University College of Medicine, Hershey, PA, USA Department of Epidemiology and Biostatistics, University of Duanping Liao Maryland School of Public Health, College Park, MD, USA dliao@phs.psu.edu Department of Architectural Engineering, The Pennsylvania State Donghyun Rim University College of Engineering, State College, PA, USA drim@engr.psu.edu John and Willie Leone Family Department of Energy and Mineral Engineering & The EMS Energy Institute, The Pennsylvania State Randy Vander Wal University College of Earth and Mineral Sciences, State College, PA, ruv12@psu.edu USA William Groves Maryland Institute of Applied Environmental Health, University of wag10@psu.edu Maryland School of Public Health, College Park, MD, USA 742 Air Qual Atmos Health (2018) 11:741–754 Background Gaussian plume dispersion modeling approach that estimates daily 24-h average traffic-related PM concentrations near Recent epidemiologic studies have documented increased roadways which can be applied at any location in the conter- health risks from exposure to atmospheric particulate matter minous US from 1999 to 2011. Also, we developed and val- (PM) (Anderson et al. 2012; Beelen et al. 2014;Pelucchi etal. idated a spatio-temporal model of PM elemental carbon 2.5 2009;Shah etal. 2013;Hartet al. 2015; Heinrich et al. 2013; (EC) using 24-h average measurements of this component of Weuve et al. 2012; Stieb et al. 2012;Rich etal. 2018; Golan PM ,PM EC, together with several meteorological and 2.5 2.5 et al. 2018) for a variety of health outcomes, including non- GIS-based spatial covariates (including traffic-related PM accidental mortality, cardiovascular disease mortality, lung levels from our line-source dispersion modeling approach). cancer, neurocognitive functioning, and effects on reproduc- These models will be used to provide estimates of exposure tion. Studies using proxy measures for exposure to traffic- in epidemiologic analyses. We chose to present results for related air pollution have generally reported health effects PM EC because it has both traffic-related local sources 2.5 larger than those for the mass concentration of PM < 2.5 μm and larger-scale regional gradients (Brochu et al. 2011)and (PM )orPM <10 μm(PM ) in aerodynamic diameter also had high predictive accuracy (spatial cross-validation 2.5 10 (Brugge et al. 2013; Gehring et al. 2006; Hoffmann et al. (CV) R = 0.784 (described later)); PM components Zn and 2007; Puett et al. 2011; Puett et al. 2014;Schikowski et al. Cu were also evaluated but had lower predictive accuracy 2005). Several of these studies used simple exposure metrics (spatial CV R = 0.691 and 0.501, respectively). We chose for traffic-related air pollution based on distance to nearest not to use NO /NOx as our exposure metric due to our interest road, or summaries of road geography within buffers of in PM health effects and also to the complexity of NO-NO -O 2 3 fixed radii (Eckel et al. 2011;Medina-Ramonetal. 2008). chemistry in areas very near roadways, though we may address More complex approaches have been applied within small this in future work. Our spatio-temporal model predicts 24-h geographic areas, typically to one urban area or municipal- average PM EC levels across the conterminous US with high 2.5 ity (Gryparis et al. 2007; Maynard et al. 2007). One such spatial resolution from 1999 to 2011, and thus can provide approach accounted for wind direction using a geographic acute (24-h), sub-acute, or chronic exposure estimates that ac- information system (GIS) to calculate the traffic density in count for traffic intensity, prevailing wind direction, and dis- locations upwind of receptors (Arain et al. 2007). Traffic- persion characteristics based on local meteorology. related air pollutant levels can vary on small (i.e., micro (0– 100 m), middle (100–500 m), and neighborhood (500– 4 km)) spatial scales immediately near roadways (Jerrett Methods et al. 2005; Zhou and Levy 2007). Though the above com- plex approaches likely described local gradients in traffic- To approximate micro-, middle-, and neighborhood-scale pri- related air pollutants better than simpler methods such as mary traffic-related PM levels, we applied a line-source distance to road terms, few direct, robust, and long-term Gaussian plume dispersion model, Atmospheric Dispersion multi-city comparisons are available. Uncertainty remains Modeling System (ADMS)-Roads v2.3 software (CERC, about whether and to what extent noise may be a confound- Cambridge, UK), to a hypothetical grid (hereafter referred to er in analyses of health effects of traffic-related PM expo- as a kernel) with a 2-m cell size under varying meteorological sure, which underscores the need for accurate and highly regimes. We calculated traffic-related PM levels at each grid spatially resolved estimation of traffic-related PM levels so point in the kernel for each combination of ADMS-Roads these effects can be better disentangled in future epidemio- inputs, and then mapped receptor locations onto the appropri- logic analyses. ate kernel, with rotation for wind direction if appropriate, to Gaussian plume dispersion models have been widely used estimate traffic-related PM levels. We chose ADMS-Roads as to model point sources (Jerrett et al. 2005) and have been opposed to other line-source models such as CALRoads-View adapted to area and line sources (Benson 1992). Wilton et al. because of ADMS-Roads’ more advanced treatment of atmo- (2010) used such a model, CALRoads-View, to describe small spheric stability as characterized by continuous Monin- scale gradients in nitrogen oxide (NO ) and nitrogen dioxide Obhukov length (L ) and planetary boundary layer height x MO (NO )levelsfor a2-weekperiodin2006ingreater Los (inputs which are directly available from the Modern Era Angeles and for a 2-week period in 2005 in Seattle. They Retrospective-analysis for Research and Applications found that including the output from CALRoads-View im- (MERRA; http://disc.sci.gsfc.nasa.gov/daac-bin/FTPSubset. proved the prediction capability of city-specific spatio-tempo- pl; see below), resulting in a continuous and non- ral models of NOx and NO levels compared to traditional overlapping metric of atmospheric stability, as opposed to land-use regression models. conventional stability classes. Two hypothetical 5-m road seg- The objective of the present study was to describe the de- ments, facing each other to eliminate vehicle direction effects, velopment of and evaluate the performance of a line-source were placed at the origin of each kernel. The domain of each Air Qual Atmos Health (2018) 11:741–754 743 kernel was 1 km wide by 2.05 km long (2 km downwind of daily models were fit separately for each of seven geo- source and 0.05 km upwind). The ADMS-Roads modeling graphic regions (with an overlap of 400 km with adjacent approach requires several spatially and/or temporally varying regions) of the conterminous US (Supplemental Material inputs in order to produce estimates of near-road traffic-related Fig. S3), yielding seven spatial smoothing GAMs per day PM concentrations (referred to as ADMS-Roads output): (1) to be used for space-time prediction of each daily meteoro- traffic intensity for each of four road classes (A1, A2, A3, and logical parameter at any location in the domain of the con- A4/A6), (2) daily prevailing wind direction, and (3) several terminous US. The distribution of each of these meteoro- parameters which affect pollutant dispersion: wind speed, sur- logical parameters was divided into several categories, face roughness, sensible heat flux, planetary boundary layer listed in Table 1, with details of this classification provided height, road width, vehicle speed, and absolute value of min- in the Supplemental Material. imum L . An example kernel is shown in Supplemental Location-specific surface roughness and absolute value MO Material Fig. S1. The derivation and estimation of these spa- of minimum L values were derived, in accordance with MO tially varying or spatio-temporally varying input parameters is guidance in the ADMS-Roads software, based on categori- described in the following sections. Later we discuss the eval- zation of urban land use data from the 2001 US Geological uation and validation of ADMS-Roads output using spatio- Survey (USGS) National Land Cover Dataset (USGS 2012; temporal models of 24-h average PM elemental carbon Table 1).Theoriginal30-mcellsizedataweresummarized 2.5 (EC) levels. in ArcGIS v.10 (Environmental Systems Research Institute, Redlands, CA) using a moving window with a 1-km radius Availability of data and material Restrictions apply to the to calculate the percentage of low, medium, and high- availability of the data used and analyzed in this study, some intensity developed land use. Categories of these values of which were used under license for the current study and so are shown in Table 1. are not publicly available. Because we found it had no substantive effect on ADMS- Roads output in the range applicable over the conterminous US, air temperature was assumed to equal 15° C for all ADMS-Roads output. Meteorological inputs to ADMS-Roads Traffic intensity inputs to ADMS-Roads Meteorological data on wind speed and direction (as u- and v- vector wind components at ~10 m above ground), sensi- We obtained data on annual-average daily traffic counts (here- ble heat flux, and planetary boundary layer height were after referred to as traffic counts) from Geographic Data obtained from the MERRA project. These data are available Technology, Inc. (Lebanon, NH) Dynamap Traffic Counts hourly on a grid across the continental US with an approx- v4.2. We first spatially joined the traffic count points (those imately 55-km (depending on latitude) cell size; the native that reported in or after 1990) to the ESRI StreetMap Pro 2007 grid is 0.5° latitude/longitude. Hourly gridded data were network of road segments to obtain the US Census Feature assigned local time and date based on the time zone of each Class Code (US Census Bureau 2013) road class (hereafter grid point’s location and then averaged by day (for wind referred to as road class) of each road segment. Details regard- direction, we averaged u- and v- vector components sepa- ing the processing of these data can be found in the rately, then calculated the direction of the resulting vector) Supplemental Material. From the processed traffic intensity from 1999 to 2011. To allow spatial prediction between the data, we created four spatial smoothing GAMs: one for each 55-km grid points, the daily averages of each parameter road class (A1, A2, A3, and A4/A6; results shown in were used to create spatial smoothing generalized additive Supplemental Material Fig. S4 panels A1b, A2b, A3b, and models (GAMs) of the form: A4b, respectively). These spatial smoothing GAMs were of the same form as in Eq. 1 and were validated by calculating y ¼ α þ gsðÞþ ε ; ε ∼N 0; σ ð1Þ t i it it t it the squared Pearson correlation between predicted road-class- specific traffic intensities and those measured in an external where y is the value of the meteorological parameter at validation data set from the US Bureau of Transportation it location i =1…I on day t =1…T, α is an intercept Statistics (BTS) 2011 National Transportation Atlas representing the adjusted mean on a given day, g(s)isa Database Automated Traffic Recorders (ATR; US BTS two-dimensional penalized thin-plate smoothing spline, 2013) and Weigh in Motion (WIM) (US BTS 2013)data sets. with basis dimension k = I ∗ 0.9 chosen to allow much flex- Details regarding the development of the spatial smoothing ibility in the function. To reduce the potential for over- GAMs of traffic intensity and the processing of the external fitting, a multiplier of 1.4 (using the gamma argument to traffic count validation data can be found in the Supplemental gam()), as recommended by Wood (2006), was used. These Material. For comparison purposes, we also calculated the 744 Air Qual Atmos Health (2018) 11:741–754 Table 1 Categories of the meteorological, traffic intensity, and other input parameters to ADMS-Roads to produce the kernels Category (midpoint Group/parameter [minimum, maximum)) Meteorological inputs to ADMS-Roads Traffic intensity inputs to ADMS-Roads Wind speed Sensible heat Planetary boundary Urbanized land Surface Absolute value Average hourly Average hourly Average hourly Average hourly −1 −2 (m s ) flux (W m ) layer height (m) use within 1 km roughness of minimum traffic count for traffic count for traffic count for traffic count for a b b,c d d d d (%) (m) L (m) A1 roads A2 roads A3 roads A4 roads MO 1 0.75 [0, 1.1) − 35 (≤ 10) 70 (< 140) [0–5) 0.2 10 175 [0, 350) 75 [0, 150) 38 [0, 75) 15 [0, 30) 2 1.50 [1.1, 1.9) 15 [− 10, 40) 210 [140, 280) [5–15) 0.3 10 525 [350, 700) 225 [150, 300) 113 [75, 150) 45 [30, 60) 3 2.25 [1.9, 2.6) 65 [40, 90) 350 [280, 420) [15–30) 0.4 10 875 [700, 1050) 375 [300, 450) 188 [150, 225) 75 [60, 90) 4 3.00 [2.6, 3.4) 115 [90, 140) 490 [420, 560) [30–55] 0.5 30 1225 [1050,1400) 525 [450,600) 263 [225, 300) 105 [90, 120) 5 3.75 [3.4, 4.1) 165 [140, 190) 630 [560, 700) [55–90) 1 30 1575 [1400, 1750) 675 [600, 750) 338 [300, 375) 135 [120, 150) 6 4.50 [4.1, 4.9) 215 [190, 240) 770 [700, 840) [90–100] 1 100 1925 [1750, 2100) 825 [750, 900) 413 [375, 450) 165 [150, 180) 7 5.25 [4.9, 5.6) 265 [240, 290) 910 [840, 980) 2275 [2100, 2450) 975 [900, 1050) 488 [450, 525) 195 [180, 210) 8 6.00 [5.6, 6.4) 315 (≥ 290) 1050 [980, 1120) 2625 [2450, 2800) 1125 [1050, 1200) 563 [525, 600) 225 [210, 240) 9 6.75 [6.4, 7.1) 1190 [1120, 1260) 2975 [2800, 3150) 1275 [1200, 1350) 638 [600, 675) 255 [240, 270) 10 7.50 [7.1, 7.9) 1330 [1260, 1400) 3325 [3150, 3500) 1425 [1350, 1500) 713 [675, 750) 285 [270, 300) 11 8.25 [7.9, 8.6) 1470 [1400, 1540) 3675 [3500, 3850) 1575 [1500, 1650) 788 [750, 825) 315 (≥ 300) 12 9.00 [8.6, 9.4) 1610 [1540, 1680) 4025 [3850, 4200) 1725 [1650, 1800) 863 (≥ 825) 13 9.75 [9.4, 10.1) 1750 [1680, 1820) 4375 [4200, 4550) 1875 [1800, 1950) 14 10.50 [10.1, 10.9) 1890 (≥ 1820) 4725 [4550, 4900) 2025 [1950, 2100) 15 11.25 [10.9, 11.6) 5075 [4900, 5250) 2175 (≥ 2100) 16 12.00 [11.6, 12.4) 5425 [5250, 5600) 17 12.75 [12.4, 13.1) 5775 (≥ 5600) 18 13.50 [13.1, 13.9) 19 14.25 (≥ 13.9) From the 2001 US Geological Survey National Land Cover Dataset; defined as the percentage of developed low, medium, and high intensity land use within 1km Values assigned based on categories in the ADMS-Roads software and its documentation using urbanized land use category L refers to the Monin-Obhukov length MO d −1 From spatial smoothing GAMs, transformed into units of vehicles hour . Road widths were assumed to vary based on road class, with values of 22, 14.7, 11, and 7.3 m for road classes A1, A2, A3, and −1 both A4 and A6, respectively. Also, vehicle speeds of 90, 70, 55, and 40 km h , respectively, were assumed Air Qual Atmos Health (2018) 11:741–754 745 kernel density of the traffic count data using a neighborhood GAMs, the ADMS-Roads output concentrations were inter- of 100 km and the ArcGIS BKernel Density^ tool (Silverman polated from the appropriate kernel, for both Bbest-estimate^ 1986). We compared these values to the combined ATR and levels which depended on prevailing wind direction and for WIM data by road class using linear regression models. Bworst-case downwind^ levels ignoring wind direction. A de- Because detailed geographic data on road widths and vehi- tailed description of the steps is provided in the Supplemental cle speeds across the conterminous US were not available, Material. representative road width and vehicle speed values were as- sumed to be constant based on road class (Table 1). Other meteorological and GIS-based spatial To provide road source location data, StreetMap Pro 2007 covariates street line segments were converted to a series of points spaced 10 m apart using the Xtools Pro tool BConvert Data on meteorological parameters air temperature, total pre- Features to Points^ (Data East, Russia; examples in Fig. 1 cipitation, and total snowfall were obtained from MERRA and and Supplemental Material Fig. S2). The emissions factors processed as described above. Data on daily total rainfall were we used in ADMS-Roads can be found in the Supplemental derived by subtraction of total snowfall flux from total precip- Material. We note that our use of a smooth regression spline in itation flux for each day. Though these parameters were not the spatio-temporal model of 24-h average PM EC (hereaf- used as inputs to ADMS-Roads, daily spatially smoothed me- 2.5 ter referred to as Bspatio-temporal PM EC model^)de- teorological values were evaluated as spatio-temporal covari- 2.5 scribed below provides an effective scaling of this emission ates in spatio-temporal models of 24-h average PM EC. 2.5 factor to measured PM EC levels. Thus, our modeling pro- Details on other spatial covariates including county-level pop- 2.5 cess eliminates discrepancies between European and US ve- ulation density, elevation (USGS 2013), and distance to road hicle fleets and between assumed and actual emission rates. covariates can be found in the Supplemental Material. Obtaining ADMS-Roads output Validation of ADMS-Roads output using spatio-temporal models of 24-h average PM EC 2.5 A separate kernel containing ADMS-Roads output was gen- erated for each combination of meteorological and traffic in- Statistical models tensity categories (Table 1; details in Supporting Material). Briefly, to obtain ADMS-Roads output, we first selected the Spatio-temporal generalized additive mixed models kernel that best matched the dispersion and traffic intensity (GAMMs) of 24-h average PM EC levels were evaluated 2.5 conditions, then identified road source points within 2 km of and a final prediction model developed with the joint aims of each prediction location. Together with meteorological and (1) validating and calibrating the ADMS-Roads output to a traffic intensity inputs predicted from spatial smoothing traffic-related PM component, and (2) predicting 24-h 2.5 Fig. 1 Roadways displayed by US Census feature class code and 10-m street points nearby to a PM speciation monitor near 2.5 Linden, New Jersey 746 Air Qual Atmos Health (2018) 11:741–754 average PM EC levels at unmeasured locations anywhere in spatial terms assumed residual spatial variation was station- 2.5 the conterminous US. There were 219,300 24-h site-average arity and isotropic across the domain of the conterminous US. PM EC measurements from 1999 to 2011 at 361 unique The first-stage model equation was: 2.5 monitoring locations. These measurements were available y ¼ u þ α þ ∑ f Z þ gðÞ s þ e ; i t i;t;p i i;t i;t p p t from the Interagency Monitoring of Protected Visual ð4Þ e ∼N 0; σ i;t et Environments (IMPROVE) network (parameter code 88321; geographic coordinates for two sites were modified, see and was fit iteratively in a back-fitting arrangement with u + Supplemental Material for details) and the US Environmental α + ∑ f (Z ) estimated jointly and g (s ) estimated sepa- t p p i, t, p t i Protection Agency’s Chemical Speciation Network (parameter −1 rately by day (on days where sufficient (> 12 values day ) code 88380) on every-third day after August 2001, and data were available). Variability in the measured concentra- alternating every-third and fourth day from January 1, 1999, tions is parsed between the covariates and the residual spatial to that date. 89.7% of available measurements were from the terms. For the spatial models in the first stage, a multiplier of IMPROVE network. PM EC measurements were approxi- 2.5 1.4 (using the gamma argument to gam()) was used to avoid mately log-normally distributed, and had geometric mean and over-fitting (Wood 2006). All individual fits within the back- −3 geometric standard deviation of 154 ng m and 10, respec- fitting were preformed using the gam()function in the mgcv tively (PM EC measurements ranged from 9.4 to 2.5 package (Wood 2006)ofR(R DevelopmentCoreTeam −3 10,398.8 ng m ). A map showing the locations of the 2009). monitors in the conterminous US is shown in Supplemental In the second stage, we fit a spatial model to the ^ u terms Fig. S3. obtained from the first-stage model. This model included The spatio-temporal GAMM of 24-h average PM EC 2.5 terms for GIS-based time-invariant spatial covariates and re- levels was similar to that presented in Yanosky et al. (2014), sidual time-invariant spatial variability. The second stage except there monthly averages were used; the model had the (Eq. 5) was: following generic form: ^ ^ u ¼ α þ ∑ d X þ gsðÞþ b ; i q i;q i i ð5Þ y ¼ α þ α þ ∑ d X þ ∑ f Z þ gðÞ s þgsðÞþ b þ e ; t q i;q i;t;p i i i i;t i;t q p p t b ∼N 0; σ i b 2 2 b ∼N 0; σ ; e ∼N 0; σ i b i;t et where ^ u is an estimated site-specific intercept that represents ð3Þ the adjusted long-term mean at each location (n = 361) and the where y is the natural-log transformed 24-h average PM i, t 2.5 other terms are as above. The second stage was also fit using −3 EC (in ln(ng m )) for i =1…I sites (I =361) and t =1…T 24- the gam()function in the mgcv package of R. htimeperiods (T =1555), and s is the projected spatial coor- dinate pair for the ith location. X are GIS-based time-invari- i, q Model predictions ant spatial covariates for q =1…Q, Z are spatio-temporal i, t, p covariates for p =1…P,and α is a day-specific intercept that We obtained model predictions by generating the covariates represents the adjusted mean across all sites on a given day. at locations of interest for each, and then transforming to the d are one-dimensional penalized spline smooth functions of Q native scale by exponentiation. To avoid extrapolation, co- GIS-based time-invariant spatial covariates (basis dimension variates at prediction locations beyond their range among of seven for each); f are one-dimensional penalized spline the monitoring locations were set to the appropriate mini- smooth functions of P spatio-temporal meteorological or other mum or maximum among the monitoring locations across covariates (basis dimension of five for each). g (s)accounts t i 1999–2011. for residual spatial variability in the 24-h average values, and We generated estimates of uncertainty in model predictions g(s ) accounts for time-invariant spatial variability across the (i.e., standard errors) on the natural-log scale using methods conterminous US, with both terms specified as spatial bivari- described previously (Paciorek et al. 2009; Yanosky et al. ate thin-plate penalized splines with basis dimension values: 2008). 95% prediction intervals on the natural-log scale were k = I ∗ 0.9 and k =(I − Q) ∗ 0.9, respectively. The site- t t calculated and exponentiated to assess prediction interval specific random effect b represents unexplained site-specific coverage. variability, thus our characterization of the model as a GAMM. As in previous work, we used a two-stage modeling ap- Model validation proach to fit the above model (Eq. 3)(Yanosky et al. 2014). In the first stage (Eq. 4), we estimated site-specific intercepts (u ) 10-fold out-of-sample CV was used to evaluate model pre- adjusting for spatio-temporal covariates and residual spatial dictive accuracy and thereby inform covariate selection. variability in the 24-h average values (N = 219,300). The Monitoring sites were selected at random and assigned Air Qual Atmos Health (2018) 11:741–754 747 exclusively to one of 10 sets. Since the covariate selection removed. We also evaluated whether the direction of the process involved fitting multiple candidate models to the smooth functions met a priori expectations; however, no co- same data, set 10 was reserved (i.e., not used for model variates were removed for this reason. fitting) to assess whether the covariate selection process contributed to over-fitting (Draper and Krnjacic 2006). Data from sets one to nine were removed from the data set Results sequentially, with the model fit to the remaining data. Model predictions were then generated and compared to Sensitivity of ADMS-roads output to changes in input the left-out observations. parameters The predictive accuracy of the spatio-temporal PM EC 2.5 model levels was determined from the squared Pearson Wind direction, which determines whether the receptor point correlation between the left-out observations and model is downwind of the source, was found to be the most influen- predictions (CV R ), with both back-transformed to the native tial ADMS-Roads input parameter. The next most influential rather than the natural-log scale. Weekly, monthly, and annual input parameters, evaluated at an arbitrary point 100-m down- averages were calculated after averaging left-out observations wind of the source, were, in order of decreasing influence: and model predictions to the corresponding averaging time. wind speed, surface roughness, sensible heat flux, planetary Spatial CV R values were calculated similarly using the boundary layer height, absolute value of minimum L ,and MO means of the 24-h average values across the entire time period air temperature. These findings were consistent with guidance from 1999 to 2011. Prediction errors were calculated by in the ADMS-Roads User Guide. subtracting left-out observations from the model predictions. Bias in model predictions was determined using the normal- Results of smoothing traffic intensity ized mean bias factor (NMBF) (Shaocai Yu, personal communication) and the slope from major-axis linear The spatial smoothing GAMs of traffic intensity fit the regression (Legendre 2011) of the left-out observations Dynamap traffic count data reasonably well, as evidenced by against the model predictions (both on the natural-log scale). model R values of 0.834, 0.638, 0.533, and 0.590 for A1–A4 The precision of model predictions was obtained by taking the roads, respectively (Table 2). mean of the absolute value of the prediction errors (CVMAE) Spatial smoothing GAMs showed generally similar spatial and using the normalized mean error factor (NMEF) (Shaocai trends across road classes, each showing increased traffic in- Yu, personal communication). Formulas for the NMBF and tensity in major urban areas. However, slightly different spa- NMEF are provided in the Supplemental Material. Bias and tial patterns were evident (Supplemental Material Fig. S4), precision values from CV were evaluated overall, and by US Census Division, tertiles of urban land use, season, monitoring Table 2 Results of the external validation of the spatial prediction network, and monitoring objective. Two thousand seventy- models of Dynamap traffic counts using ATR and WIM station traffic three extreme values of PM EC above the 99th percentile 2.5 count data from the US DOT Bureau of Transportation Statistics’ −3 of 1850 ng m were excluded as outliers prior to calculating A National Transportation Atlas Database 2011 CV statistics. b 2c Road class n Validation R Model development and covariate selection Spatial smoothing 100-km kernel GAM density To determine the best parsimonious prediction model, we fit A1 1712 0.524 0.316 spatio-temporal models of 24-h average PM EC levels with 2.5 A2 1694 0.492 0.313 alternate versions of ADMS-Roads output (A1, A2, A3 and A3 1368 0.305 0.189 A4/A6 vs. A1–A3 only; Bbest-estimate^ vs. Bworst-case downwind^), with and without various meteorological param- ATR refers to the BAutomatic Traffic Recorders^ data; WIM refers to the eters, and without ADMS-Roads output but instead with GIS- BWeigh-in-Motion^ data; DOT is Department of Transportation based spatial covariates for distance to nearest road by road Data for A4 class roads was not available due to the limited number (9) class. The initial set of meteorological covariates was based on and spatial coverage of stations on these roads our earlier work (Yanosky et al. 2014). We compared the Obtained from linear regression of predicted traffic counts using spatial CV R among these models, and also among those with Dynamap data and either the spatial smoothing GAM approach or the 100-km kernel density approach versus combined ATR and WIM traffic and without GIS-based spatial covariates for county-level counts in the external validation data set. Data from ATR or WIM stations population density, urban land use, and elevation in the with traffic count values beyond the range of the Dynamap traffic count second-stage model. Covariates which were not statistically data for each road class were removed because they were likely not significant (p > 0.05) using the Wald tests (Wood 2006)were spatially matched to the correct road class 748 Air Qual Atmos Health (2018) 11:741–754 particularly with regard to the extent of influence of major road^ monitors (those with an A1, A2, or A3 road within cities in surrounding suburban areas. Additional details re- 300 m, which represented about 47% of all monitors), these garding spatial trends in traffic intensity and performance of differences were more evident (CV R of annual averages of the two spatial averaging methods can be found in the 0.762 for ADMS-Roads output summed across A1–A3 roads Supplemental Material. vs. 0.689 for ADMS-Roads output summed across A1, A2, A3, and A4/A6 vs. 0.663 for distance to road covariates). Validation of ADMS-Roads traffic intensity input data Results also showed that the spatio-temporal model that included Bbest-estimate^ ADMS-Roads output performed on- Validation R values from linear regression of the combined ly slightly better than one with Bworst-case downwind^ ATR and WIM traffic counts using the spatial smoothing ADMS-Roads output when comparing long-term averages (spatial CV R = 0. 784 vs. 0.775, respectively). In contrast, GAM predictions of the Dynamap traffic count data were 0.524, 0.492, and 0.305 for road classes A1–A3, respectively on the 24-h average level, the model with Bworst-case downwind^ ADMS-Roads output performed well and even (Table 2). We also regressed results from the kernel density within 100-km (displayed in Supplemental Material Fig. S4 slightly better than the Bbest-estimate^ ADMS-Roads output panels A1c, A2c, A3c, and A4c) approach against the com- (24-h average CV R = 0.539 for Bworst-case downwind^ vs. bined ATR and WIM traffic count data; validation R values 0. 532 for Bbest-estimate^). Despite the influence of daily were lower as compared to the spatial smoothing GAM ap- prevailing wind direction on ADMS-Roads output, these proach at 0.316, 0.313, and 0.189, for road classes A1–A3, models were quite comparable with respect to predictive ac- respectively (Table 2). In all six regression models discussed curacy of PM EC levels, suggesting that our 24-h prevailing 2.5 above, each of the predicted traffic intensity terms was highly wind direction data from MERRA is only crudely capturing statistically significant (p < 0.001) when regressed against the the influence of wind direction; modeling hourly wind direc- tion may provide improvement but is currently impractical at combined ATR and WIM traffic count data for the corre- sponding road class. the national scale. Based on the results above, the final prediction model used Validation of ADMS-Roads output using Bbest-estimate^ ADMS-Roads output from only road classes A1–A3. However, the model with Bworst-case downwind^ spatio-temporal models of 24-h average PM EC 2.5 ADMS-Roads output would provide an acceptable alternative to the Bbest-estimate^ model, especially for analyses focused Predictive performance of spatio-temporal models of 24-h average PM EC on shorter averaging time periods. Also, we note that excluding 2.5 A4 roads results in a large gain in computational efficiency. Results from our model comparisons showed that the spatio- Though predictive accuracy was moderate for PM EC at 2.5 temporal model that included ADMS-Roads output summed the 24-h average level (Table 3; 24-h average CV R =0.532), it improved substantially for longer averaging times (CV R = across only A1–A3 roads exhibited slightly higher predictive accuracy than one that included ADMS-Roads output 0.613, 0.679, 0.707, 0.795, and 0.784 for weekly, two-week, and monthly, annual-average, and spatial averages, respective- summed across A1, A2, A3, and A4/A6 roads (Table 3; spatial CV R = 0.784 vs. 0.753, respectively). This model ly. Predictive accuracy was generally consistent across US Census Divisions (Supplemental Material Table S1). It was also performed better than the simpler model that used distance to nearest road covariates for A1–A3 roads also reasonably consistent across tertiles of urban land use, across the four seasons, and across monitoring networks. (Table 3;spatial CV R = 0.753). Among only Bnear-major- Table 3 Cross-validation statistics for the spatio-temporal prediction model (top row) and two alternate models (lower two rows) of 24-h average PM EC levels in the conterminous US from 1999 to 2011 2.5 Spatio-temporal model 24-h average values CV R of Spatial a 2f covariates annual CV R 2 d d e f n Model CV R Intercept Slope NMBF CVMAE NMEF averages 2B C E E R (%) (%) ADMS-Roads output (A1–A3) 201,425 0.679 0.532 1.4 0.74 − 4.8 112 47.1 0.795 0.784 ADMS-Roads output 201,425 0.682 0.495 1.3 0.75 − 3.7 115 47.9 0.747 0.753 (A1–A4 and A6) No ADMS-Roads output 201,425 0.683 0.472 1.3 0.76 − 4.0 117 48.7 0.723 0.753 but withdistancetoroad (A1–A3) terms Air Qual Atmos Health (2018) 11:741–754 749 Fig. 2 Plot of the smooth function of ADMS-Roads output (summed across road classes A1– A3) from the spatio-temporal PM2.5 EC model levels. Data on both axes have been transformed from the natural log to the native scale. The x-axis shows the distribution of ADMS-Roads output (summed across road classes A1–A3), at the set of monitors used for model fitting, in red Across monitoring objectives, predictive accuracy was slight- total rainfall. Plots of each of these smooth functions are ly lower at Bsource-related^ sites (categories defined by the shown in the Supplemental Material Fig. S5. Also, details US EPA; Bsource-related^ refers to a monitor located nearby regarding the df used by the spatial terms in the spatio- to a large point source; CV R of annual averages = 0.398). temporal PM EC model can be found in the Supplemental 2.5 95% prediction interval coverage was 0.969, as compared to a Material. nominal value of 0.95, indicating that model standard errors though reasonably well scaled, may have been slightly Spatial patterns in spatio-temporal PM EC 2.5 underestimated. model-predicted levels Smooth function of ADMS-Roads output Figure 3 shows predicted 24-h average PM EC levels for 2.5 in the spatio-temporal PM EC model 2 days in 2008 and averaged across that year. Predicted PM 2.5 2.5 EC levels are clearly increased near roadways in each plot. In Fig. 2, the smooth function for ADMS-Roads output Figure 4 shows predicted 24-h average PM EC levels on 2.5 summed across A1–A3 roads (shown on the x-axis) increases a 6-km grid across the conterminous US for August 4 and 10, monotonically and nearly linearly from zero to moderate 2008, and averaged across 2008. Additional details regarding −3 levels around 500 ng m , and then begins to flatten. Once these figures and the statistical assumptions of our spatio- −3 the ADMS-Roads output exceeds 1000 ng m ,PM EC temporal PM EC model can be found in the Supplemental 2.5 2.5 levels again rise, but with a shallower slope. Material. Smooth functions of other covariates in the spatio-temporal PM EC model Discussion 2.5 PM EC levels increased linearly with county-level popula- Comparing the two methods of spatial averaging traffic 2.5 tion density (1.0° of freedom (df)). For urban land use, PM counts, we found that spatial smoothing performed better than 2.5 EC levels increased monotonically but not linearly (4.0 df), the kernel density approach in an external validation using increasing steadily from 0 to about 40% urban land use, less automated traffic recorder data. This is likely due to the dif- steeply from 40 to about 80%, and more steeply from about 80 ferent assumptions regarding sparsity of data in space and the to 100%. For elevation, PM EC levels decreased nearly manner in which each are affected by extreme values. 2.5 linearly from 0 to about 2000 m, and less steeply beyond. For the spatio-temporal PM EC model, the model R of 2.5 PM EC levels decreased nearly linearly with increasing 0.679 indicates the model was able to explain much of the 2.5 wind speed, increased in an S-shaped pattern with increasing variability in measured levels. Local influences that may not air temperature, and decreased non-linearly with increasing be fully described by the spatio-temporal model (i.e., those not 750 Air Qual Atmos Health (2018) 11:741–754 −3 Fig. 3 a Predicted 24-h average PM EC levels (ng m ), in the same average levels for 2008. 5th to 95th percentiles are shown for each. The 2.5 domain as Fig. 1, for August 4, 2008. b Same as A but showing 24-h prevailing wind direction from smoothed MERRA data is shown with average levels for August 10, 2008. c Same as A but showing annual- black arrows in a and b described by model covariates for traffic-related PM levels, those with distance to road terms instead of ADMS-Roads urbanization, elevation, and local meteorology) include local model output and even among only Bnear-major-road^ moni- wildland fires, residential wood burning, and industrial tors, suggests that micro-scale spatial gradients are a relatively sources, all sources of EC. The spatio-temporal PM EC small proportion of PM EC mass (albeit an arguably impor- 2.5 2.5 model does well in representing spatial trends, and especially tant one for health effects analyses). Also, spatial error induced so for averaging times longer than 24-h (CV R for monthly by smoothing sparse traffic count data or imprecision in other and annual averages = 0.707 and 0.795, respectively; spatial ADMS-Roads model inputs may contribute to similarity 2 2 CV R =0.784). among CV R values. Specifically, including the effect of wind The smooth function for ADMS-Roads output (summed direction from the 10-m road source points to the receptor across road classes A1–A3) showed a non-linear but mono- locations improved prediction accuracy as assessed by the tonically increasing relationship between that covariate and spatial CV R only slightly as compared to the Bworst-case 24-h average PM EC levels, highlighting the need to use downwind^ ADMS-Roads output, suggesting that MERRA 2.5 penalized spline smoothing. By including PM EC monitor- 24-h prevailing wind direction data (one value per day) are 2.5 ing data from all sites in the conterminous US in the same of only limited use in describing primary traffic-related PM spatio-temporal model, we capture the largest possible range levels. Though it would represent a substantial increase in in the ADMS-Roads output covariate. computational demand, hourly wind data, which would reflect The similarity in CV results from the spatio-temporal changes in wind direction throughout the day, would likely be models with alternative representations of traffic, including more realistic and may further increase predictive accuracy. Air Qual Atmos Health (2018) 11:741–754 751 −3 Fig. 4 a Predicted 24-h average PM EC levels (in ng m ) on a 6-km c Same as a but showing annual-average levels for 2008. White points are 2.5 grid in the conterminous US (219,948 cell values each day) for August 4, PM EC monitors. 5th to 95th percentiles are shown for each 2.5 2008. b Same as a but showing 24-h average levels for August 10, 2008. In comparison to previous studies, our methodology is sim- from the conterminous US in one spatio-temporal model and ilar to that used in the London Air Pollution Toolkit (Kelly by using spatially smooth covariates, with the exception of et al. 2011) and in Wilton et al. (2010). However, our meth- ADMS-Roads output. Though the ADMS-Roads output does odology is novel in (1) its scope of application, in that it is the contain spatial discontinuities where the traffic intensity first, to our knowledge, to apply a line-source Gaussian plume changes categories, these were quite small (one is visible in model to estimate traffic-related PM levels in a consistent the bottom right of each panel of Fig. 3;thisisanextreme manner across the conterminous US, and (2) its use of spatial- example due to the very high traffic intensity in the area ly smoothed road-class-specific traffic counts. In comparison shown). However, we do assume stationarity across the do- to previous studies, the spatial distribution of long-term mean main. In contrast, the moving-window approach in Akita et al. predicted PM EC in Fig. 4 is broadly similar to that present- (2012) does not require this assumption. 2.5 ed in Bergen et al. (2013). Also, consistent with results in Maintaining computational feasibility was a key challenge Wilton et al. (2010), we found that including ADMS-Roads in the development of our traffic-related PM modeling ap- output on smaller, lesser trafficked A4/A6 roads did not im- proach. Even the spatial join to identify the 10-m road source prove predictive accuracy of the spatio-temporal PM EC points within 2 km of each PM monitoring location was 2.5 2.5 model. computationally demanding, requiring specialized multipro- One advantage of our spatio-temporal model is it avoids cessing software code to be implemented. Categorization of spatial discontinuities in predicted levels, such as those caused the several ADMS-Roads inputs was necessary to make com- by moving-windows in Akita et al. (2012) and the use of putation possible, and to provide a balance of complexity and regional terms in Yanosky et al. (2014), by including data computational feasibility. 752 Air Qual Atmos Health (2018) 11:741–754 Our models will be used to provide estimates of exposure adoption of diesel particulate filters for post-2007 heavy-duty to traffic-related PM and to PM EC in epidemiologic anal- compression ignition engines used in on-road (highway) 2.5 yses. Despite advantages over simpler exposure metrics, our trucks and busses and reduced sulfur content in diesel fuel traffic-related PM modeling approach has several limitations. have and will continue to reduce particulate emissions from One is that annual-average traffic counts do not explain all of new vehicles. the daily variability in traffic intensity; another is that it relies This work emphasizes the need for spatially accurate, con- on spatially averaged traffic counts rather than data on each sistently measured, nationwide, and time-resolved informa- specific road segment. Finely time-resolved (e.g., daily) traffic tion on traffic intensity for each road segment (preferably counts for each road segment in the road network are not distinguishing between light-and heavy-duty vehicles). Also, currently available. To some extent, this is addressed by the siting of additional monitoring sites near existing PM 2.5 performing spatial smoothing separately for each road class. speciation monitors but in locations nearer to large roadways, However, traffic count data are spatially incomplete in that not including those with Bpopulation exposure^ and every segment has a traffic count, so error from spatial inter- Bbackground^ monitoring objectives, would be valuable for polation is unavoidable. Other spatial averaging methods are future development of predictive models of traffic-related air also vulnerable to this problem, such as when imputing the pollutant levels. mean traffic count based on county boundaries. We note that though our traffic intensity inputs vary smoothly in space, the resulting ADMS-Roads output is not smooth in space due to Conclusions our application of kernels to the 10-m road source points; therefore, micro-scale detail, down to the resolution of a few Our approach provides estimates of primary traffic-related PM meters, is retained. Another limitation is that the approach levels based on line-source Gaussian plume dispersion model does not account for bridges, street canyons, tunnels, or other output. The methodology can provide high spatial resolution elevation-related effects or for differences in emission factors (interpolated from a 2-m grid) estimates of traffic-related PM between light-duty and heavy-duty vehicles, the latter because (both Bbest-estimate^ values that incorporate wind direction traffic counts segregated by light- and heavy-duty vehicles and Bworst-case downwind^ values) across the conterminous were unavailable. Finally, meteorological data were averaged US from 1999 to 2011 which account for (1) spatially varying to the daily level and, as discussed above for wind direction, traffic intensity (specific to each road class) and (2) spatio- do not reflect hourly variation in meteorological parameters. temporal covariates describing daily local meteorology. In Other minor limitations include small discontinuities induced contrast, distance to road terms, often used to represent near- by categorization of the input parameters and use of assumed road traffic-related air pollution levels, do not account for road widths and vehicle speeds. However, the inputs were differences in traffic intensity (or at least do not do so within finely divided into a large number of categories to minimize a given road class) or daily local meteorology. Our results spatial discontinuities, and assumed road widths and vehicle demonstrate that ADMS-Roads output describes near-road speeds are likely most influential within 10–20 m of the road primary traffic-related PM levels with greater detail and pre- centerline, an area where relatively few residences exist. dictive accuracy as compared to distance to road terms in a Finally, despite the simplifying assumptions discussed above, spatio-temporal PM EC model; this model had moderate 2.5 the approach presents significant computational challenges prediction accuracy at the 24-h average level (CV R = with respect to processing time, memory, and disk space and 0.532), but better performance when longer-term averages requires parallel computing in order to be implemented on were evaluated (CV R of annual averages = 0.795 and spatial large numbers of locations. CV R = 0.784). Caution is warranted when using ADMS- Our spatio-temporal PM EC model also has several lim- 2.5 Roads output as a separate exposure metric because of the itations. One is that the data set does not contain multiple non-linear shape of the smooth function in the spatio- monitors within small areas with monitors near and far from temporal PM EC model levels (Fig. 2). The spatio- 2.5 major road, data that could help distinguish micro-scale ef- temporal model effectively calibrates the ADMS-Roads out- fects, including that of wind direction, from larger spatial scale put to measured PM EC levels; predictions from it represent 2.5 regional effects. Also, the model makes several statistical as- an improved exposure metric for epidemiologic analyses of sumptions (discussed in the Supplemental Material), such as PM as compared to distance to road terms. stationary and isotropic spatial variation, additive covariate effects, and independent and normally distributed model re- Acknowledgements We thank Bin Wang for her contribution to the siduals with mean zero and constant variance. Finally, a lim- Python programming of several custom Python scripts. itation of all models of exposure to combustion-produced PM is the combination of continually evolving engine mod- 2.5 Authors’ contributions JDY led and directed the study, performed the ifications, after-treatment technologies, and fuel blends. The analysis, and was the lead writer. JF contributed to the analysis, reviewed Air Qual Atmos Health (2018) 11:741–754 753 the manuscript, and participated in revisions. DL reviewed the manu- Grioni S, Krogh V, Tsai MY, Ricceri F, Sacerdote C, Galassi C, script. DR reviewed the manuscript. RVW reviewed the manuscript. Migliore E, Ranzi A, Cesaroni G, Badaloni C, Forastiere F, WG reviewed the manuscript. RCP led the parent study, contributed Tamayo I, Amiano P, Dorronsoro M, Katsoulis M, Trichopoulou meaningful intellectual ideas, supervised JF, and reviewed the manu- A, Brunekreef B, Hoek G (2014) Effects of long-term exposure to script. All authors read the manuscript, provided input, and approved air pollution on natural-cause mortality: an analysis of 22 European the version as submitted. cohorts within the multicentre ESCAPE project. Lancet 383(9919): 785–795 Benson PE (1992) A review of the development and application of the Funding This research was supported by National Institutes of Health CALINE3 and CALINE4 models. Atmos Environ B Urban Atmos grant #1 R01 ES019168. 26(3):379–390 Bergen S, Sheppard L, Sampson PD, Kim S, Richards M, Vedal S et al Compliance with ethical standards (2013) A national prediction model for PM component exposures 2.5 and measurement error–corrected health effect inference. Environ Ethics approval and consent to participate This study was part of the Health Persp 121(9):1017–1025 SEARCH For Air project. Informed consent was obtained from study Brochu P, Kioumourtzoglou MA, Coull BA, Hopke PK, Suh HH (2011) subjects and approval was obtained from the IRB’s of both the Development of a new method to estimate the regional and local University of Maryland and the Penn State College of Medicine. contributions to black carbon. Atmos Env 45(40):7681–7687 Brugge D, Lane K, Padró-Martínez LT, Stewart A, Hoesterey K, Weiss D, Consent for publication Not Applicable. Wang DD, Levy JI, Patton AP, Zamore W, Mwamburi M (2013) Highway proximity associated with cardiovascular disease risk: the influence of individual-level confounders and exposure misclassifi- Conflict of interest The authors declare that they have no competing cation. Environ Health 12(1):84 interests. R Development Core Team (2009) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Abbreviations ADMS, Atmospheric Dispersion Modeling System; Vienna, Austria. ISBN 3-900051-07-0 Available at: http://www.R- ATR, automated traffic recorders; CV, cross-validation; EC, elemental project.org carbon; IMPROVE, Interagency Monitoring of Protected Visual Draper D, Krnjacic M (2006) Bayesian model specification. In: Technical Environments; GAM, generalized additive model; GAMM, generalized report, Dept. Univ. California Santa Cruz, Applied Mathematics and additive mixed model; GIS, geographic information system; L , MO Statistics Monin-Obhukov length; MERRA, Modern Era Reanalysis for Research Eckel SP, Berhane K, Salam MT, Rappaport EB, Linn WS, Bastain TM, and Applications; NO , nitrogen oxide; NO , nitrogen dioxide; PM, par- x 2 Zhang Y, Lurmann F, Avol EL, Gilliland FD (2011) Residential ticulate matter; PM , PM < 2.5 μm; PM ,PM <10 μm; USGS, US 2.5 10 traffic-related pollution exposures and exhaled nitric oxide in the Geological Survey; WIM, Weigh In Motion Children’s Health Study. Environ Health Persp 119(10):1472–1477 Gehring U, Heinrich J, Kramer U, Grote V, Hochadel M, Sugiri D et al Open Access This article is distributed under the terms of the Creative (2006) Long-term exposure to ambient air pollution and cardiopul- Commons Attribution 4.0 International License (http:// monary mortality in women. Epidemiology 17(5):545–551 creativecommons.org/licenses/by/4.0/), which permits unrestricted use, Golan R, Ladva C, Greenwald R, Krall JR, Raysoni AU, Kewada P, distribution, and reproduction in any medium, provided you give Winquist A, Flanders WD, Liang D, Sarnat JA (2018) Acute pul- appropriate credit to the original author(s) and the source, provide a link monary and inflammatory response in young adults following a to the Creative Commons license, and indicate if changes were made. scripted car commute. Air Qual Atmos & Health 11:123–136 Gryparis A, Coull BA, Schwartz J, Suh HH (2007) Semiparametric latent variable regression models for spatiotemporal modelling of mobile source particles in the greater Boston area. Journ Royal Stat Soc Ser References C Appl Stat 56:183–209 Hart JE,LiaoX,HongB,Puett RC, YanoskyJD, SuhH, Akita Y, Chen J-C, Serre ML (2012) Moving-window BME: estimation Kioumourtzoglou MA, Spiegelman D, Laden F (2015) The associ- of PM annual averages. J Exp Sci Env Epi 22(5):496–501 ation of long-term exposure to PM on all-cause mortality in the 2.5 2.5 Nurses’ Health Study and the impact of measurement-error correc- Anderson JO, Thundiyil JG, Stolbach A (2012) Clearing the air: a review tion. Environ Health 14(1):38 of the effects of particulate matter air pollution on human health. J Med Toxicol 8:166–175 Heinrich J, Thiering E, Rzehak P, Kramer U, Hochadel M, Rauchfuss KM et al (2013) Long-term exposure to NO and PM and all-cause Arain MA, Blair R, Finkelstein N, Brook JR, Sahsuvaroglu T, Beckerman 2 10 and cause-specific mortality in a prospective cohort of women. B, Zhang L, Jerrett M (2007) The use of wind fields in a land use Occup Environ Med 70(3):179–186 regression model to predict air pollution concentrations for health Hoffmann B, Moebus S, Mohlenkamp S, Stang A, Lehmann N, Dragano exposure studies. Atmos Environ 41(16):3453–3464 N, Schmermund A, Memmesheimer M, Mann K, Erbel R, Jockel Beelen R, Raaschou-Nielsen O, Stafoggia M, Andersen ZJ, Weinmayr G, KH, for the Heinz Nixdorf Recall Study Investigative Group (2007) Hoffmann B, Wolf K, Samoli E, Fischer P, Nieuwenhuijsen M, Residential exposure to traffic is associated with coronary athero- Vineis P, Xun WW, Katsouyanni K, Dimakopoulou K, Oudin A, sclerosis. Circulation 116(5):489–496 Forsberg B, Modig L, Havulinna AS, Lanki T, Turunen A, Oftedal Jerrett M, Arain A, Kanaroglou P, Beckerman B, Potoglou D, B, Nystad W, Nafstad P, de Faire U, Pedersen NL, Östenson CG, Sahsuvaroglu T, Morrison J, Giovis C (2005) A review and evalu- Fratiglioni L, Penell J, Korek M, Pershagen G, Eriksen KT, Overvad ation of intraurban air pollution exposure models. 2005. Journ K, Ellermann T, Eeftens M, Peeters PH, Meliefste K, Wang M, Exposure Anal Environ Epi 15(2):185–204 Bueno-de-Mesquita B, Sugiri D, Krämer U, Heinrich J, de Hoogh K, Key T, Peters A, Hampel R, Concin H, Nagel G, Ineichen A, Kelly F, Anderson HR, Armstrong B, Atkinson R, Barratt B, Beevers S Schaffner E, Probst-Hensch N, Künzli N, Schindler C, Schikowski et al (2011) The impact of the congestion charging scheme on air T, Adam M, Phuleria H, Vilier A, Clavel-Chapelon F, Declercq C, quality in London. Part 1. Emissions modeling and analysis of air 754 Air Qual Atmos Health (2018) 11:741–754 pollution measurements. Research Report / Health Effects Institute Silverman BW. 1986. Density Estimation for Statistics and Data 155:5–71 Analysis. New York: Chapman and Hall:76. Equation 4.5 Legendre P. lmodel2: Model II Regression. R package version 1.7–0. Stieb DM, Chen L, Eshoul M, Judek S (2012) Ambient air pollution, birth 2011. Available at: http://CRAN.R-project.org/package=lmodel2 weight and preterm birth: a systematic review and meta-analysis. Maynard D, Coull BA, Gryparis A, Schwartz J (2007) Mortality risk Environ Res 117:100–111 associated with short-term exposure to traffic particles and sulfates. US Bureau of Transportation Statistics. 2013. National Transportation Environ Health Persp 115(5):751–755 Atlas web page. Available at: http://www.rita.dot.gov/bts/sites/rita. Medina-Ramon M, Goldberg R, Melly S, Mittleman MA, Schwartz J dot.gov.bts/files/publications/national_transportation_atlas_ (2008) Residential exposure to traffic-related air pollution and sur- database/2011/index.html [accessed November 16, 2013] vival after heart failure. Environ Health Persp 116(4):481–485 US Census Bureau. 2013. 2010 TIGER/Line Shapefiles web page. Paciorek CJ, Yanosky JD, Puett RC, Laden F, Suh HH (2009) Practical Available at: http://www.census.gov/cgi-bin/geo/shapefiles2010/ large-scale spatio-temporal modeling of particulate matter concen- main [accessed February 9, 2013] trations. Ann Appl Stat 3:370–397 US Geological Survey. 2012. National Land Cover Dataset web page. Pelucchi C, Negri E, Gallus S, Boffetta P, Tramacere I, La Vecchia C Available at: http://www.mrlc.gov/nlcd2001.php [accessed June 1, (2009) Long-term particulate matter exposure and mortality: a re- 2012] view of European epidemiological studies. BMC Public Health 9: US Geological Survey. 2013. National Elevation Dataset web page. 453 Available at: http://nationalmap.gov/elevation.html [accessed Puett RC, Hart JE, Schwartz J, Hu FB, Liese AD, Laden F (2011) Are May 19, 2013] particulate matter exposures associated with risk of type 2 diabetes? Weuve J, Puett RC, Schwartz J, Yanosky JD, Laden F, Grodstein F (2012) Environmental Health Persp 119(3):384–389 Exposure to particulate air pollution and cognitive decline in older Puett RC, Hart JE, Yanosky JD, Spiegelman D, Wang M, Fisher J et al women. Arch Intern Med 172(3):219–227 (2014) Particulate matter air pollution exposure, distance to road, Wilton D, Szpiro A, Gould T, Larson T (2010) Improving spatial concen- and lung cancer risk in the Nurses’ Health Study cohort. Environ tration estimates for nitrogen oxides using a hybrid meteorological Health Persp 122:926–932 dispersion/land use regression model in Los Angeles, CA and Rich DQ, Utell MJ, Croft DP, Thurston SW, Thevenet-Morrison K, Seattle, WA. Sci Tot Environ 408(5):1120–1130 Evans KA, Ling FS, Tian Y, Hopke PK (2018) Daily land use Wood SN (2006) Generalized additive models: an introduction with R. regression estimated woodsmoke and traffic pollution concentra- Chapman & Hall/CRC, Boca Raton tions and the triggering of ST-elevation myocardial infarction: a Yanosky JD, Paciorek CJ, Schwartz J, Laden F, Puett R, Suh HH (2008) case-crossover study. Air Qual Atmos & Health 11:239–244 Spatio-temporal modeling of chronic PM exposure for the Nurses’ Schikowski T, Sugiri D, Ranft U, Gehring U, Heinrich J, Wichmann HE, Health Study. Atmos Environ 42:4047–4062 Krämer U (2005) Long-term air pollution exposure and living close Yanosky JD, Paciorek CJ, Laden F, Hart J, Puett R, Suh HH (2014) to busy roads are associated with COPD in women. Respir Res 6: Spatio-temporal modeling of particulate air pollution in the conter- 152 minous United States using geographic and meteorological predic- Shah AS, Langrish JP, Nair H, McAllister DA, Hunter AL, Donaldson K tors. Environ Health 13(1):63 et al (2013) Global association of air pollution and heart failure: a Zhou Y, Levy JI (2007) Factors influencing the spatial extent of mobile systematic review and meta-analysis. Lancet 382(9897):1039–1048 source air pollution impacts: a meta-analysis. BMC Pub Health 7:89 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Air Quality, Atmosphere & Health Springer Journals

Application and validation of a line-source dispersion model to estimate small scale traffic-related particulate matter concentrations across the conterminous US

Free
14 pages
Loading next page...
 
/lp/springer_journal/application-and-validation-of-a-line-source-dispersion-model-to-5SPR60JJsT
Publisher
Springer Netherlands
Copyright
Copyright © 2018 by The Author(s)
Subject
Environment; Environmental Health; Atmospheric Protection/Air Quality Control/Air Pollution; Health Promotion and Disease Prevention
ISSN
1873-9318
eISSN
1873-9326
D.O.I.
10.1007/s11869-018-0580-6
Publisher site
See Article on Publisher Site

Abstract

Numerous studies document increased health risks from exposure to traffic and traffic-related particulate matter (PM). However, many studies use simple exposure metrics to represent traffic-related PM, and/or are limited to small geographic areas over relatively short (e.g., 1 year) time periods. We developed a modeling approach for the conterminous US from 1999 to 2011 that applies a line-source Gaussian plume dispersion model using several spatially and/or temporally varying inputs (including daily meteorology) to produce high spatial resolution estimates of primary near-road traffic-related PM levels. We compared two methods of spatially averaging traffic counts: spatial smoothing generalized additive models and kernel density. Also, we evaluated and validated the output from the line-source dispersion modeling approach in a spatio-temporal model of 24-h average PM < 2.5 μm(PM ) elemental carbon (EC) levels. We found that spatial smoothing of traffic count point data performed better 2.5 than a kernel density approach. Predictive accuracy of the spatio-temporal model of PM EC levels was moderate for 24-h 2.5 2 2 averages (cross-validation (CV) R = 0.532) and higher for longer averaging times (CV R = 0.707 and 0.795 for monthly and annual averages, respectively). PM EC levels increased monotonically with line-source dispersion model output. Predictive 2.5 accuracy was higher when the spatio-temporal model of PM EC included line-source dispersion model output compared to 2.5 distance to road terms. Our approach provides estimates of primary traffic-related PM levels with high spatial resolution across the conterminous US from 1999 to 2011. Spatio-temporal model predictions describe 24-h average PM EC levels at unmea- 2.5 sured locations well, especially over longer averaging times. . . . . Keywords Air pollution Spatial smoothing Highway proximity Traffic counts Dispersion models Electronic supplementary material The online version of this article (https://doi.org/10.1007/s11869-018-0580-6) contains supplementary material, which is available to authorized users. * Jeff D. Yanosky Robin C. Puett jyanosky@phs.psu.edu rpuett@umd.edu Jared Fisher Department of Public Health Sciences, The Pennsylvania State jafisher@umd.edu University College of Medicine, Hershey, PA, USA Department of Epidemiology and Biostatistics, University of Duanping Liao Maryland School of Public Health, College Park, MD, USA dliao@phs.psu.edu Department of Architectural Engineering, The Pennsylvania State Donghyun Rim University College of Engineering, State College, PA, USA drim@engr.psu.edu John and Willie Leone Family Department of Energy and Mineral Engineering & The EMS Energy Institute, The Pennsylvania State Randy Vander Wal University College of Earth and Mineral Sciences, State College, PA, ruv12@psu.edu USA William Groves Maryland Institute of Applied Environmental Health, University of wag10@psu.edu Maryland School of Public Health, College Park, MD, USA 742 Air Qual Atmos Health (2018) 11:741–754 Background Gaussian plume dispersion modeling approach that estimates daily 24-h average traffic-related PM concentrations near Recent epidemiologic studies have documented increased roadways which can be applied at any location in the conter- health risks from exposure to atmospheric particulate matter minous US from 1999 to 2011. Also, we developed and val- (PM) (Anderson et al. 2012; Beelen et al. 2014;Pelucchi etal. idated a spatio-temporal model of PM elemental carbon 2.5 2009;Shah etal. 2013;Hartet al. 2015; Heinrich et al. 2013; (EC) using 24-h average measurements of this component of Weuve et al. 2012; Stieb et al. 2012;Rich etal. 2018; Golan PM ,PM EC, together with several meteorological and 2.5 2.5 et al. 2018) for a variety of health outcomes, including non- GIS-based spatial covariates (including traffic-related PM accidental mortality, cardiovascular disease mortality, lung levels from our line-source dispersion modeling approach). cancer, neurocognitive functioning, and effects on reproduc- These models will be used to provide estimates of exposure tion. Studies using proxy measures for exposure to traffic- in epidemiologic analyses. We chose to present results for related air pollution have generally reported health effects PM EC because it has both traffic-related local sources 2.5 larger than those for the mass concentration of PM < 2.5 μm and larger-scale regional gradients (Brochu et al. 2011)and (PM )orPM <10 μm(PM ) in aerodynamic diameter also had high predictive accuracy (spatial cross-validation 2.5 10 (Brugge et al. 2013; Gehring et al. 2006; Hoffmann et al. (CV) R = 0.784 (described later)); PM components Zn and 2007; Puett et al. 2011; Puett et al. 2014;Schikowski et al. Cu were also evaluated but had lower predictive accuracy 2005). Several of these studies used simple exposure metrics (spatial CV R = 0.691 and 0.501, respectively). We chose for traffic-related air pollution based on distance to nearest not to use NO /NOx as our exposure metric due to our interest road, or summaries of road geography within buffers of in PM health effects and also to the complexity of NO-NO -O 2 3 fixed radii (Eckel et al. 2011;Medina-Ramonetal. 2008). chemistry in areas very near roadways, though we may address More complex approaches have been applied within small this in future work. Our spatio-temporal model predicts 24-h geographic areas, typically to one urban area or municipal- average PM EC levels across the conterminous US with high 2.5 ity (Gryparis et al. 2007; Maynard et al. 2007). One such spatial resolution from 1999 to 2011, and thus can provide approach accounted for wind direction using a geographic acute (24-h), sub-acute, or chronic exposure estimates that ac- information system (GIS) to calculate the traffic density in count for traffic intensity, prevailing wind direction, and dis- locations upwind of receptors (Arain et al. 2007). Traffic- persion characteristics based on local meteorology. related air pollutant levels can vary on small (i.e., micro (0– 100 m), middle (100–500 m), and neighborhood (500– 4 km)) spatial scales immediately near roadways (Jerrett Methods et al. 2005; Zhou and Levy 2007). Though the above com- plex approaches likely described local gradients in traffic- To approximate micro-, middle-, and neighborhood-scale pri- related air pollutants better than simpler methods such as mary traffic-related PM levels, we applied a line-source distance to road terms, few direct, robust, and long-term Gaussian plume dispersion model, Atmospheric Dispersion multi-city comparisons are available. Uncertainty remains Modeling System (ADMS)-Roads v2.3 software (CERC, about whether and to what extent noise may be a confound- Cambridge, UK), to a hypothetical grid (hereafter referred to er in analyses of health effects of traffic-related PM expo- as a kernel) with a 2-m cell size under varying meteorological sure, which underscores the need for accurate and highly regimes. We calculated traffic-related PM levels at each grid spatially resolved estimation of traffic-related PM levels so point in the kernel for each combination of ADMS-Roads these effects can be better disentangled in future epidemio- inputs, and then mapped receptor locations onto the appropri- logic analyses. ate kernel, with rotation for wind direction if appropriate, to Gaussian plume dispersion models have been widely used estimate traffic-related PM levels. We chose ADMS-Roads as to model point sources (Jerrett et al. 2005) and have been opposed to other line-source models such as CALRoads-View adapted to area and line sources (Benson 1992). Wilton et al. because of ADMS-Roads’ more advanced treatment of atmo- (2010) used such a model, CALRoads-View, to describe small spheric stability as characterized by continuous Monin- scale gradients in nitrogen oxide (NO ) and nitrogen dioxide Obhukov length (L ) and planetary boundary layer height x MO (NO )levelsfor a2-weekperiodin2006ingreater Los (inputs which are directly available from the Modern Era Angeles and for a 2-week period in 2005 in Seattle. They Retrospective-analysis for Research and Applications found that including the output from CALRoads-View im- (MERRA; http://disc.sci.gsfc.nasa.gov/daac-bin/FTPSubset. proved the prediction capability of city-specific spatio-tempo- pl; see below), resulting in a continuous and non- ral models of NOx and NO levels compared to traditional overlapping metric of atmospheric stability, as opposed to land-use regression models. conventional stability classes. Two hypothetical 5-m road seg- The objective of the present study was to describe the de- ments, facing each other to eliminate vehicle direction effects, velopment of and evaluate the performance of a line-source were placed at the origin of each kernel. The domain of each Air Qual Atmos Health (2018) 11:741–754 743 kernel was 1 km wide by 2.05 km long (2 km downwind of daily models were fit separately for each of seven geo- source and 0.05 km upwind). The ADMS-Roads modeling graphic regions (with an overlap of 400 km with adjacent approach requires several spatially and/or temporally varying regions) of the conterminous US (Supplemental Material inputs in order to produce estimates of near-road traffic-related Fig. S3), yielding seven spatial smoothing GAMs per day PM concentrations (referred to as ADMS-Roads output): (1) to be used for space-time prediction of each daily meteoro- traffic intensity for each of four road classes (A1, A2, A3, and logical parameter at any location in the domain of the con- A4/A6), (2) daily prevailing wind direction, and (3) several terminous US. The distribution of each of these meteoro- parameters which affect pollutant dispersion: wind speed, sur- logical parameters was divided into several categories, face roughness, sensible heat flux, planetary boundary layer listed in Table 1, with details of this classification provided height, road width, vehicle speed, and absolute value of min- in the Supplemental Material. imum L . An example kernel is shown in Supplemental Location-specific surface roughness and absolute value MO Material Fig. S1. The derivation and estimation of these spa- of minimum L values were derived, in accordance with MO tially varying or spatio-temporally varying input parameters is guidance in the ADMS-Roads software, based on categori- described in the following sections. Later we discuss the eval- zation of urban land use data from the 2001 US Geological uation and validation of ADMS-Roads output using spatio- Survey (USGS) National Land Cover Dataset (USGS 2012; temporal models of 24-h average PM elemental carbon Table 1).Theoriginal30-mcellsizedataweresummarized 2.5 (EC) levels. in ArcGIS v.10 (Environmental Systems Research Institute, Redlands, CA) using a moving window with a 1-km radius Availability of data and material Restrictions apply to the to calculate the percentage of low, medium, and high- availability of the data used and analyzed in this study, some intensity developed land use. Categories of these values of which were used under license for the current study and so are shown in Table 1. are not publicly available. Because we found it had no substantive effect on ADMS- Roads output in the range applicable over the conterminous US, air temperature was assumed to equal 15° C for all ADMS-Roads output. Meteorological inputs to ADMS-Roads Traffic intensity inputs to ADMS-Roads Meteorological data on wind speed and direction (as u- and v- vector wind components at ~10 m above ground), sensi- We obtained data on annual-average daily traffic counts (here- ble heat flux, and planetary boundary layer height were after referred to as traffic counts) from Geographic Data obtained from the MERRA project. These data are available Technology, Inc. (Lebanon, NH) Dynamap Traffic Counts hourly on a grid across the continental US with an approx- v4.2. We first spatially joined the traffic count points (those imately 55-km (depending on latitude) cell size; the native that reported in or after 1990) to the ESRI StreetMap Pro 2007 grid is 0.5° latitude/longitude. Hourly gridded data were network of road segments to obtain the US Census Feature assigned local time and date based on the time zone of each Class Code (US Census Bureau 2013) road class (hereafter grid point’s location and then averaged by day (for wind referred to as road class) of each road segment. Details regard- direction, we averaged u- and v- vector components sepa- ing the processing of these data can be found in the rately, then calculated the direction of the resulting vector) Supplemental Material. From the processed traffic intensity from 1999 to 2011. To allow spatial prediction between the data, we created four spatial smoothing GAMs: one for each 55-km grid points, the daily averages of each parameter road class (A1, A2, A3, and A4/A6; results shown in were used to create spatial smoothing generalized additive Supplemental Material Fig. S4 panels A1b, A2b, A3b, and models (GAMs) of the form: A4b, respectively). These spatial smoothing GAMs were of the same form as in Eq. 1 and were validated by calculating y ¼ α þ gsðÞþ ε ; ε ∼N 0; σ ð1Þ t i it it t it the squared Pearson correlation between predicted road-class- specific traffic intensities and those measured in an external where y is the value of the meteorological parameter at validation data set from the US Bureau of Transportation it location i =1…I on day t =1…T, α is an intercept Statistics (BTS) 2011 National Transportation Atlas representing the adjusted mean on a given day, g(s)isa Database Automated Traffic Recorders (ATR; US BTS two-dimensional penalized thin-plate smoothing spline, 2013) and Weigh in Motion (WIM) (US BTS 2013)data sets. with basis dimension k = I ∗ 0.9 chosen to allow much flex- Details regarding the development of the spatial smoothing ibility in the function. To reduce the potential for over- GAMs of traffic intensity and the processing of the external fitting, a multiplier of 1.4 (using the gamma argument to traffic count validation data can be found in the Supplemental gam()), as recommended by Wood (2006), was used. These Material. For comparison purposes, we also calculated the 744 Air Qual Atmos Health (2018) 11:741–754 Table 1 Categories of the meteorological, traffic intensity, and other input parameters to ADMS-Roads to produce the kernels Category (midpoint Group/parameter [minimum, maximum)) Meteorological inputs to ADMS-Roads Traffic intensity inputs to ADMS-Roads Wind speed Sensible heat Planetary boundary Urbanized land Surface Absolute value Average hourly Average hourly Average hourly Average hourly −1 −2 (m s ) flux (W m ) layer height (m) use within 1 km roughness of minimum traffic count for traffic count for traffic count for traffic count for a b b,c d d d d (%) (m) L (m) A1 roads A2 roads A3 roads A4 roads MO 1 0.75 [0, 1.1) − 35 (≤ 10) 70 (< 140) [0–5) 0.2 10 175 [0, 350) 75 [0, 150) 38 [0, 75) 15 [0, 30) 2 1.50 [1.1, 1.9) 15 [− 10, 40) 210 [140, 280) [5–15) 0.3 10 525 [350, 700) 225 [150, 300) 113 [75, 150) 45 [30, 60) 3 2.25 [1.9, 2.6) 65 [40, 90) 350 [280, 420) [15–30) 0.4 10 875 [700, 1050) 375 [300, 450) 188 [150, 225) 75 [60, 90) 4 3.00 [2.6, 3.4) 115 [90, 140) 490 [420, 560) [30–55] 0.5 30 1225 [1050,1400) 525 [450,600) 263 [225, 300) 105 [90, 120) 5 3.75 [3.4, 4.1) 165 [140, 190) 630 [560, 700) [55–90) 1 30 1575 [1400, 1750) 675 [600, 750) 338 [300, 375) 135 [120, 150) 6 4.50 [4.1, 4.9) 215 [190, 240) 770 [700, 840) [90–100] 1 100 1925 [1750, 2100) 825 [750, 900) 413 [375, 450) 165 [150, 180) 7 5.25 [4.9, 5.6) 265 [240, 290) 910 [840, 980) 2275 [2100, 2450) 975 [900, 1050) 488 [450, 525) 195 [180, 210) 8 6.00 [5.6, 6.4) 315 (≥ 290) 1050 [980, 1120) 2625 [2450, 2800) 1125 [1050, 1200) 563 [525, 600) 225 [210, 240) 9 6.75 [6.4, 7.1) 1190 [1120, 1260) 2975 [2800, 3150) 1275 [1200, 1350) 638 [600, 675) 255 [240, 270) 10 7.50 [7.1, 7.9) 1330 [1260, 1400) 3325 [3150, 3500) 1425 [1350, 1500) 713 [675, 750) 285 [270, 300) 11 8.25 [7.9, 8.6) 1470 [1400, 1540) 3675 [3500, 3850) 1575 [1500, 1650) 788 [750, 825) 315 (≥ 300) 12 9.00 [8.6, 9.4) 1610 [1540, 1680) 4025 [3850, 4200) 1725 [1650, 1800) 863 (≥ 825) 13 9.75 [9.4, 10.1) 1750 [1680, 1820) 4375 [4200, 4550) 1875 [1800, 1950) 14 10.50 [10.1, 10.9) 1890 (≥ 1820) 4725 [4550, 4900) 2025 [1950, 2100) 15 11.25 [10.9, 11.6) 5075 [4900, 5250) 2175 (≥ 2100) 16 12.00 [11.6, 12.4) 5425 [5250, 5600) 17 12.75 [12.4, 13.1) 5775 (≥ 5600) 18 13.50 [13.1, 13.9) 19 14.25 (≥ 13.9) From the 2001 US Geological Survey National Land Cover Dataset; defined as the percentage of developed low, medium, and high intensity land use within 1km Values assigned based on categories in the ADMS-Roads software and its documentation using urbanized land use category L refers to the Monin-Obhukov length MO d −1 From spatial smoothing GAMs, transformed into units of vehicles hour . Road widths were assumed to vary based on road class, with values of 22, 14.7, 11, and 7.3 m for road classes A1, A2, A3, and −1 both A4 and A6, respectively. Also, vehicle speeds of 90, 70, 55, and 40 km h , respectively, were assumed Air Qual Atmos Health (2018) 11:741–754 745 kernel density of the traffic count data using a neighborhood GAMs, the ADMS-Roads output concentrations were inter- of 100 km and the ArcGIS BKernel Density^ tool (Silverman polated from the appropriate kernel, for both Bbest-estimate^ 1986). We compared these values to the combined ATR and levels which depended on prevailing wind direction and for WIM data by road class using linear regression models. Bworst-case downwind^ levels ignoring wind direction. A de- Because detailed geographic data on road widths and vehi- tailed description of the steps is provided in the Supplemental cle speeds across the conterminous US were not available, Material. representative road width and vehicle speed values were as- sumed to be constant based on road class (Table 1). Other meteorological and GIS-based spatial To provide road source location data, StreetMap Pro 2007 covariates street line segments were converted to a series of points spaced 10 m apart using the Xtools Pro tool BConvert Data on meteorological parameters air temperature, total pre- Features to Points^ (Data East, Russia; examples in Fig. 1 cipitation, and total snowfall were obtained from MERRA and and Supplemental Material Fig. S2). The emissions factors processed as described above. Data on daily total rainfall were we used in ADMS-Roads can be found in the Supplemental derived by subtraction of total snowfall flux from total precip- Material. We note that our use of a smooth regression spline in itation flux for each day. Though these parameters were not the spatio-temporal model of 24-h average PM EC (hereaf- used as inputs to ADMS-Roads, daily spatially smoothed me- 2.5 ter referred to as Bspatio-temporal PM EC model^)de- teorological values were evaluated as spatio-temporal covari- 2.5 scribed below provides an effective scaling of this emission ates in spatio-temporal models of 24-h average PM EC. 2.5 factor to measured PM EC levels. Thus, our modeling pro- Details on other spatial covariates including county-level pop- 2.5 cess eliminates discrepancies between European and US ve- ulation density, elevation (USGS 2013), and distance to road hicle fleets and between assumed and actual emission rates. covariates can be found in the Supplemental Material. Obtaining ADMS-Roads output Validation of ADMS-Roads output using spatio-temporal models of 24-h average PM EC 2.5 A separate kernel containing ADMS-Roads output was gen- erated for each combination of meteorological and traffic in- Statistical models tensity categories (Table 1; details in Supporting Material). Briefly, to obtain ADMS-Roads output, we first selected the Spatio-temporal generalized additive mixed models kernel that best matched the dispersion and traffic intensity (GAMMs) of 24-h average PM EC levels were evaluated 2.5 conditions, then identified road source points within 2 km of and a final prediction model developed with the joint aims of each prediction location. Together with meteorological and (1) validating and calibrating the ADMS-Roads output to a traffic intensity inputs predicted from spatial smoothing traffic-related PM component, and (2) predicting 24-h 2.5 Fig. 1 Roadways displayed by US Census feature class code and 10-m street points nearby to a PM speciation monitor near 2.5 Linden, New Jersey 746 Air Qual Atmos Health (2018) 11:741–754 average PM EC levels at unmeasured locations anywhere in spatial terms assumed residual spatial variation was station- 2.5 the conterminous US. There were 219,300 24-h site-average arity and isotropic across the domain of the conterminous US. PM EC measurements from 1999 to 2011 at 361 unique The first-stage model equation was: 2.5 monitoring locations. These measurements were available y ¼ u þ α þ ∑ f Z þ gðÞ s þ e ; i t i;t;p i i;t i;t p p t from the Interagency Monitoring of Protected Visual ð4Þ e ∼N 0; σ i;t et Environments (IMPROVE) network (parameter code 88321; geographic coordinates for two sites were modified, see and was fit iteratively in a back-fitting arrangement with u + Supplemental Material for details) and the US Environmental α + ∑ f (Z ) estimated jointly and g (s ) estimated sepa- t p p i, t, p t i Protection Agency’s Chemical Speciation Network (parameter −1 rately by day (on days where sufficient (> 12 values day ) code 88380) on every-third day after August 2001, and data were available). Variability in the measured concentra- alternating every-third and fourth day from January 1, 1999, tions is parsed between the covariates and the residual spatial to that date. 89.7% of available measurements were from the terms. For the spatial models in the first stage, a multiplier of IMPROVE network. PM EC measurements were approxi- 2.5 1.4 (using the gamma argument to gam()) was used to avoid mately log-normally distributed, and had geometric mean and over-fitting (Wood 2006). All individual fits within the back- −3 geometric standard deviation of 154 ng m and 10, respec- fitting were preformed using the gam()function in the mgcv tively (PM EC measurements ranged from 9.4 to 2.5 package (Wood 2006)ofR(R DevelopmentCoreTeam −3 10,398.8 ng m ). A map showing the locations of the 2009). monitors in the conterminous US is shown in Supplemental In the second stage, we fit a spatial model to the ^ u terms Fig. S3. obtained from the first-stage model. This model included The spatio-temporal GAMM of 24-h average PM EC 2.5 terms for GIS-based time-invariant spatial covariates and re- levels was similar to that presented in Yanosky et al. (2014), sidual time-invariant spatial variability. The second stage except there monthly averages were used; the model had the (Eq. 5) was: following generic form: ^ ^ u ¼ α þ ∑ d X þ gsðÞþ b ; i q i;q i i ð5Þ y ¼ α þ α þ ∑ d X þ ∑ f Z þ gðÞ s þgsðÞþ b þ e ; t q i;q i;t;p i i i i;t i;t q p p t b ∼N 0; σ i b 2 2 b ∼N 0; σ ; e ∼N 0; σ i b i;t et where ^ u is an estimated site-specific intercept that represents ð3Þ the adjusted long-term mean at each location (n = 361) and the where y is the natural-log transformed 24-h average PM i, t 2.5 other terms are as above. The second stage was also fit using −3 EC (in ln(ng m )) for i =1…I sites (I =361) and t =1…T 24- the gam()function in the mgcv package of R. htimeperiods (T =1555), and s is the projected spatial coor- dinate pair for the ith location. X are GIS-based time-invari- i, q Model predictions ant spatial covariates for q =1…Q, Z are spatio-temporal i, t, p covariates for p =1…P,and α is a day-specific intercept that We obtained model predictions by generating the covariates represents the adjusted mean across all sites on a given day. at locations of interest for each, and then transforming to the d are one-dimensional penalized spline smooth functions of Q native scale by exponentiation. To avoid extrapolation, co- GIS-based time-invariant spatial covariates (basis dimension variates at prediction locations beyond their range among of seven for each); f are one-dimensional penalized spline the monitoring locations were set to the appropriate mini- smooth functions of P spatio-temporal meteorological or other mum or maximum among the monitoring locations across covariates (basis dimension of five for each). g (s)accounts t i 1999–2011. for residual spatial variability in the 24-h average values, and We generated estimates of uncertainty in model predictions g(s ) accounts for time-invariant spatial variability across the (i.e., standard errors) on the natural-log scale using methods conterminous US, with both terms specified as spatial bivari- described previously (Paciorek et al. 2009; Yanosky et al. ate thin-plate penalized splines with basis dimension values: 2008). 95% prediction intervals on the natural-log scale were k = I ∗ 0.9 and k =(I − Q) ∗ 0.9, respectively. The site- t t calculated and exponentiated to assess prediction interval specific random effect b represents unexplained site-specific coverage. variability, thus our characterization of the model as a GAMM. As in previous work, we used a two-stage modeling ap- Model validation proach to fit the above model (Eq. 3)(Yanosky et al. 2014). In the first stage (Eq. 4), we estimated site-specific intercepts (u ) 10-fold out-of-sample CV was used to evaluate model pre- adjusting for spatio-temporal covariates and residual spatial dictive accuracy and thereby inform covariate selection. variability in the 24-h average values (N = 219,300). The Monitoring sites were selected at random and assigned Air Qual Atmos Health (2018) 11:741–754 747 exclusively to one of 10 sets. Since the covariate selection removed. We also evaluated whether the direction of the process involved fitting multiple candidate models to the smooth functions met a priori expectations; however, no co- same data, set 10 was reserved (i.e., not used for model variates were removed for this reason. fitting) to assess whether the covariate selection process contributed to over-fitting (Draper and Krnjacic 2006). Data from sets one to nine were removed from the data set Results sequentially, with the model fit to the remaining data. Model predictions were then generated and compared to Sensitivity of ADMS-roads output to changes in input the left-out observations. parameters The predictive accuracy of the spatio-temporal PM EC 2.5 model levels was determined from the squared Pearson Wind direction, which determines whether the receptor point correlation between the left-out observations and model is downwind of the source, was found to be the most influen- predictions (CV R ), with both back-transformed to the native tial ADMS-Roads input parameter. The next most influential rather than the natural-log scale. Weekly, monthly, and annual input parameters, evaluated at an arbitrary point 100-m down- averages were calculated after averaging left-out observations wind of the source, were, in order of decreasing influence: and model predictions to the corresponding averaging time. wind speed, surface roughness, sensible heat flux, planetary Spatial CV R values were calculated similarly using the boundary layer height, absolute value of minimum L ,and MO means of the 24-h average values across the entire time period air temperature. These findings were consistent with guidance from 1999 to 2011. Prediction errors were calculated by in the ADMS-Roads User Guide. subtracting left-out observations from the model predictions. Bias in model predictions was determined using the normal- Results of smoothing traffic intensity ized mean bias factor (NMBF) (Shaocai Yu, personal communication) and the slope from major-axis linear The spatial smoothing GAMs of traffic intensity fit the regression (Legendre 2011) of the left-out observations Dynamap traffic count data reasonably well, as evidenced by against the model predictions (both on the natural-log scale). model R values of 0.834, 0.638, 0.533, and 0.590 for A1–A4 The precision of model predictions was obtained by taking the roads, respectively (Table 2). mean of the absolute value of the prediction errors (CVMAE) Spatial smoothing GAMs showed generally similar spatial and using the normalized mean error factor (NMEF) (Shaocai trends across road classes, each showing increased traffic in- Yu, personal communication). Formulas for the NMBF and tensity in major urban areas. However, slightly different spa- NMEF are provided in the Supplemental Material. Bias and tial patterns were evident (Supplemental Material Fig. S4), precision values from CV were evaluated overall, and by US Census Division, tertiles of urban land use, season, monitoring Table 2 Results of the external validation of the spatial prediction network, and monitoring objective. Two thousand seventy- models of Dynamap traffic counts using ATR and WIM station traffic three extreme values of PM EC above the 99th percentile 2.5 count data from the US DOT Bureau of Transportation Statistics’ −3 of 1850 ng m were excluded as outliers prior to calculating A National Transportation Atlas Database 2011 CV statistics. b 2c Road class n Validation R Model development and covariate selection Spatial smoothing 100-km kernel GAM density To determine the best parsimonious prediction model, we fit A1 1712 0.524 0.316 spatio-temporal models of 24-h average PM EC levels with 2.5 A2 1694 0.492 0.313 alternate versions of ADMS-Roads output (A1, A2, A3 and A3 1368 0.305 0.189 A4/A6 vs. A1–A3 only; Bbest-estimate^ vs. Bworst-case downwind^), with and without various meteorological param- ATR refers to the BAutomatic Traffic Recorders^ data; WIM refers to the eters, and without ADMS-Roads output but instead with GIS- BWeigh-in-Motion^ data; DOT is Department of Transportation based spatial covariates for distance to nearest road by road Data for A4 class roads was not available due to the limited number (9) class. The initial set of meteorological covariates was based on and spatial coverage of stations on these roads our earlier work (Yanosky et al. 2014). We compared the Obtained from linear regression of predicted traffic counts using spatial CV R among these models, and also among those with Dynamap data and either the spatial smoothing GAM approach or the 100-km kernel density approach versus combined ATR and WIM traffic and without GIS-based spatial covariates for county-level counts in the external validation data set. Data from ATR or WIM stations population density, urban land use, and elevation in the with traffic count values beyond the range of the Dynamap traffic count second-stage model. Covariates which were not statistically data for each road class were removed because they were likely not significant (p > 0.05) using the Wald tests (Wood 2006)were spatially matched to the correct road class 748 Air Qual Atmos Health (2018) 11:741–754 particularly with regard to the extent of influence of major road^ monitors (those with an A1, A2, or A3 road within cities in surrounding suburban areas. Additional details re- 300 m, which represented about 47% of all monitors), these garding spatial trends in traffic intensity and performance of differences were more evident (CV R of annual averages of the two spatial averaging methods can be found in the 0.762 for ADMS-Roads output summed across A1–A3 roads Supplemental Material. vs. 0.689 for ADMS-Roads output summed across A1, A2, A3, and A4/A6 vs. 0.663 for distance to road covariates). Validation of ADMS-Roads traffic intensity input data Results also showed that the spatio-temporal model that included Bbest-estimate^ ADMS-Roads output performed on- Validation R values from linear regression of the combined ly slightly better than one with Bworst-case downwind^ ATR and WIM traffic counts using the spatial smoothing ADMS-Roads output when comparing long-term averages (spatial CV R = 0. 784 vs. 0.775, respectively). In contrast, GAM predictions of the Dynamap traffic count data were 0.524, 0.492, and 0.305 for road classes A1–A3, respectively on the 24-h average level, the model with Bworst-case downwind^ ADMS-Roads output performed well and even (Table 2). We also regressed results from the kernel density within 100-km (displayed in Supplemental Material Fig. S4 slightly better than the Bbest-estimate^ ADMS-Roads output panels A1c, A2c, A3c, and A4c) approach against the com- (24-h average CV R = 0.539 for Bworst-case downwind^ vs. bined ATR and WIM traffic count data; validation R values 0. 532 for Bbest-estimate^). Despite the influence of daily were lower as compared to the spatial smoothing GAM ap- prevailing wind direction on ADMS-Roads output, these proach at 0.316, 0.313, and 0.189, for road classes A1–A3, models were quite comparable with respect to predictive ac- respectively (Table 2). In all six regression models discussed curacy of PM EC levels, suggesting that our 24-h prevailing 2.5 above, each of the predicted traffic intensity terms was highly wind direction data from MERRA is only crudely capturing statistically significant (p < 0.001) when regressed against the the influence of wind direction; modeling hourly wind direc- tion may provide improvement but is currently impractical at combined ATR and WIM traffic count data for the corre- sponding road class. the national scale. Based on the results above, the final prediction model used Validation of ADMS-Roads output using Bbest-estimate^ ADMS-Roads output from only road classes A1–A3. However, the model with Bworst-case downwind^ spatio-temporal models of 24-h average PM EC 2.5 ADMS-Roads output would provide an acceptable alternative to the Bbest-estimate^ model, especially for analyses focused Predictive performance of spatio-temporal models of 24-h average PM EC on shorter averaging time periods. Also, we note that excluding 2.5 A4 roads results in a large gain in computational efficiency. Results from our model comparisons showed that the spatio- Though predictive accuracy was moderate for PM EC at 2.5 temporal model that included ADMS-Roads output summed the 24-h average level (Table 3; 24-h average CV R =0.532), it improved substantially for longer averaging times (CV R = across only A1–A3 roads exhibited slightly higher predictive accuracy than one that included ADMS-Roads output 0.613, 0.679, 0.707, 0.795, and 0.784 for weekly, two-week, and monthly, annual-average, and spatial averages, respective- summed across A1, A2, A3, and A4/A6 roads (Table 3; spatial CV R = 0.784 vs. 0.753, respectively). This model ly. Predictive accuracy was generally consistent across US Census Divisions (Supplemental Material Table S1). It was also performed better than the simpler model that used distance to nearest road covariates for A1–A3 roads also reasonably consistent across tertiles of urban land use, across the four seasons, and across monitoring networks. (Table 3;spatial CV R = 0.753). Among only Bnear-major- Table 3 Cross-validation statistics for the spatio-temporal prediction model (top row) and two alternate models (lower two rows) of 24-h average PM EC levels in the conterminous US from 1999 to 2011 2.5 Spatio-temporal model 24-h average values CV R of Spatial a 2f covariates annual CV R 2 d d e f n Model CV R Intercept Slope NMBF CVMAE NMEF averages 2B C E E R (%) (%) ADMS-Roads output (A1–A3) 201,425 0.679 0.532 1.4 0.74 − 4.8 112 47.1 0.795 0.784 ADMS-Roads output 201,425 0.682 0.495 1.3 0.75 − 3.7 115 47.9 0.747 0.753 (A1–A4 and A6) No ADMS-Roads output 201,425 0.683 0.472 1.3 0.76 − 4.0 117 48.7 0.723 0.753 but withdistancetoroad (A1–A3) terms Air Qual Atmos Health (2018) 11:741–754 749 Fig. 2 Plot of the smooth function of ADMS-Roads output (summed across road classes A1– A3) from the spatio-temporal PM2.5 EC model levels. Data on both axes have been transformed from the natural log to the native scale. The x-axis shows the distribution of ADMS-Roads output (summed across road classes A1–A3), at the set of monitors used for model fitting, in red Across monitoring objectives, predictive accuracy was slight- total rainfall. Plots of each of these smooth functions are ly lower at Bsource-related^ sites (categories defined by the shown in the Supplemental Material Fig. S5. Also, details US EPA; Bsource-related^ refers to a monitor located nearby regarding the df used by the spatial terms in the spatio- to a large point source; CV R of annual averages = 0.398). temporal PM EC model can be found in the Supplemental 2.5 95% prediction interval coverage was 0.969, as compared to a Material. nominal value of 0.95, indicating that model standard errors though reasonably well scaled, may have been slightly Spatial patterns in spatio-temporal PM EC 2.5 underestimated. model-predicted levels Smooth function of ADMS-Roads output Figure 3 shows predicted 24-h average PM EC levels for 2.5 in the spatio-temporal PM EC model 2 days in 2008 and averaged across that year. Predicted PM 2.5 2.5 EC levels are clearly increased near roadways in each plot. In Fig. 2, the smooth function for ADMS-Roads output Figure 4 shows predicted 24-h average PM EC levels on 2.5 summed across A1–A3 roads (shown on the x-axis) increases a 6-km grid across the conterminous US for August 4 and 10, monotonically and nearly linearly from zero to moderate 2008, and averaged across 2008. Additional details regarding −3 levels around 500 ng m , and then begins to flatten. Once these figures and the statistical assumptions of our spatio- −3 the ADMS-Roads output exceeds 1000 ng m ,PM EC temporal PM EC model can be found in the Supplemental 2.5 2.5 levels again rise, but with a shallower slope. Material. Smooth functions of other covariates in the spatio-temporal PM EC model Discussion 2.5 PM EC levels increased linearly with county-level popula- Comparing the two methods of spatial averaging traffic 2.5 tion density (1.0° of freedom (df)). For urban land use, PM counts, we found that spatial smoothing performed better than 2.5 EC levels increased monotonically but not linearly (4.0 df), the kernel density approach in an external validation using increasing steadily from 0 to about 40% urban land use, less automated traffic recorder data. This is likely due to the dif- steeply from 40 to about 80%, and more steeply from about 80 ferent assumptions regarding sparsity of data in space and the to 100%. For elevation, PM EC levels decreased nearly manner in which each are affected by extreme values. 2.5 linearly from 0 to about 2000 m, and less steeply beyond. For the spatio-temporal PM EC model, the model R of 2.5 PM EC levels decreased nearly linearly with increasing 0.679 indicates the model was able to explain much of the 2.5 wind speed, increased in an S-shaped pattern with increasing variability in measured levels. Local influences that may not air temperature, and decreased non-linearly with increasing be fully described by the spatio-temporal model (i.e., those not 750 Air Qual Atmos Health (2018) 11:741–754 −3 Fig. 3 a Predicted 24-h average PM EC levels (ng m ), in the same average levels for 2008. 5th to 95th percentiles are shown for each. The 2.5 domain as Fig. 1, for August 4, 2008. b Same as A but showing 24-h prevailing wind direction from smoothed MERRA data is shown with average levels for August 10, 2008. c Same as A but showing annual- black arrows in a and b described by model covariates for traffic-related PM levels, those with distance to road terms instead of ADMS-Roads urbanization, elevation, and local meteorology) include local model output and even among only Bnear-major-road^ moni- wildland fires, residential wood burning, and industrial tors, suggests that micro-scale spatial gradients are a relatively sources, all sources of EC. The spatio-temporal PM EC small proportion of PM EC mass (albeit an arguably impor- 2.5 2.5 model does well in representing spatial trends, and especially tant one for health effects analyses). Also, spatial error induced so for averaging times longer than 24-h (CV R for monthly by smoothing sparse traffic count data or imprecision in other and annual averages = 0.707 and 0.795, respectively; spatial ADMS-Roads model inputs may contribute to similarity 2 2 CV R =0.784). among CV R values. Specifically, including the effect of wind The smooth function for ADMS-Roads output (summed direction from the 10-m road source points to the receptor across road classes A1–A3) showed a non-linear but mono- locations improved prediction accuracy as assessed by the tonically increasing relationship between that covariate and spatial CV R only slightly as compared to the Bworst-case 24-h average PM EC levels, highlighting the need to use downwind^ ADMS-Roads output, suggesting that MERRA 2.5 penalized spline smoothing. By including PM EC monitor- 24-h prevailing wind direction data (one value per day) are 2.5 ing data from all sites in the conterminous US in the same of only limited use in describing primary traffic-related PM spatio-temporal model, we capture the largest possible range levels. Though it would represent a substantial increase in in the ADMS-Roads output covariate. computational demand, hourly wind data, which would reflect The similarity in CV results from the spatio-temporal changes in wind direction throughout the day, would likely be models with alternative representations of traffic, including more realistic and may further increase predictive accuracy. Air Qual Atmos Health (2018) 11:741–754 751 −3 Fig. 4 a Predicted 24-h average PM EC levels (in ng m ) on a 6-km c Same as a but showing annual-average levels for 2008. White points are 2.5 grid in the conterminous US (219,948 cell values each day) for August 4, PM EC monitors. 5th to 95th percentiles are shown for each 2.5 2008. b Same as a but showing 24-h average levels for August 10, 2008. In comparison to previous studies, our methodology is sim- from the conterminous US in one spatio-temporal model and ilar to that used in the London Air Pollution Toolkit (Kelly by using spatially smooth covariates, with the exception of et al. 2011) and in Wilton et al. (2010). However, our meth- ADMS-Roads output. Though the ADMS-Roads output does odology is novel in (1) its scope of application, in that it is the contain spatial discontinuities where the traffic intensity first, to our knowledge, to apply a line-source Gaussian plume changes categories, these were quite small (one is visible in model to estimate traffic-related PM levels in a consistent the bottom right of each panel of Fig. 3;thisisanextreme manner across the conterminous US, and (2) its use of spatial- example due to the very high traffic intensity in the area ly smoothed road-class-specific traffic counts. In comparison shown). However, we do assume stationarity across the do- to previous studies, the spatial distribution of long-term mean main. In contrast, the moving-window approach in Akita et al. predicted PM EC in Fig. 4 is broadly similar to that present- (2012) does not require this assumption. 2.5 ed in Bergen et al. (2013). Also, consistent with results in Maintaining computational feasibility was a key challenge Wilton et al. (2010), we found that including ADMS-Roads in the development of our traffic-related PM modeling ap- output on smaller, lesser trafficked A4/A6 roads did not im- proach. Even the spatial join to identify the 10-m road source prove predictive accuracy of the spatio-temporal PM EC points within 2 km of each PM monitoring location was 2.5 2.5 model. computationally demanding, requiring specialized multipro- One advantage of our spatio-temporal model is it avoids cessing software code to be implemented. Categorization of spatial discontinuities in predicted levels, such as those caused the several ADMS-Roads inputs was necessary to make com- by moving-windows in Akita et al. (2012) and the use of putation possible, and to provide a balance of complexity and regional terms in Yanosky et al. (2014), by including data computational feasibility. 752 Air Qual Atmos Health (2018) 11:741–754 Our models will be used to provide estimates of exposure adoption of diesel particulate filters for post-2007 heavy-duty to traffic-related PM and to PM EC in epidemiologic anal- compression ignition engines used in on-road (highway) 2.5 yses. Despite advantages over simpler exposure metrics, our trucks and busses and reduced sulfur content in diesel fuel traffic-related PM modeling approach has several limitations. have and will continue to reduce particulate emissions from One is that annual-average traffic counts do not explain all of new vehicles. the daily variability in traffic intensity; another is that it relies This work emphasizes the need for spatially accurate, con- on spatially averaged traffic counts rather than data on each sistently measured, nationwide, and time-resolved informa- specific road segment. Finely time-resolved (e.g., daily) traffic tion on traffic intensity for each road segment (preferably counts for each road segment in the road network are not distinguishing between light-and heavy-duty vehicles). Also, currently available. To some extent, this is addressed by the siting of additional monitoring sites near existing PM 2.5 performing spatial smoothing separately for each road class. speciation monitors but in locations nearer to large roadways, However, traffic count data are spatially incomplete in that not including those with Bpopulation exposure^ and every segment has a traffic count, so error from spatial inter- Bbackground^ monitoring objectives, would be valuable for polation is unavoidable. Other spatial averaging methods are future development of predictive models of traffic-related air also vulnerable to this problem, such as when imputing the pollutant levels. mean traffic count based on county boundaries. We note that though our traffic intensity inputs vary smoothly in space, the resulting ADMS-Roads output is not smooth in space due to Conclusions our application of kernels to the 10-m road source points; therefore, micro-scale detail, down to the resolution of a few Our approach provides estimates of primary traffic-related PM meters, is retained. Another limitation is that the approach levels based on line-source Gaussian plume dispersion model does not account for bridges, street canyons, tunnels, or other output. The methodology can provide high spatial resolution elevation-related effects or for differences in emission factors (interpolated from a 2-m grid) estimates of traffic-related PM between light-duty and heavy-duty vehicles, the latter because (both Bbest-estimate^ values that incorporate wind direction traffic counts segregated by light- and heavy-duty vehicles and Bworst-case downwind^ values) across the conterminous were unavailable. Finally, meteorological data were averaged US from 1999 to 2011 which account for (1) spatially varying to the daily level and, as discussed above for wind direction, traffic intensity (specific to each road class) and (2) spatio- do not reflect hourly variation in meteorological parameters. temporal covariates describing daily local meteorology. In Other minor limitations include small discontinuities induced contrast, distance to road terms, often used to represent near- by categorization of the input parameters and use of assumed road traffic-related air pollution levels, do not account for road widths and vehicle speeds. However, the inputs were differences in traffic intensity (or at least do not do so within finely divided into a large number of categories to minimize a given road class) or daily local meteorology. Our results spatial discontinuities, and assumed road widths and vehicle demonstrate that ADMS-Roads output describes near-road speeds are likely most influential within 10–20 m of the road primary traffic-related PM levels with greater detail and pre- centerline, an area where relatively few residences exist. dictive accuracy as compared to distance to road terms in a Finally, despite the simplifying assumptions discussed above, spatio-temporal PM EC model; this model had moderate 2.5 the approach presents significant computational challenges prediction accuracy at the 24-h average level (CV R = with respect to processing time, memory, and disk space and 0.532), but better performance when longer-term averages requires parallel computing in order to be implemented on were evaluated (CV R of annual averages = 0.795 and spatial large numbers of locations. CV R = 0.784). Caution is warranted when using ADMS- Our spatio-temporal PM EC model also has several lim- 2.5 Roads output as a separate exposure metric because of the itations. One is that the data set does not contain multiple non-linear shape of the smooth function in the spatio- monitors within small areas with monitors near and far from temporal PM EC model levels (Fig. 2). The spatio- 2.5 major road, data that could help distinguish micro-scale ef- temporal model effectively calibrates the ADMS-Roads out- fects, including that of wind direction, from larger spatial scale put to measured PM EC levels; predictions from it represent 2.5 regional effects. Also, the model makes several statistical as- an improved exposure metric for epidemiologic analyses of sumptions (discussed in the Supplemental Material), such as PM as compared to distance to road terms. stationary and isotropic spatial variation, additive covariate effects, and independent and normally distributed model re- Acknowledgements We thank Bin Wang for her contribution to the siduals with mean zero and constant variance. Finally, a lim- Python programming of several custom Python scripts. itation of all models of exposure to combustion-produced PM is the combination of continually evolving engine mod- 2.5 Authors’ contributions JDY led and directed the study, performed the ifications, after-treatment technologies, and fuel blends. The analysis, and was the lead writer. JF contributed to the analysis, reviewed Air Qual Atmos Health (2018) 11:741–754 753 the manuscript, and participated in revisions. DL reviewed the manu- Grioni S, Krogh V, Tsai MY, Ricceri F, Sacerdote C, Galassi C, script. DR reviewed the manuscript. RVW reviewed the manuscript. Migliore E, Ranzi A, Cesaroni G, Badaloni C, Forastiere F, WG reviewed the manuscript. RCP led the parent study, contributed Tamayo I, Amiano P, Dorronsoro M, Katsoulis M, Trichopoulou meaningful intellectual ideas, supervised JF, and reviewed the manu- A, Brunekreef B, Hoek G (2014) Effects of long-term exposure to script. All authors read the manuscript, provided input, and approved air pollution on natural-cause mortality: an analysis of 22 European the version as submitted. cohorts within the multicentre ESCAPE project. Lancet 383(9919): 785–795 Benson PE (1992) A review of the development and application of the Funding This research was supported by National Institutes of Health CALINE3 and CALINE4 models. Atmos Environ B Urban Atmos grant #1 R01 ES019168. 26(3):379–390 Bergen S, Sheppard L, Sampson PD, Kim S, Richards M, Vedal S et al Compliance with ethical standards (2013) A national prediction model for PM component exposures 2.5 and measurement error–corrected health effect inference. Environ Ethics approval and consent to participate This study was part of the Health Persp 121(9):1017–1025 SEARCH For Air project. Informed consent was obtained from study Brochu P, Kioumourtzoglou MA, Coull BA, Hopke PK, Suh HH (2011) subjects and approval was obtained from the IRB’s of both the Development of a new method to estimate the regional and local University of Maryland and the Penn State College of Medicine. contributions to black carbon. Atmos Env 45(40):7681–7687 Brugge D, Lane K, Padró-Martínez LT, Stewart A, Hoesterey K, Weiss D, Consent for publication Not Applicable. Wang DD, Levy JI, Patton AP, Zamore W, Mwamburi M (2013) Highway proximity associated with cardiovascular disease risk: the influence of individual-level confounders and exposure misclassifi- Conflict of interest The authors declare that they have no competing cation. Environ Health 12(1):84 interests. R Development Core Team (2009) R: a language and environment for statistical computing. R Foundation for Statistical Computing, Abbreviations ADMS, Atmospheric Dispersion Modeling System; Vienna, Austria. ISBN 3-900051-07-0 Available at: http://www.R- ATR, automated traffic recorders; CV, cross-validation; EC, elemental project.org carbon; IMPROVE, Interagency Monitoring of Protected Visual Draper D, Krnjacic M (2006) Bayesian model specification. In: Technical Environments; GAM, generalized additive model; GAMM, generalized report, Dept. Univ. California Santa Cruz, Applied Mathematics and additive mixed model; GIS, geographic information system; L , MO Statistics Monin-Obhukov length; MERRA, Modern Era Reanalysis for Research Eckel SP, Berhane K, Salam MT, Rappaport EB, Linn WS, Bastain TM, and Applications; NO , nitrogen oxide; NO , nitrogen dioxide; PM, par- x 2 Zhang Y, Lurmann F, Avol EL, Gilliland FD (2011) Residential ticulate matter; PM , PM < 2.5 μm; PM ,PM <10 μm; USGS, US 2.5 10 traffic-related pollution exposures and exhaled nitric oxide in the Geological Survey; WIM, Weigh In Motion Children’s Health Study. Environ Health Persp 119(10):1472–1477 Gehring U, Heinrich J, Kramer U, Grote V, Hochadel M, Sugiri D et al Open Access This article is distributed under the terms of the Creative (2006) Long-term exposure to ambient air pollution and cardiopul- Commons Attribution 4.0 International License (http:// monary mortality in women. Epidemiology 17(5):545–551 creativecommons.org/licenses/by/4.0/), which permits unrestricted use, Golan R, Ladva C, Greenwald R, Krall JR, Raysoni AU, Kewada P, distribution, and reproduction in any medium, provided you give Winquist A, Flanders WD, Liang D, Sarnat JA (2018) Acute pul- appropriate credit to the original author(s) and the source, provide a link monary and inflammatory response in young adults following a to the Creative Commons license, and indicate if changes were made. scripted car commute. Air Qual Atmos & Health 11:123–136 Gryparis A, Coull BA, Schwartz J, Suh HH (2007) Semiparametric latent variable regression models for spatiotemporal modelling of mobile source particles in the greater Boston area. Journ Royal Stat Soc Ser References C Appl Stat 56:183–209 Hart JE,LiaoX,HongB,Puett RC, YanoskyJD, SuhH, Akita Y, Chen J-C, Serre ML (2012) Moving-window BME: estimation Kioumourtzoglou MA, Spiegelman D, Laden F (2015) The associ- of PM annual averages. J Exp Sci Env Epi 22(5):496–501 ation of long-term exposure to PM on all-cause mortality in the 2.5 2.5 Nurses’ Health Study and the impact of measurement-error correc- Anderson JO, Thundiyil JG, Stolbach A (2012) Clearing the air: a review tion. Environ Health 14(1):38 of the effects of particulate matter air pollution on human health. J Med Toxicol 8:166–175 Heinrich J, Thiering E, Rzehak P, Kramer U, Hochadel M, Rauchfuss KM et al (2013) Long-term exposure to NO and PM and all-cause Arain MA, Blair R, Finkelstein N, Brook JR, Sahsuvaroglu T, Beckerman 2 10 and cause-specific mortality in a prospective cohort of women. B, Zhang L, Jerrett M (2007) The use of wind fields in a land use Occup Environ Med 70(3):179–186 regression model to predict air pollution concentrations for health Hoffmann B, Moebus S, Mohlenkamp S, Stang A, Lehmann N, Dragano exposure studies. Atmos Environ 41(16):3453–3464 N, Schmermund A, Memmesheimer M, Mann K, Erbel R, Jockel Beelen R, Raaschou-Nielsen O, Stafoggia M, Andersen ZJ, Weinmayr G, KH, for the Heinz Nixdorf Recall Study Investigative Group (2007) Hoffmann B, Wolf K, Samoli E, Fischer P, Nieuwenhuijsen M, Residential exposure to traffic is associated with coronary athero- Vineis P, Xun WW, Katsouyanni K, Dimakopoulou K, Oudin A, sclerosis. Circulation 116(5):489–496 Forsberg B, Modig L, Havulinna AS, Lanki T, Turunen A, Oftedal Jerrett M, Arain A, Kanaroglou P, Beckerman B, Potoglou D, B, Nystad W, Nafstad P, de Faire U, Pedersen NL, Östenson CG, Sahsuvaroglu T, Morrison J, Giovis C (2005) A review and evalu- Fratiglioni L, Penell J, Korek M, Pershagen G, Eriksen KT, Overvad ation of intraurban air pollution exposure models. 2005. Journ K, Ellermann T, Eeftens M, Peeters PH, Meliefste K, Wang M, Exposure Anal Environ Epi 15(2):185–204 Bueno-de-Mesquita B, Sugiri D, Krämer U, Heinrich J, de Hoogh K, Key T, Peters A, Hampel R, Concin H, Nagel G, Ineichen A, Kelly F, Anderson HR, Armstrong B, Atkinson R, Barratt B, Beevers S Schaffner E, Probst-Hensch N, Künzli N, Schindler C, Schikowski et al (2011) The impact of the congestion charging scheme on air T, Adam M, Phuleria H, Vilier A, Clavel-Chapelon F, Declercq C, quality in London. Part 1. Emissions modeling and analysis of air 754 Air Qual Atmos Health (2018) 11:741–754 pollution measurements. Research Report / Health Effects Institute Silverman BW. 1986. Density Estimation for Statistics and Data 155:5–71 Analysis. New York: Chapman and Hall:76. Equation 4.5 Legendre P. lmodel2: Model II Regression. R package version 1.7–0. Stieb DM, Chen L, Eshoul M, Judek S (2012) Ambient air pollution, birth 2011. Available at: http://CRAN.R-project.org/package=lmodel2 weight and preterm birth: a systematic review and meta-analysis. Maynard D, Coull BA, Gryparis A, Schwartz J (2007) Mortality risk Environ Res 117:100–111 associated with short-term exposure to traffic particles and sulfates. US Bureau of Transportation Statistics. 2013. National Transportation Environ Health Persp 115(5):751–755 Atlas web page. Available at: http://www.rita.dot.gov/bts/sites/rita. Medina-Ramon M, Goldberg R, Melly S, Mittleman MA, Schwartz J dot.gov.bts/files/publications/national_transportation_atlas_ (2008) Residential exposure to traffic-related air pollution and sur- database/2011/index.html [accessed November 16, 2013] vival after heart failure. Environ Health Persp 116(4):481–485 US Census Bureau. 2013. 2010 TIGER/Line Shapefiles web page. Paciorek CJ, Yanosky JD, Puett RC, Laden F, Suh HH (2009) Practical Available at: http://www.census.gov/cgi-bin/geo/shapefiles2010/ large-scale spatio-temporal modeling of particulate matter concen- main [accessed February 9, 2013] trations. Ann Appl Stat 3:370–397 US Geological Survey. 2012. National Land Cover Dataset web page. Pelucchi C, Negri E, Gallus S, Boffetta P, Tramacere I, La Vecchia C Available at: http://www.mrlc.gov/nlcd2001.php [accessed June 1, (2009) Long-term particulate matter exposure and mortality: a re- 2012] view of European epidemiological studies. BMC Public Health 9: US Geological Survey. 2013. National Elevation Dataset web page. 453 Available at: http://nationalmap.gov/elevation.html [accessed Puett RC, Hart JE, Schwartz J, Hu FB, Liese AD, Laden F (2011) Are May 19, 2013] particulate matter exposures associated with risk of type 2 diabetes? Weuve J, Puett RC, Schwartz J, Yanosky JD, Laden F, Grodstein F (2012) Environmental Health Persp 119(3):384–389 Exposure to particulate air pollution and cognitive decline in older Puett RC, Hart JE, Yanosky JD, Spiegelman D, Wang M, Fisher J et al women. Arch Intern Med 172(3):219–227 (2014) Particulate matter air pollution exposure, distance to road, Wilton D, Szpiro A, Gould T, Larson T (2010) Improving spatial concen- and lung cancer risk in the Nurses’ Health Study cohort. Environ tration estimates for nitrogen oxides using a hybrid meteorological Health Persp 122:926–932 dispersion/land use regression model in Los Angeles, CA and Rich DQ, Utell MJ, Croft DP, Thurston SW, Thevenet-Morrison K, Seattle, WA. Sci Tot Environ 408(5):1120–1130 Evans KA, Ling FS, Tian Y, Hopke PK (2018) Daily land use Wood SN (2006) Generalized additive models: an introduction with R. regression estimated woodsmoke and traffic pollution concentra- Chapman & Hall/CRC, Boca Raton tions and the triggering of ST-elevation myocardial infarction: a Yanosky JD, Paciorek CJ, Schwartz J, Laden F, Puett R, Suh HH (2008) case-crossover study. Air Qual Atmos & Health 11:239–244 Spatio-temporal modeling of chronic PM exposure for the Nurses’ Schikowski T, Sugiri D, Ranft U, Gehring U, Heinrich J, Wichmann HE, Health Study. Atmos Environ 42:4047–4062 Krämer U (2005) Long-term air pollution exposure and living close Yanosky JD, Paciorek CJ, Laden F, Hart J, Puett R, Suh HH (2014) to busy roads are associated with COPD in women. Respir Res 6: Spatio-temporal modeling of particulate air pollution in the conter- 152 minous United States using geographic and meteorological predic- Shah AS, Langrish JP, Nair H, McAllister DA, Hunter AL, Donaldson K tors. Environ Health 13(1):63 et al (2013) Global association of air pollution and heart failure: a Zhou Y, Levy JI (2007) Factors influencing the spatial extent of mobile systematic review and meta-analysis. Lancet 382(9897):1039–1048 source air pollution impacts: a meta-analysis. BMC Pub Health 7:89

Journal

Air Quality, Atmosphere & HealthSpringer Journals

Published: Jun 1, 2018

References

You’re reading a free preview. Subscribe to read the entire article.


DeepDyve is your
personal research library

It’s your single place to instantly
discover and read the research
that matters to you.

Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.

All for just $49/month

Explore the DeepDyve Library

Search

Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly

Organize

Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.

Access

Get unlimited, online access to over 18 million full-text articles from more than 15,000 scientific journals.

Your journals are on DeepDyve

Read from thousands of the leading scholarly journals from SpringerNature, Elsevier, Wiley-Blackwell, Oxford University Press and more.

All the latest content is available, no embargo periods.

See the journals in your area

DeepDyve

Freelancer

DeepDyve

Pro

Price

FREE

$49/month
$360/year

Save searches from
Google Scholar,
PubMed

Create lists to
organize your research

Export lists, citations

Read DeepDyve articles

Abstract access only

Unlimited access to over
18 million full-text articles

Print

20 pages / month

PDF Discount

20% off