African ancestry is associated with cluster-based childhood asthma subphenotypes

Background: Childhood asthma is a syndrome composed of heterogeneous phenotypes; furthermore, intrinsic biologic variation among racial/ethnic populations suggests possible genetic ancestry variation in childhood asthma. The objective of the study is to identify clinically homogeneous asthma subphenotypes in a diverse sample of asthmatic children and to assess subphenotype-specific genetic ancestry in African-American asthmatic children.

Methods: A total of 1211 asthmatic children, including 813 in the Childhood Asthma Management Program and 398 in the Childhood Asthma Research and Education program, were studied. Unsupervised cluster analysis on clinical phenotypes was conducted to identify homogeneous subphenotypes. Subphenotype-specific genetic ancestry was estimated for 167 African-American asthmatic children. Associations of genetic ancestry with subphenotypes and clinical phenotypes were determined.

Results: Three distinct subphenotypes were identified: a moderate atopic dermatitis (AD) group with negative skin prick test (SPT) and preserved lung function; a high AD group with positive SPT and airway hyperresponsiveness; and a low AD group with positive SPT and lower lung function. African ancestry at asthma genome-wide association study (GWAS) SNPs differed between subphenotypes (64, 89, and 94% for the three subphenotypes, respectively) and was inversely correlated with AD; each additional 10% increase in African ancestry was associated with 1.5-fold higher IgE and 6.3-fold higher odds of positive SPT (all p-values < 0.0001).

Conclusions: By conducting phenotype-based cluster analysis and assessing subphenotype-specific genetic ancestry, we were able to identify homogeneous subphenotypes for childhood asthma that showed significant variation in genetic ancestry of African-American asthmatic children.
This finding demonstrates the utility of these complementary approaches to understand and refine childhood asthma subphenotypes and enable more targeted therapy.

Keywords: Childhood asthma, Cluster analysis, Genetic ancestry, Subphenotypes

* Correspondence: tesfaye.mersha@cchmc.org
Division of Asthma Research, Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, 3333 Burnet Ave, Cincinnati, OH 45229, USA
Full list of author information is available at the end of the article

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Ding et al. BMC Medical Genomics (2018) 11:51

Background
Childhood asthma is a heterogeneous chronic airway disease with various clinical phenotypes [1, 2]. Its phenotypic and biologic heterogeneity contributes to the challenges clinicians face in its diagnosis and effective management [3]. It is therefore crucial to clearly define subphenotypes of asthma with homogeneous clinical characteristics in order to search for better asthma management and to develop novel therapeutic strategies. Although a large number of clinical phenotypes are often collected in childhood asthma studies, asthma genetic studies have mostly focused on case-control disease status. Such an endpoint-based analysis ignores the complexity of the asthma phenotype [4–6]. In addition, although there is ample evidence for intrinsic genetic variation among racial/ethnic populations [7, 8], suggesting possible genetic ancestry variation in childhood asthma, most genetic analyses rely on self-reported race and thus do not account for the potential contribution of genetic ancestry to disease variation in diverse populations.

An approach to overcome the phenotypic heterogeneity of childhood asthma is to identify homogeneous subgroups by establishing either a classical “endotype”, based on experts’ criteria, or statistical phenotype clustering on asthma clinical phenotypes. The latter has been successfully applied to identify clinically relevant subgroups of asthmatics and other airway diseases [9–17]. However, these studies differ in some key elements: variation in phenotyping, analytical approaches used and the patient population under study. These differences limit the comparability of the identified subphenotypes and pose difficulty in applying clustering results to individual patients. Furthermore, little is understood regarding the genetic ancestry of the identified subphenotypes.

The objective of the study is to investigate childhood asthma phenotypic heterogeneity and genetic ancestry variations and their relationships. Specifically, we used childhood asthma data from the NIH controlled database of Genotype and Phenotype (dbGaP) to identify homogeneous subphenotypes, determine clinical phenotypes, estimate subphenotype-specific genetic ancestry, and analyze the relationship between ancestry and subphenotypes using a stepwise approach incorporating cluster analysis, classification tree analysis, and genetic ancestry analyses [9–16, 18, 19]. Our goal is to combine both cluster and genetic ancestry analyses to identify biologically relevant subphenotypes in childhood asthma.

Methods
Data
The database of Genotypes and Phenotypes (dbGaP) is the repository for both genotype and phenotype data from most NIH-funded GWAS and other whole-genome or exome sequence data. We used baseline data from the SNP Health Association Resource (SHARe) Asthma Resource Project (SHARP) (phs000166.v2.p1), the National Heart, Lung, and Blood Institute’s clinical research trials on asthma, specifically, the Childhood Asthma Management Program (CAMP) and the Childhood Asthma Research and Education (CARE) network. CAMP is a multicenter, randomized, double-masked clinical trial designed to determine the long-term effects of three inhaled treatments for mild to moderate childhood asthma [20]. The CARE network evaluates current and novel therapies and management strategies for children with asthma. Individual-level data with asthma diagnosis are available for 1211 subjects through Authorized Access, including 813 in CAMP and 398 in CARE.

An array of phenotypic variables has been harmonized across the CAMP and CARE datasets, including demographics and participant characteristics; intermediate asthma phenotypes such as lung function, skin prick test (SPT), serum total immunoglobulin E (IgE), and atopic dermatitis (AD), as well as environmental exposure. See Table 1 for a complete list of variables.

We downloaded CAMP and CARE genotype data, which were generated using 1 million single nucleotide polymorphisms (SNPs) on the Affymetrix 6.0 chip and stored in dbGaP (accession number phs000166.v2.p1). Quality control criteria included minor allele frequency ≥ 0.05, Hardy-Weinberg equilibrium (p ≥ 10^−5), ≤ 5% missing rate per person, ≤ 5% missing rate per SNP, families with less than 5% Mendel errors and SNPs with less than 10% Mendel error rate [21].

Hierarchical cluster analysis (HCA)
HCA is a hypothesis-free statistical method to group subjects into relatively homogeneous sub-clusters according to similarity quantification based on a set of critical characteristic variables. The grouping is constructed such that the similarity is strong between members of the same cluster and weak between members of different clusters. The baseline phenotypic measures listed in Table 1 were included in the cluster analysis. To reduce collinearity, we examined the variables for absolute correlation (> 0.80). We also assessed the missing pattern of the phenotypes and planned to exclude measures with ≥10% missingness from the analysis. Blood eosinophils (EOS) and IgE were log transformed.

Since we have mixed types of variables, i.e., continuous and categorical, Gower’s distance [22] was used as a similarity index. To avoid inconsistent cluster solutions due to changes in scale of the variables and heavy impact of variables with larger standard deviations, Gower’s standardization, based on the range, was applied. HCA was then carried out with Ward’s minimum-variance method [23]. Consensus between the pseudo F and pseudo t statistics [24, 25] was used to select the number of clusters. The number of clusters was also guided by clinical characteristics in addition to statistical considerations.

Descriptive statistics of all variables were obtained and compared across clusters using analysis of variance, Kruskal-Wallis, or Chi-square tests as appropriate. Conditional inference trees [26], a non-parametric class of regression trees that embeds tree-structured regression models into a well-defined theory of conditional inference procedures, were used to identify intermediate phenotypes that distinguish the subphenotypes. The cluster analysis was first carried out on the CAMP data and repeated on the CARE data. Replication of the clustering results was examined between the two studies as well as with previously published studies.
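As a rough illustration of the distance step in the cluster analysis, the sketch below computes Gower's distance for a mix of continuous and categorical variables, with each continuous variable standardized by its range. It is only an assumed restatement of the general formula in Python: the study's analyses were run in SAS and R, and the records, variable names, and values here are hypothetical rather than CAMP or CARE data.

```python
from typing import Dict, List, Sequence, Union

Record = Dict[str, Union[float, str]]


def variable_ranges(records: Sequence[Record], continuous: Sequence[str]) -> Dict[str, float]:
    """Range (max - min) of each continuous variable, used for Gower's standardization."""
    return {
        name: max(r[name] for r in records) - min(r[name] for r in records)
        for name in continuous
    }


def gower_distance(a: Record, b: Record, ranges: Dict[str, float]) -> float:
    """Gower distance between two records.

    Variables listed in `ranges` are treated as continuous and contribute
    |a - b| / range; all other variables are treated as categorical and
    contribute 0 when equal and 1 otherwise. The result is the average
    contribution over all variables.
    """
    total = 0.0
    for name in a:
        if name in ranges:
            span = ranges[name]
            total += abs(a[name] - b[name]) / span if span > 0 else 0.0
        else:
            total += 0.0 if a[name] == b[name] else 1.0
    return total / len(a)


# Hypothetical toy records (not study data); IgE is shown on a log scale.
children: List[Record] = [
    {"fev1_pct": 96.0, "log_ige": 5.0, "ad": "no", "spt_positive": "no"},
    {"fev1_pct": 92.0, "log_ige": 7.0, "ad": "yes", "spt_positive": "yes"},
    {"fev1_pct": 88.0, "log_ige": 6.0, "ad": "no", "spt_positive": "yes"},
]

ranges = variable_ranges(children, ["fev1_pct", "log_ige"])
distance_matrix = [[gower_distance(x, y, ranges) for y in children] for x in children]

for row in distance_matrix:
    print([round(value, 3) for value in row])
```

A matrix of this kind would then be handed to a hierarchical clustering routine; the Ward's minimum-variance linkage and the pseudo F and pseudo t criteria are not shown here.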
Additional analyses were run to investigate whether the subphenotypes were associated with clinical outcomes. Two clinical outcomes were examined: number of prednisone bursts (an anti-inflammatory oral steroid medication) since last visit, and number of ER visits or hospitalizations since last visit. Number of prednisone bursts since last visit was modeled as a count variable using Poisson regression with a random subject effect. Number of ER visits or hospitalizations since last visit was dichotomized (given that over 95% of the subjects did not have an ER visit or hospitalization) and modeled using logistic regression with a random subject effect. Potential covariates included age, sex, race, visit month, time since last visit, treatment, and subphenotypes that were significantly associated with the outcome (adjusted p-value < 0.05). All analyses were run for CAMP and CARE data separately. All the above analyses were conducted in SAS version 9.3 (SAS Institute Inc., Cary, NC, USA) and R [27].

Table 1 Demographic, clinical phenotypes and environmental exposures of CAMP and CARE study participants

Variable | CAMP (N = 813) | CARE (N = 398) | p-value
Age, Mean (SD), years | 8.9 (2.1) | 10.6 (2.8) | < 0.0001
Gender, No. (%) | | | 0.8152
  Male | 500 (61.5) | 242 (60.8) |
  Female | 313 (38.5) | 156 (39.2) |
Race, No. (%) | | | < 0.0001
  Caucasian | 557 (68.5) | 215 (54) |
  African American | 107 (13.2) | 70 (17.6) |
  Hispanic | 77 (9.5) | 78 (19.6) |
  Other | 72 (8.9) | 35 (8.8) |
BMIZ at baseline, Mean (SD) | 0.5 (1.0) | 0.8 (1.0) | < 0.0001
Age of onset, Mean (SD), years | 3.0 (2.4) | 3.7 (3.3) | < 0.0001
FEV1 PC20 meth, Mean (SD), mg/ml | 2.0 (2.4) | 2.2 (3.1) | 0.3602
FEV1 percent predicted, Mean (SD) | 93.4 (14.1) | 97.1 (12.8) | < 0.0001
FVC percent predicted, Mean (SD) | 103.7 (13.1) | 106.7 (12.2) | 0.0002
FEV1/FVC ratio, Mean (SD) | 79.6 (8.3) | 80.1 (8.0) | 0.2937
Bronchodilator percent change, Mean (SD) | 10.7 (9.9) | 9.4 (8.4) | 0.0236
Blood eosinophils, Mean (SD), mm³ | 485.7 (409.2) | 408.8 (319.5) | 0.0011
IgE, Mean (SD), ng/ml | 1129.8 (2081.9) | 330.6 (445.4) | < 0.0001
Average AM peak flow, Mean (SD), L/min | 250.9 (64.4) | 271.1 (92.4) | < 0.0001
Average AM symptoms, Mean (SD) | 0.61 (0.45) | 0.51 (0.40) | < 0.0001
Environmental smoke, No. (%) | 339 (41.7) | 166 (41.7) | 0.0256
In utero smoke, No. (%) | 107 (13.2) | 54 (13.6) | 0.8060
Atopic dermatitis, No. (%) | 199 (24.4) | 155 (38.9) | < 0.0001
One or more positive SPT, No. (%) | 716 (88.1) | 312 (78.4) | 0.0002

Table 1 notes:
Age of onset: Age at first asthma symptoms
FEV1 PC20 meth: The dose of methacholine that is required to decrease FEV1 by 20%
FEV1 percent predicted: Forced expiratory volume, the maximal amount of air one can forcefully exhale in one second, converted to a percentage of normal based on one’s height, weight, body composition, and race
FVC percent predicted: Forced vital capacity, the amount of air a person can expire after a maximum inspiration, converted to a percentage of normal based on one’s height, weight, body composition, and race
FEV1/FVC ratio: Also called the Tiffeneau-Pinelli index, a calculated ratio used in the diagnosis of obstructive and restrictive lung disease. It represents the proportion of a person’s vital capacity that they are able to expire in the first second of expiration
Bronchodilator percent change: Post-bronchodilator percent change from baseline: 100*(POSFEV - PREFEV)/PREFEV
Average AM peak flow: The maximum flow rate generated during a forceful exhalation, starting from full lung inflation; average of daily measurements up to 4 weeks prior to visit with a minimum of 7 days, recorded in daily diary card
Average AM symptoms: Maximum of daily wheezing and coughing, then average of daily measurements up to 4 weeks prior to visit with a minimum of 7 days, recorded in daily diary card
Environmental smoke: Either parent smoked during trial or home exposure to smoke prior to trial enrollment
In utero smoke: Mother smoked when pregnant with participant
Atopic dermatitis: Child had atopic dermatitis for 2 years and was seen by a doctor for it
One or more positive SPT: One or more skin prick test positive
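The categorical comparisons in Table 1 can be checked by hand from the reported counts. The sketch below is an illustration rather than the authors' SAS or R code: it applies a Pearson chi-square test to the sex counts from Table 1, under the assumption that no continuity correction was used, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Sex counts from Table 1: CAMP 500 male / 313 female, CARE 242 male / 156 female.
statistic, p_value = chi_square_2x2(500, 313, 242, 156)
print(f"chi-square = {statistic:.4f}, p = {p_value:.4f}")
```

Under these assumptions the p-value comes out at roughly 0.815, in line with the 0.8152 reported for Gender in Table 1, which supports reading that row as an uncorrected chi-square comparison of the two cohorts.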
Genetic ancestry analysis
Genetic ancestry was estimated using both genome-wide SNPs and asthma-specific GWAS SNPs for African-American asthmatic individuals in CAMP and CARE. The supervised approach in the ADMIXTURE software program [28] was used to estimate global genetic ancestry, where SNP data of 108 YRI (Yoruba in Ibadan, Nigeria) and 99 CEU (Utah Residents (CEPH) with Northern and Western European Ancestry) individuals from the 1000 Genomes Project were included as surrogates for African and European ancestry, respectively. The reference populations and the CAMP/CARE subjects shared 857,127 genetic markers across all autosomes, which reduced to 225,374 SNPs after linkage disequilibrium (LD) pruning with a window of 50 kb, a 10 kb window shift and an r2 value of 0.2. Asthma GWAS SNPs, 157 in total, were retrieved from the GWAS catalog [29], and STRUCTURE software [30] was used to estimate African ancestry proportion at asthma GWAS SNPs. CEU and YRI individuals from the 1000 Genomes Project were used as parental populations.

Correlations between genetic ancestry and the subphenotypes derived by clustering and the discriminating factors of the subphenotypes were examined using the Kruskal-Wallis test, Wilcoxon rank-sum test, Spearman correlation coefficient, or linear regression as appropriate.

Results
Participants from CAMP and CARE were different except in sex, exposure to in utero smoking, PC20, and FEV1/FVC ratio (Table 1). All pairwise Spearman correlation coefficients were less than 0.60, except between FEV1 percent predicted and FVC percent predicted (0.71) and between FEV1/FVC and maximum bronchodilator percent change (−0.65). No variables had more than 10% of missing values.

HCA identified distinct subphenotypes
Clustering on CAMP cohort identified distinct subphenotypes
Three clusters were identified from CAMP data (Table 2). Members of cluster 1 had a moderate AD rate (15.3%) and all but one had negative SPT (99%). This group also had the lowest age at baseline, age at onset of asthma, bronchodilator percent change, EOS, IgE level, AM peak flow, and AM symptoms, and highest body mass index z-score (BMIZ), PC20, FEV1 percent predicted, and FEV1/FVC ratio. All these characteristics, except BMIZ and AM symptoms, were statistically different across the clusters at a significance level of 0.05. This is the moderate AD group with negative SPT and preserved lung function.

Members of cluster 2 had a high rate of AD (97.7%) and all had one or more positive SPT. This group also had the highest EOS and IgE level, and lowest bronchodilator percent change among the 3 clusters. This is the high AD group with positive SPT and airway hyperresponsiveness.

Members of cluster 3 had the highest age at baseline and age at onset of asthma and lowest BMIZ. This group also had the lowest FEV1 percent predicted and FEV1/FVC ratio, and highest AM symptoms. Furthermore, members of cluster 3 were mostly AD free and all had one or more positive SPT, moderate EOS and IgE levels, but lower lung function measures and higher AM symptoms compared to the other clusters. This is the low AD group with positive SPT and lower lung function.

Table 2 CAMP hierarchical clustering results

Variable | Cluster 1 (N = 98) | Cluster 2 (N = 171) | Cluster 3 (N = 544) | p-value
Age (years) | 7.8 (1.9) | 8.7 (2.1) | 9.2 (2.1) | < 0.0001
Gender No. (%) | | | | 0.0675
  Male | 50 (51.0) | 105 (61.4) | 345 (63.4) |
  Female | 48 (49.0) | 66 (38.6) | 199 (36.6) |
Race No. (%) | | | | 0.0153
  Caucasian | 82 (83.7) | 116 (67.8) | 359 (66.0) |
  African American | 9 (9.2) | 25 (14.6) | 73 (13.4) |
  Hispanic | 5 (5.1) | 12 (7.0) | 60 (11.0) |
  Other | 2 (2.0) | 18 (10.5) | 52 (9.6) |
BMIZ | 0.7 (1.0) | 0.6 (1.1) | 0.5 (1.0) | 0.0929
Age of onset (years) | 2.4 (2.2) | 2.8 (2.2) | 3.2 (2.5) | 0.0017
FEV1 PC20 meth (mg/ml) | 2.9 (2.8) | 1.8 (2.2) | 2.0 (2.4) | 0.0005
FEV1 percent predicted | 96.3 (14.5) | 95.0 (14.5) | 92.4 (13.9) | 0.0117
FVC percent predicted | 103.5 (14.3) | 104.1 (13.6) | 103.6 (12.7) | 0.895
FEV1/FVC ratio | 82.9 (7.0) | 80.6 (8.2) | 78.7 (8.3) | < 0.0001
Bronchodilator percent change | 7.3 (6.9) | 11.4 (10.1) | 11.1 (10.1) | 0.0012
Blood eosinophils (mm³) | 228.9 (197.9) | 579.7 (442.4) | 504 (408.8) | < 0.0001
IgE (ng/ml) | 200.5 (449.1) | 1579 (2624.2) | 1161 (2022.7) | < 0.0001
Average AM peak flow (L/min) | 230.9 (55) | 249.8 (67.5) | 254.8 (64.4) | 0.0040
Average AM symptoms | 0.52 (0.40) | 0.61 (0.46) | 0.63 (0.45) | 0.100
Environmental smoke No. (%) | 42 (42.9) | 60 (35.1) | 237 (44.1) | 0.1291
In utero smoke No. (%) | 19 (19.4) | 11 (6.4) | 77 (14.2) | 0.0046
Atopic dermatitis No. (%) | 15 (15.3) | 167 (97.7) | 17 (3.1) | < 0.0001
Positive SPT No. (%) | 1 (1) | 171 (100) | 544 (100) | < 0.0001

Mean and SD for continuous variables and No. (%) for categorical variables

Clustering on CARE cohort replicated the subphenotypes identified in CAMP
Three clusters were identified in CARE (Table 3). Members of cluster 1 had a moderate rate of AD (35%) and none of them had a positive SPT. This group also had the lowest bronchodilator percent change, EOS, IgE, AM peak flow, and AM symptoms. All these characteristics, except the last, were statistically different across the clusters at a significance level of 0.05. This is the moderate AD group with negative SPT and preserved lung function similarly identified in CAMP.

Members of cluster 2 had a high rate of AD (98.4%) and one or more positive SPT (95.3%). This group also had the highest EOS and IgE level among the 3 clusters. This is the high AD asthma group with positive SPT and airway hyperresponsiveness similarly identified in CAMP.

Members of cluster 3 had the highest age at baseline and age at onset of asthma, were mostly AD free (3.3%), mostly had one or more positive SPT (92.2%), and had moderate EOS and IgE levels but higher AM symptoms compared to the other clusters. This is the low AD group with positive SPT and lower lung function similarly identified in CAMP.

Table 3 CARE hierarchical clustering results

Variable | Cluster 1 (N = 60) | Cluster 2 (N = 129) | Cluster 3 (N = 209) | p-value
Age (years) | 10.1 (2.4) | 10.1 (2.5) | 11.0 (3.1) | 0.0124
Gender No. (%) | | | | 0.1185
  Male | 30 (50) | 77 (59.7) | 135 (64.6) |
  Female | 30 (50) | 52 (40.3) | 74 (35.4) |
Race No. (%) | | | | 0.4519
  Caucasian | 40 (66.7) | 67 (51.9) | 108 (51.7) |
  African American | 8 (13.3) | 24 (18.6) | 38 (18.2) |
  Hispanic | 8 (13.3) | 24 (18.6) | 46 (22.0) |
  Other | 4 (6.7) | 14 (10.9) | 17 (8.1) |
BMIZ | 0.9 (0.9) | 0.8 (1.0) | 0.8 (1.0) | 0.5920
Age of onset (years) | 3.6 (3.5) | 3.1 (2.6) | 4.1 (3.5) | 0.0215
FEV1 PC20 meth (mg/ml) | 3.3 (3.3) | 1.6 (2.4) | 2.3 (3.4) | 0.0031
FEV1 percent predicted | 97.2 (13.4) | 96.3 (13.1) | 97.6 (12.5) | 0.655
FVC percent predicted | 104.7 (10.7) | 107.2 (12.5) | 106.9 (12.3) | 0.378
FEV1/FVC ratio | 81.6 (8.5) | 79.0 (8.0) | 80.4 (7.9) | 0.101
Bronchodilator percent change | 6.7 (7.4) | 9.9 (7.4) | 9.8 (9.0) | 0.0271
Blood eosinophils (mm³) | 245.7 (211.5) | 444.4 (322.1) | 435.0 (330.2) | < 0.0001
IgE (ng/ml) | 63.5 (133.9) | 424.5 (537.1) | 347.4 (430.1) | < 0.0001
Average AM peak flow (L/min) | 255.4 (68.7) | 258.6 (81.0) | 283.3 (102.3) | 0.0209
Average AM symptoms | 0.43 (0.32) | 0.50 (0.40) | 0.53 (0.42) | 0.202
Environmental smoke No. (%) | 28 (46.7) | 62 (48.1) | 104 (49.8) | 0.8985
In utero smoke No. (%) | 1 (1.7) | 18 (14.0) | 35 (16.9) | 0.0121
Atopic dermatitis No. (%) | 21 (35) | 127 (98.4) | 7 (3.3) | < 0.0001
Positive SPT No. (%) | 0 (0) | 123 (95.3) | 189 (92.2) | < 0.0001

Mean and SD for continuous variables and No. (%) for categorical variables

Atopic dermatitis status and SPT distinguished the subphenotypes
Conditional inference tree analysis revealed that, in both CAMP and CARE data, AD and one or more positive SPT were the top two variables that best discriminated the individuals into the subphenotypes (Fig. 1, prediction accuracy 95.8%). Given the consistent findings across CAMP and CARE data, we combined the two datasets and grouped the three clusters individually identified in CAMP and CARE into three subphenotypes. One subphenotype was the moderate AD group with negative SPT and preserved lung function (subphenotype 1, n = 158), one was the high AD group with positive SPT and airway hyperresponsiveness (subphenotype 2, n = 300), and one was the low AD group with positive SPT and lower lung function (subphenotype 3, n = 753).

Fig. 1 Conditional inference tree analysis of the three subphenotypes. SPT and atopic dermatitis are the top two factors distinguishing the subphenotypes. The prediction accuracy is 95.8%

Subphenotypes were associated with clinical outcomes
Table 4 shows the association between the subphenotypes and clinical outcomes. In CAMP data, the incidence rate of prednisone bursts since last visit for subphenotype 2 is 2.63 (1.45, 2.70) times the incidence rate for subphenotype 1, and the incidence rate of prednisone bursts since last visit for subphenotype 3 is 2.04 (1.56, 2.70) times the incidence rate for subphenotype 1. Also in CAMP data, the odds of any ER visit or hospitalization since last visit for subphenotype 3 are 1.54 (1.01, 2.33) times the odds for subphenotype 1. For CARE data, the odds of any ER visit or hospitalization since last visit for subphenotype 2 are 0.35 (0.13, 0.98) times the odds for subphenotype 1, and the odds of any ER visit or hospitalization since last visit for subphenotype 3 are 3.45 (1.47, 7.69) times the odds for subphenotype 2.

Table 4 Association between subphenotypes and number of prednisone bursts and any ER visit or hospitalizations since last visit

Number of prednisone bursts since last visit (predicted number of events; incidence rate ratios, IRR)
Cohort | Subphenotype | Estimate (95% CI) | p-value | Comparison | IRR (95% CI) | p-value
CAMP | 1 | 0.10 (0.08, 0.13) | < 0.0001 | 2 vs. 1 | 2.63 (1.45, 2.70) | < 0.0001
CAMP | 2 | 0.20 (0.16, 0.24) | | 3 vs. 1 | 2.04 (1.56, 2.70) | < 0.0001
CAMP | 3 | 0.20 (0.18, 0.22) | | 3 vs. 2 | 1.02 (0.83, 1.27) | 0.8153
CARE | 1 | 0.08 (0.05, 0.14) | 0.3534 | 2 vs. 1 | 0.93 (0.57, 1.54) | 0.7880
CARE | 2 | 0.08 (0.05, 0.12) | | 3 vs. 1 | 1.19 (0.76, 1.89) | 0.4420
CARE | 3 | 0.10 (0.07, 0.14) | | 3 vs. 2 | 1.28 (0.90, 1.82) | 0.1666

Any ER visit or hospitalizations since last visit (predicted probability; odds ratios, OR)
Cohort | Subphenotype | Estimate (95% CI) | p-value | Comparison | OR (95% CI) | p-value
CAMP | 1 | 0.03 (0.02, 0.04) | 0.1232 | 2 vs. 1 | 1.52 (0.95, 2.44) | 0.0776
CAMP | 2 | 0.04 (0.03, 0.05) | | 3 vs. 1 | 1.54 (1.01, 2.33) | 0.0434
CAMP | 3 | 0.04 (0.03, 0.04) | | 3 vs. 2 | 1.01 (0.75, 1.37) | 0.9474
CARE | 1 | 0.02 (0.01, 0.05) | 0.0155 | 2 vs. 1 | 0.35 (0.13, 0.98) | 0.0458
CARE | 2 | 0.01 (0.004, 0.02) | | 3 vs. 1 | 1.20 (0.56, 2.63) | 0.6296
CARE | 3 | 0.03 (0.02, 0.04) | | 3 vs. 2 | 3.45 (1.47, 7.69) | 0.0039

Genetic ancestry proportion varied at asthma GWAS SNPs among asthma subphenotypes
The three subphenotypes had 15, 49, and 103 African American individuals, respectively. Global African ancestry proportion varied from 71.2 to 100% with mean 96.6% and standard deviation (SD) 7.2%. Higher global African ancestry was associated with AD (mean ± SD of African origin is 0.96 ± 0.08 for AD-free vs. 0.98 ± 0.06 for AD subjects, p-value = 0.0294), but not with other clinical phenotypes. Proportion of African ancestry at asthma GWAS SNPs was correlated with the subphenotypes (mean 64.9, 89.4 and 94.4% for subphenotypes 1, 2, and 3, respectively, p-value < 0.0001, Figs. 2 and 3(a)). The subphenotypes were associated with lung function: FEV1 percent predicted is 96.8 ± 14.1, 95.3 ± 13.9, and 93.9 ± 13.7 (p-value = 0.0083); and FEV1/FVC ratio is 81.9 ± 7.6, 80.5 ± 8.1, and 79.0 ± 8.2 (p-value < 0.0001) for subphenotypes 1, 2, and 3, respectively. Furthermore, African ancestry at asthma GWAS SNPs was inversely associated with AD (median 0.95 with IQR (0.93, 0.95) for AD-free vs. 0.92 (0.89, 0.94) for AD subjects, p-value < 0.0001, Fig. 3(b)). Additionally, genetic ancestry at asthma GWAS SNPs was associated with positive SPT, with median and interquartile range (IQR) 0.94 (0.92, 0.95) for positive SPT individuals vs. 0.74 with IQR (0.59, 0.78) for negative SPT individuals (p-value < 0.0001, Fig. 3(c)). The odds of one or more positive SPT were 6.3-fold higher (95% confidence interval: (3.4, 13.8), p-value < 0.0001) with each additional 10% of African origin at asthma GWAS SNPs. African origin at asthma GWAS SNPs was also associated with IgE levels (Spearman correlation coefficient = 0.27, p-value = 0.0004), and IgE was 1.5-fold higher with each additional 10% of African origin (Fig. 3(d)).
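Because an odds ratio or fold change reported per fixed increment acts multiplicatively, the per-10% ancestry estimates can be rescaled to other increments, assuming the effect is linear on the log scale as in a logistic or log-linear regression. The sketch below is illustrative arithmetic only: the function name is hypothetical, and applying the rescaled values outside the ancestry range observed in the study would be an extrapolation.

```python
def rescale_ratio(ratio_per_step: float, step: float, new_step: float) -> float:
    """Convert a ratio per `step` units into the ratio per `new_step` units,
    assuming the effect is linear on the log scale."""
    return ratio_per_step ** (new_step / step)


# Reported estimates per additional 10% African ancestry at asthma GWAS SNPs.
OR_POSITIVE_SPT_PER_10 = 6.3   # odds ratio for one or more positive SPT
IGE_FOLD_PER_10 = 1.5          # fold change in IgE

or_per_5 = rescale_ratio(OR_POSITIVE_SPT_PER_10, 10, 5)   # half the increment
ige_fold_per_20 = rescale_ratio(IGE_FOLD_PER_10, 10, 20)  # double the increment

print(f"Odds ratio for positive SPT per 5% ancestry: {or_per_5:.2f}")
print(f"Fold change in IgE per 20% ancestry: {ige_fold_per_20:.2f}")
```

The same rescaling would apply to the confidence limits, since they are also defined on the log scale under this assumption.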
Fig. 2 Population ancestry estimates of African American asthmatic individuals in CAMP and CARE at asthma GWAS SNPs by subphenotypes. Dashed lines indicate average proportions of African ancestry at the asthma GWAS SNPs. Yoruba in Ibadan, Nigeria (YRI) and northern and western European (CEU) individuals from the 1000 Genomes Project were used as parental populations

Fig. 3 Boxplots and scatterplot of proportions of African ancestry at the asthma GWAS SNPs by: a subphenotypes, b atopic dermatitis status, c SPT, and d IgE levels

Discussion
Current clinical practice in childhood asthma treatment tends to use average patient care strategies. Such a “one size fits all” treatment approach faces major challenges as it becomes clearer that childhood asthma is heterogeneous in pathogenesis. Our unbiased cluster and genetic ancestry analyses pointed toward three distinct phenotypic clusters with differences in clinical characteristics, genetic ancestry, and clinical outcomes, underscoring the clinical and genetic heterogeneity of asthma [10, 13, 17, 31]. Previous studies have also identified clusters with atopic or non-atopic asthma, clusters with preserved or lower lung function, and clusters with mild asthma [13, 14, 32]. It is reassuring that the two independent studies replicated the clustering results and that there are similarities with previous clustering-based childhood asthma subphenotypes.

We determined genetic ancestry [33] using genome-wide SNPs and asthma GWAS SNPs for African-American asthmatic individuals in CAMP and CARE data. Our estimate of African global ancestry in asthmatic children is higher than what has been reported in different general populations, which is consistent with the higher prevalence of asthma in individuals with higher African ancestry than in others. Our results showed that genetic ancestry at asthma GWAS SNPs differed between the childhood asthma subphenotypes and was associated with lung function, SPT, IgE levels, and AD. Previous studies have also shown associations between genetic ancestry and asthma prevalence and related clinical phenotypes [34–42]. To the best of our knowledge, our study is the first to show the association between genetic ancestry at asthma GWAS SNPs and cluster-based subphenotypes in childhood asthma. Leveraging ancestry and cluster analyses to derive genetically and phenotypically homogeneous subgroups in childhood asthma demonstrates the utility of these approaches to characterize and understand the complexity of asthma towards individual-based precision medicine strategies.

This study demonstrates that genetic ancestry at asthma GWAS SNPs is more strongly associated with asthma subgroups sharing similar clinical characteristics compared to broadly defined asthma. The results suggest that validation of genetic studies is more likely to be successful for replication studies carried out in more homogeneous asthma cohorts (sharing similar clinical characteristics) compared to the multifactorial case-control status. In addition, the results indicate that ancestry-specific genetic loci of asthma are likely to be found by focusing on better defined asthma patients. Furthermore, genetic ancestry analysis in homogeneous asthma subgroups is suitable to refine the biological role of asthma susceptibility variants from GWAS studies in a given phenotype. For example, SNPs at STARD3/PGAP3 are strongly associated with the high atopic dermatitis subgroup, suggesting that STARD3/PGAP3 may act on the allergic component of asthma [43]. Another example is that the ORMDL3/17q locus is associated with asthma in multiple studies in European ancestry but not in African ancestry asthmatic individuals [44]. We also investigated associations between asthma GWAS SNPs and the identified subphenotypes in CAMP and CARE data (methods and results in Additional file 1: Table S1). Several significant associations were identified at p = 0.05, but none after multiplicity adjustment, possibly due to small sample size and limited statistical power.

Our study had several limitations. First, participants in CAMP and CARE represent studies of childhood asthma, thus the results herein may not be applicable to adulthood asthma. Second, although we identified clinically relevant subphenotypes of childhood asthma using clinical phenotypes [45], the integration of this result with molecular and physiologic phenotyping may help to better understand childhood asthma pathogenesis for possibly more personalized therapeutic strategies. Furthermore, subgroup analyses of asthma may limit sample sizes and impair statistical power. However, given that asthma is a highly heterogeneous phenotype, studying homogeneous subgroups of asthma patients may not only recover some of this lost power but also yield more statistically significant results. Classifying asthma patients into more homogeneous groups may help us to identify new susceptibility or modifying subphenotype-specific genes. Our ability to better define subtypes might help to predict who may respond to treatment and who may not. Future studies need to elucidate the mechanisms that distinguish each of the ancestral and clinical clusters to facilitate the development of targeted therapies and the provision of personalized treatments.

The present study has notable strengths. First, we were able to dissect childhood asthma heterogeneity into subphenotypes using cluster analysis of clinical phenotypes in one study and replicate the findings in an independent study. Second, we were able to show associations between the identified subphenotypes and asthma clinical outcomes. Third, analysis of genetic ancestry at asthma GWAS SNPs in childhood asthma clinical phenotypes provides biologically relevant subphenotype-specific results. Lastly, our study used a more accurate and direct assessment of genetic ancestry instead of self-reported race to determine the relationship between ancestry and childhood asthma subphenotypes and relevant clinical phenotypes. Studies have shown that people with the same self-reported race could have drastically different levels of genetic ancestry, and self-reported race may not be as accurate as direct assessment of genetic ancestry in predicting treatment outcomes [33]. Future studies to identify genetic ancestry-specific variants associated with a specific subphenotype are important as we move towards applying the precision medicine paradigm. The finding indicates that African genetic ancestry at asthma GWAS SNPs is differentially associated with the asthma clinical subphenotypes. Unraveling the reasons why individuals with higher African origin at asthma GWAS SNPs had higher IgE level or rate of positive SPT is necessary to determine the potential clinical applications of our findings. In addition, genetic analysis based on more refined phenotypes may increase the statistical power and allow for the detection of population structure-specific phenotype-genotype associations that are undetectable otherwise.

Conclusions
In conclusion, through our systematic clinical phenotype analysis, we identified distinct subphenotypes for childhood asthma using cluster analysis. Further genetic ancestry analysis showed correlations between African ancestry at asthma GWAS SNPs and childhood asthma subphenotypes and related clinical outcomes. Our results demonstrated that cluster analyses on clinical phenotypes followed by ancestry analysis can enhance the understanding of the phenotypic and genetic heterogeneity of childhood asthma. Our approach is distinct from previous efforts in that we developed cluster-based subphenotypes and applied genetic ancestry analysis to define subphenotype-ancestry relationships, which can subsequently be used as the basis of genetic ancestry-based clinical risk prediction. Our findings suggest that defining homogeneous asthma subgroups on the basis of clinical phenotypes and genetic ancestry proportion is an essential step to understand and refine patient subsets and enable more targeted therapy.

Additional file
Additional file 1: Table S1. Association between asthma GWAS SNPs and subphenotypes. This file contains association results between asthma GWAS SNPs and the identified subphenotypes in CAMP and CARE data. (DOCX 24 kb)

Abbreviations
AD: Atopic dermatitis; BMIZ: Body mass index z-score; CAMP: the Childhood Asthma Management Program; CARE: the Childhood Asthma Research and Education network; CEU: Utah Residents with Northern and Western European Ancestry; dbGaP: The database of Genotypes and Phenotypes; EOS: Blood eosinophils; FEV1: Forced expiratory volume, the maximal amount of air one can forcefully exhale in 1 s, converted to a percentage of normal based on one’s height, weight, body composition, and race; FVC: Forced vital capacity, the amount of air a person can expire after a maximum inspiration, converted to a percentage of normal based on one’s height, weight, body composition, and race; GWAS: Genome-wide association study; HCA: Hierarchical cluster analysis; IgE: Serum total immunoglobulin E; IQR: Interquartile range; LD: Linkage disequilibrium; PC20: The dose of methacholine that is required to decrease FEV1 by 20%; SD: Standard deviation; SHARe: The SNP Health Association Resource; SHARP: The SNP Health Association Resource Asthma Resource Project; SNP: Single nucleotide polymorphism; SPT: Skin prick test; YRI: Yoruba in Ibadan, Nigeria

Funding
This work was supported by NIH Grant R01HL132344 and R03HL133713, Health Disparities Award of the Cincinnati Children’s Research Foundation, the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program, grant 1UL1TR001425–01, Methods grant from the Center for Clinical and Translational Science and Training, Cincinnati Children’s Hospital Medical Center. The funding body had no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.

Availability of data and materials
The datasets described in this manuscript were obtained from dbGaP through dbGaP accession number phs000166.v2.p1.

Authors’ contributions
LD conceptualized and designed the study, carried out and supervised the analyses, drafted the manuscript, and approved the final manuscript as submitted. DL carried out the initial analyses, reviewed and revised the manuscript, and approved the final manuscript as submitted. MW carried out the analyses, reviewed and revised the manuscript, and approved the final manuscript as submitted. MA supervised data analyses, critically reviewed the manuscript, and approved the final manuscript as submitted. TM conceptualized
17.
Amat F, Saint-Pierre P, Bourrat E, Nemni A, Couderc R, Boutmy-Deslandes E, and designed the study, critically reviewed the manuscript, and approved the et al. Early-onset atopic dermatitis in children: which are the phenotypes at final manuscript as submitted. risk of asthma? Results from the ORCA cohort. PLoS One. 2015;10(6):e0131369. 18. Pillai SG, Tang Y, van den Oord E, Klotsman M, Barnes K, Carlsen K, et al. Ethics approval and consent to participate Factor analysis in the genetics of asthma international network family study Not applicable. identifies five major quantitative asthma phenotypes. Clin Exp Allergy. 2008;38(3):421–9. Competing interests 19. Weinmayr G, Keller F, Kleiner A, du Prel JB, Garcia-Marcos L, Batllés-Garrido J, The authors declare that they have no competing interests. et al. Asthma phenotypes identified by latent class analysis in the ISAAC phase II Spain study. Clin Exp Allergy. 2013;43(2):223–32. 20. Cherniack R, Adkinson NF, Strunk R, Szefler S, Tonascia J, Weiss S, et al. The Publisher’sNote childhood asthma management program (CAMP): design, rationale, and Springer Nature remains neutral with regard to jurisdictional claims in methods. Control Clin Trials. 1999;20(1):91–120. published maps and institutional affiliations. 21. Ding L, Abebe T, Beyene J, Wilke RA, Goldberg A, Woo JG, et al. Rank-based genome-wide analysis reveals the association of ryanodine receptor-2 gene Author details variants with childhood asthma among human populations. Hum Genomics. Division of Biostatistics and Epidemiology, Department of Pediatrics, 2013;7:16. Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA. 22. Gower JC. A general coefficient of similarity and some of its properties. Alzheimer’s Therapeutic Research Institute, Keck School of Medicine, Biometrics. 1971;27:857–74. University of Southern California, San Diego, CA, USA. Division of Asthma 23. Ward JH Jr. Hierarchical grouping to optimize an objective function. 
J Am Research, Department of Pediatrics, Cincinnati Children’s Hospital Medical Stat Assoc. 1963;58:236–44. Center, University of Cincinnati, 3333 Burnet Ave, Cincinnati, OH 45229, USA. 24. Milligan GW, Cooper MC. An examination of procedures for determining the number of clusters in a data set. Psychometrika. 1985;50(2):159–79. Received: 9 June 2017 Accepted: 15 May 2018 25. Cooper MC, Milligan GW. The effect of error on determining the number of clusters. Proceedings of the International Workshop on Data Analysis, Decision Support and Expert Knowledge Representation in Marketing and References Related Areas of Research; 1988. p. 319–28. 1. Borish L, Culp JA. Asthma: a syndrome composed of heterogeneous diseases. 26. Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: a conditional Ann Allergy Asthma Immunol. 2008;101(1):1–8. quiz -11, 50 inference framework. J Comput Graph Stat. 2006;15(3):651–74. 2. Siroux V, Garcia-Aymerich J. The investigation of asthma phenotypes. Curr 27. Team RDC. R: a language and environment for statistical computing. R Opin Allergy Clin Immunol. 2011;11(5):393–9. Foundation for Statistical Computing: Vienaa; 2010. 3. Yeatts K, Sly P, Shore S, Weiss S, Martinez F, Geller A, et al. A brief targeted 28. Alexander DH, Novembre J, Lange K. Fast model-based estimation of review of susceptibility factors, environmental exposures, asthma incidence, ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64. and recommendations for future asthma incidence research. Environ Health 29. Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, et al. The Perspect. 2006;114(4):634–40. NHGRI GWAS catalog, a curated resource of SNP-trait associations. Nucleic 4. Guerra S, Martinez FD. Asthma genetics: from linear to multifactorial Acids Res. 2014;42(Database issue):D1001–6. approaches. Annu Rev Med. 2008;59:327–41. 30. Pritchard JK, Stephens M, Donnelly P. Inference of population structure 5. 
Lotvall J, Akdis CA, Bacharier LB, Bjermer L, Casale TB, Custovic A, et al. using multilocus genotype data. Genetics. 2000;155(2):945–59. Asthma endotypes: a new approach to classification of disease entities 31. Wenzel SE. Asthma phenotypes: the evolution from clinical to molecular within the asthma syndrome. J Allergy Clin Immun. 2011;127(2):355–60. approaches. Nat Med. 2012;18(5):716–25. 6. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, 32. Fitzpatrick AM, Teague WG, Meyers DA, Peters SP, Li XN, Li HS, et al. Heterogeneity et al. Finding the missing heritability of complex diseases. Nature. of severe asthma in childhood: confirmation by cluster analysis of children 2009;461(7265):747–53. in the National Institutes of Health/National Heart, Lung, and Blood Institute 7. Akinbami LJ, Schoendorf KC, Parker J. US childhood asthma prevalence severe asthma research program. J Allergy Clin Immun. 2011;127(2):382–U973. estimates: the impact of the 1997 National Health Interview Survey redesign. 33. Mersha TB, Abebe T. Self-reported race/ethnicity in the age of genomic Am J Epidemiol. 2003;158(2):99–104. research: its potential impact on understanding health disparities. Hum 8. Gamble C, Talbott E, Youk A, Holguin F, Pitt B, Silveira L, et al. Racial Genomics. 2015;9:1. differences in biologic predictors of severe asthma: data from the severe 34. Salam MT, Avoundjian T, Knight WM, Gilliland FD. Genetic ancestry and asthma research program. J Allergy Clin Immunol. 2010;126(6):1149–56. e1 asthma and rhinitis occurrence in Hispanic children: findings from the 9. Green RH, Brightling CE, Bradding P. The reclassification of asthma based on Southern California Children's health study. PLoS One. 2015;10(8):e0135384. subphenotypes. Curr Opin Allergy Clin Immunol. 2007;7(1):43–50. 35. Rumpel JA, Ahmedani BK, Peterson EL, Wells KE, Yang M, Levin AM, et al. 10. Haldar P, Pavord ID, Shaw DE, Berry MA, Thomas M, Brightling CE, et al. 
Genetic ancestry and its association with asthma exacerbations among African Cluster analysis and clinical asthma phenotypes. Am J Respir Crit Care Med. American subjects with asthma. J Allergy Clin Immunol. 2012;130(6):1302–6. 2008;178(3):218–24. 36. Pino-Yanes M, Thakur N, Gignoux CR, Galanter JM, Roth LA, Eng C, et al. 11. Just J, Gouvis-Echraghi R, Rouve S, Wanin S, Moreau D, Annesi-Maesano I. Genetic ancestry influences asthma susceptibility and lung function among Two novel, severe asthma phenotypes identified during childhood using a Latinos. J Allergy Clin Immunol. 2015;135(1):228–35. clustering approach. Eur Respir J. 2012;40(1):55–60. 37. Ortega VE, Kumar R. The effect of ancestry and genetic variation on lung 12. Kim TB, Jang AS, Kwon HS, Park JS, Chang YS, Cho SH, et al. Identification of function predictions: what is “normal” lung function in diverse human asthma clusters in two independent Korean adult asthma cohorts. Eur populations? Curr Allergy Asthma Rep. 2015;15(4):516. Respir J. 2013;41(6):1308–14. 38. Vergara C, Murray T, Rafaels N, Lewis R, Campbell M, Foster C, et al. African 13. Moore WC, Meyers DA, Wenzel SE, Teague WG, Li HS, Li XN, et al. Identification ancestry is a risk factor for asthma and high Total IgE levels in African of asthma phenotypes using cluster analysis in the severe asthma research admixed populations. Genet Epidemiol. 2013;37(4):393–401. program. Am J Respir Crit Care Med. 2010;181(4):315–23. 39. Menezes AM, Wehrmeister FC, Hartwig FP, Perez-Padilla R, Gigante DP, 14. Siroux V, Basagana X, Boudier A, Pin I, Garcia-Aymerich J, Vesin A, et al. Barros FC, et al. African ancestry, lung function and the effect of genetics. Identifying adult asthma phenotypes using a clustering approach. Eur Respir Eur Respir J. 2015;45(6):1582–9. J. 2011;38(2):310–7. 15. Wardlaw AJ, Silverman M, Siva R, Pavord ID, Green R. Multi-dimensional 40. Brehm JM, Acosta-Perez E, Klei L, Roeder K, Barmada MM, Boutaoui N, et al. 
phenotyping: towards a new taxonomy for airway disease. Clin Exp Allergy. African ancestry and lung function in Puerto Rican children. J Allergy Clin 2005;35(10):1254–62. Immunol. 2012;129(6):1484–90. e6 16. Weatherall M, Travers J, Shirtcliffe PM, Marsh SE, Williams MV, Nowitz MR, 41. Chen W, Brehm JM, Boutaoui N, Soto-Quiros M, Avila L, Celli BR, et al. Native et al. Distinct clinical phenotypes of airways disease defined by cluster American ancestry, lung function, and COPD in Costa Ricans. Chest. 2014; analysis. Eur Respir J. 2009;34(4):812–8. 145(4):704–10. Ding et al. BMC Medical Genomics (2018) 11:51 Page 11 of 11 42. Kumar R, Seibold MA, Aldrich MC, Williams LK, Reiner AP, Colangelo L, et al. Genetic ancestry in lung-function predictions. N Engl J Med. 2010;363(4):321–30. 43. Moffatt MF, Gut IG, Demenais F, Strachan DP, Bouzigon E, Heath S, et al. A large-scale, consortium-based genomewide association study of asthma. N Engl J Med. 2010;363(13):1211–21. 44. Sleiman PM, Annaiah K, Imielinski M, Bradfield JP, Kim CE, Frackelton EC, et al. ORMDL3 variants associated with asthma susceptibility in north Americans of European ancestry. J Allergy Clin Immunol. 2008;122(6):1225–7. 45. Howrylak JA, Fuhlbrigge AL, Strunk RC, Zeiger RS, Weiss ST, Raby BA, et al. Classification of childhood asthma phenotypes and long-term clinical responses to inhaled anti-inflammatory medications. J Allergy Clin Immunol. 2014;133(5):1289–300. 300 e1-12 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png BMC Medical Genomics Springer Journals

African ancestry is associated with cluster-based childhood asthma subphenotypes

Publisher
Springer Journals
Copyright
Copyright © 2018 by The Author(s).
Subject
Biomedicine; Human Genetics; Microarrays; Gene Expression
eISSN
1755-8794
D.O.I.
10.1186/s12920-018-0367-5

Abstract

Background: Childhood asthma is a syndrome composed of heterogeneous phenotypes; furthermore, intrinsic biologic variation among racial/ethnic populations suggests possible genetic ancestry variation in childhood asthma. The objective of the study is to identify clinically homogeneous asthma subphenotypes in a diverse sample of asthmatic children and to assess subphenotype-specific genetic ancestry in African-American asthmatic children. Methods: A total of 1211 asthmatic children, including 813 in the Childhood Asthma Management Program and 398 in the Childhood Asthma Research and Education program, were studied. Unsupervised cluster analysis on clinical phenotypes was conducted to identify homogeneous subphenotypes. Subphenotype-specific genetic ancestry was estimated for 167 African-American asthmatic children. Associations of genetic ancestry with subphenotypes and clinical phenotypes were determined. Results: Three distinct subphenotypes were identified: a moderate atopic dermatitis (AD) group with negative skin prick test (SPT) and preserved lung function; a high AD group with positive SPT and airway hyperresponsiveness; and a low AD group with positive SPT and lower lung function. African ancestry at asthma genome-wide association study (GWAS) SNPs differed between subphenotypes (64, 89, and 94% for the three subphenotypes, respectively) and was inversely correlated with AD; each additional 10% increase in African ancestry was associated with 1.5-fold higher IgE and 6.3-fold higher odds of positive SPT (all p-values < 0.0001). Conclusions: By conducting phenotype-based cluster analysis and assessing subphenotype-specific genetic ancestry, we were able to identify homogeneous subphenotypes for childhood asthma that showed significant variation in genetic ancestry of African-American asthmatic children. This finding demonstrates the utility of these complementary approaches to understand and refine childhood asthma subphenotypes and enable more targeted therapy.
Keywords: Childhood asthma, Cluster analysis, Genetic ancestry, Subphenotypes

* Correspondence: tesfaye.mersha@cchmc.org
Division of Asthma Research, Department of Pediatrics, Cincinnati Children's Hospital Medical Center, University of Cincinnati, 3333 Burnet Ave, Cincinnati, OH 45229, USA
Full list of author information is available at the end of the article
© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Ding et al. BMC Medical Genomics (2018) 11:51

Background
Childhood asthma is a heterogeneous chronic airway disease with various clinical phenotypes [1, 2]. Its phenotypic and biologic heterogeneity contributes to the challenges clinicians face in its diagnosis and effective management [3]. It is therefore crucial to clearly define subphenotypes of asthma with homogeneous clinical characteristics in order to search for better asthma management and to develop novel therapeutic strategies. Although a large number of clinical phenotypes are often collected in childhood asthma studies, asthma genetic studies have mostly focused on case-control disease status. Such an endpoint-based analysis ignores the complexity of the asthma phenotype [4–6]. In addition, although there is ample evidence for an intrinsic genetic variation among racial/ethnic populations [7, 8], suggesting possible genetic ancestry variation in childhood asthma, most genetic analyses rely on self-reported race and thus do not account for the potential contribution of genetic ancestry to disease variation in diverse populations.

An approach to overcome the phenotypic heterogeneity of childhood asthma is to identify homogeneous subgroups by establishing either a classical "endotype", based on experts' criteria, or statistical phenotype clustering on asthma clinical phenotypes. The latter has been successfully applied to identify clinically relevant subgroups of asthmatics and other airway diseases [9–17]. However, these studies differ in some key elements: variation in phenotyping, analytical approaches used, and the patient population under study. These differences limit the comparability of the identified subphenotypes and pose difficulty in applying clustering results to individual patients. Furthermore, little is understood regarding the genetic ancestry of the identified subphenotypes.

The objective of the study is to investigate childhood asthma phenotypic heterogeneity and genetic ancestry variations and their relationships. Specifically, we used childhood asthma data from the NIH controlled database of Genotype and Phenotype (dbGaP) to identify homogeneous subphenotypes, determine clinical phenotypes, estimate subphenotype-specific genetic ancestry, and analyze the relationship between ancestry and subphenotypes using a stepwise approach incorporating cluster analysis, classification tree analysis, and genetic ancestry analyses [9–16, 18, 19]. Our goal is to combine both cluster and genetic ancestry analyses to identify biologically relevant subphenotypes in childhood asthma.

Methods
Data
The database of Genotypes and Phenotypes (dbGaP) is the repository for both genotype and phenotype data from most NIH-funded GWAS and other whole-genome or exome sequence data. We used baseline data from the SNP Health Association Resource (SHARe) Asthma Resource Project (SHARP) (phs000166.v2.p1), the National Heart, Lung, and Blood Institute's clinical research trials on asthma, specifically, the Childhood Asthma Management Program (CAMP) and the Childhood Asthma Research and Education (CARE) network. CAMP is a multicenter, randomized, double-masked clinical trial designed to determine the long-term effects of three inhaled treatments for mild to moderate childhood asthma [20]. The CARE network evaluates current and novel therapies and management strategies for children with asthma. Individual-level data with asthma diagnosis are available for 1211 subjects through Authorized Access, including 813 in CAMP and 398 in CARE.

An array of phenotypic variables has been harmonized across the CAMP and CARE datasets, including demographics and participant characteristics; intermediate asthma phenotypes such as lung function, skin prick test (SPT), serum total immunoglobulin E (IgE), and atopic dermatitis (AD); as well as environmental exposure. See Table 1 for a complete list of variables.

We downloaded CAMP and CARE genotype data, which were generated using 1 million single nucleotide polymorphisms (SNPs) on the Affymetrix 6.0 chip and stored in dbGaP (accession number phs000166.v2.p1). Quality control criteria included minor allele frequency ≥ 0.05, Hardy-Weinberg equilibrium (p ≥ 10⁻⁵), ≤ 5% missing rate per person, ≤ 5% missing rate per SNP, families with less than 5% Mendel errors, and SNPs with less than 10% Mendel error rate [21].

Table 1 Demographic, clinical phenotypes and environmental exposures of CAMP and CARE study participants
Variable | CAMP (N = 813) | CARE (N = 398) | p-value
Age, Mean (SD), years | 8.9 (2.1) | 10.6 (2.8) | < 0.0001
Gender, No. (%) | | | 0.8152
  Male | 500 (61.5) | 242 (60.8) |
  Female | 313 (38.5) | 156 (39.2) |
Race, No. (%) | | | < 0.0001
  Caucasian | 557 (68.5) | 215 (54) |
  African American | 107 (13.2) | 70 (17.6) |
  Hispanic | 77 (9.5) | 78 (19.6) |
  Other | 72 (8.9) | 35 (8.8) |
BMIZ at baseline, Mean (SD) | 0.5 (1.0) | 0.8 (1.0) | < 0.0001
Age of onset, Mean (SD), years | 3.0 (2.4) | 3.7 (3.3) | < 0.0001
FEV1 PC20 meth, Mean (SD), mg/ml | 2.0 (2.4) | 2.2 (3.1) | 0.3602
FEV1 percent predicted, Mean (SD) | 93.4 (14.1) | 97.1 (12.8) | < 0.0001
FVC percent predicted, Mean (SD) | 103.7 (13.1) | 106.7 (12.2) | 0.0002
FEV1/FVC ratio, Mean (SD) | 79.6 (8.3) | 80.1 (8.0) | 0.2937
Bronchodilator percent change, Mean (SD) | 10.7 (9.9) | 9.4 (8.4) | 0.0236
Blood eosinophils, Mean (SD), mm³ | 485.7 (409.2) | 408.8 (319.5) | 0.0011
IgE, Mean (SD), ng/ml | 1129.8 (2081.9) | 330.6 (445.4) | < 0.0001
Average AM peak flow, Mean (SD), L/min | 250.9 (64.4) | 271.1 (92.4) | < 0.0001
Average AM symptoms, Mean (SD) | 0.61 (0.45) | 0.51 (0.40) | < 0.0001
Environmental smoke, No. (%) | 339 (41.7) | 166 (41.7) | 0.0256
In utero smoke, No. (%) | 107 (13.2) | 54 (13.6) | 0.8060
Atopic dermatitis, No. (%) | 199 (24.4) | 155 (38.9) | < 0.0001
One or more positive SPT, No. (%) | 716 (88.1) | 312 (78.4) | 0.0002
Table 1 notes:
Age of onset: age at first asthma symptoms.
FEV1 PC20 meth: the dose of methacholine that is required to decrease FEV1 by 20%.
FEV1 percent predicted: forced expiratory volume, the maximal amount of air one can forcefully exhale in one second, converted to a percentage of normal based on one's height, weight, body composition, and race.
FVC percent predicted: forced vital capacity, the amount of air a person can expire after a maximum inspiration, converted to a percentage of normal based on one's height, weight, body composition, and race.
FEV1/FVC ratio: also called the Tiffeneau-Pinelli index, a calculated ratio used in the diagnosis of obstructive and restrictive lung disease. It represents the proportion of a person's vital capacity that they are able to expire in the first second of expiration.
Bronchodilator percent change: post-bronchodilator percent change from baseline, 100*(POSFEV - PREFEV)/PREFEV.
Average AM peak flow: the maximum flow rate generated during a forceful exhalation, starting from full lung inflation; average of daily measurements up to 4 weeks prior to visit with a minimum of 7 days, recorded in daily diary card.
Average AM symptoms: maximum of daily wheezing and coughing, then average of daily measurements up to 4 weeks prior to visit with a minimum of 7 days, recorded in daily diary card.
Environmental smoke: either parent smoked during trial or home exposure to smoke prior to trial enrollment.
In utero smoke: mother smoked when pregnant with participant.
Atopic dermatitis: child had atopic dermatitis for 2 years and was seen by a doctor for it.
One or more positive SPT: one or more skin prick tests positive.

Hierarchical cluster analysis (HCA)
HCA is a hypothesis-free statistical method that groups subjects into relatively homogeneous sub-clusters according to similarity quantification based on a set of critical characteristic variables. The grouping is constructed such that the similarity is strong between members of the same cluster and weak between members of different clusters. The baseline phenotypic measures listed in Table 1 were included in the cluster analysis. To reduce collinearity, we examined the variables for absolute correlation (> 0.80). We also assessed the missing pattern of the phenotypes and planned to exclude measures with ≥ 10% missingness from the analysis. Blood eosinophils (EOS) and IgE were log transformed.

Since we have mixed types of variables, i.e., continuous and categorical, Gower's distance [22] was used as a similarity index. To avoid inconsistent cluster solutions due to changes in scale of the variables and heavy impact of variables with larger standard deviations, Gower's standardization, based on the range, was applied. HCA was then carried out with Ward's minimum-variance method [23]. Consensus between a pseudo F and a pseudo t statistic [24, 25] was used to select the number of clusters. The number of clusters was also guided by clinical characteristics in addition to statistical considerations.

Descriptive statistics of all variables were obtained and compared across clusters using analysis of variance, Kruskal-Wallis, or Chi-square tests as appropriate. Conditional inference trees [26], a non-parametric class of regression trees that embeds tree-structured regression models into a well-defined theory of conditional inference procedures, were used to identify intermediate phenotypes that distinguish the subphenotypes. The cluster analysis was first carried out on the CAMP data and repeated on the CARE data. Replication of the clustering results was examined between the two studies as well as with previously published studies.

Additional analyses were run to investigate if the subphenotypes were associated with clinical outcomes. Two clinical outcomes were examined: number of prednisone bursts (an anti-inflammatory oral steroid medication) since last visit, and number of ER visits or hospitalizations since last visit. Number of prednisone bursts since last visit was modeled as a count variable using Poisson regression with a random subject effect. Number of ER visits or hospitalizations since last visit was dichotomized (given that over 95% of the subjects did not have an ER visit or hospitalization) and modeled using logistic regression with a random subject effect. Potential covariates included age, sex, race, visit month, time since last visit, treatment, and subphenotypes that were significantly associated with the outcome (adjusted p-value < 0.05). All analyses were run for CAMP and CARE data separately. All the above analyses were conducted in SAS version 9.3 (SAS Institute Inc., Cary, NC, USA) and R [27].

Genetic ancestry analysis
Genetic ancestry was estimated using both genome-wide SNPs and asthma-specific GWAS SNPs for African-American asthmatic individuals in CAMP and CARE. The supervised approach in the ADMIXTURE software program [28] was used to estimate global genetic ancestry, where SNP data of 108 YRI (Yoruba in Ibadan, Nigeria) and 99 CEU (Utah Residents (CEPH) with Northern and Western Ancestry) individuals from the 1000 Genomes Project were included as surrogates for African and European ancestry, respectively. The reference populations and the CAMP/CARE subjects shared 857,127 genetic markers across all autosomes, which reduced to 225,374 SNPs after linkage disequilibrium (LD) pruning with a window of 50 kb, a window shift of 10 kb, and an r² value of 0.2. Asthma GWAS SNPs, 157 in total, were retrieved from the GWAS catalog [29], and STRUCTURE software [30] was used to estimate African ancestry proportion at asthma GWAS SNPs. CEU and YRI individuals from the 1000 Genomes Project were used as parental populations.

Correlations between genetic ancestry and the subphenotypes derived by clustering and the discriminating factors of the subphenotypes were examined using the Kruskal-Wallis test, Wilcoxon rank-sum test, Spearman correlation coefficient, or linear regression as appropriate.

Results
Participants from CAMP and CARE differed in all characteristics except sex, exposure to in utero smoking, PC20, and FEV1/FVC ratio (Table 1). All pairwise Spearman correlation coefficients were less than 0.60, except between FEV1 percent predicted and FVC percent predicted (0.71) and between FEV1/FVC and maximum bronchodilator percent change (−0.65). No variables had more than 10% of missing values.

HCA identified distinct subphenotypes
Clustering on CAMP cohort identified distinct subphenotypes
Three clusters were identified from CAMP data (Table 2). Members of cluster 1 had a moderate AD rate (15.3%) and all but one had negative SPT (99%). This group also had the lowest age at baseline, age at onset of asthma, bronchodilator percent change, EOS, IgE level, AM peak flow, and AM symptoms, and the highest body mass index z-score (BMIZ), PC20, FEV1 percent predicted, and FEV1/FVC ratio. All these characteristics, except BMIZ and AM symptoms, were statistically different across the clusters at a significance level of 0.05. This is the moderate AD group with negative SPT and preserved lung function.

Members of cluster 2 had a high rate of AD (97.7%) and all had one or more positive SPT. This group also had the highest EOS and IgE level, and lowest bronchodilator percent change among the 3 clusters. This is the high AD group with positive SPT and airway hyperresponsiveness.

Members of cluster 3 had the highest age at baseline and age at onset of asthma and the lowest BMIZ. This group also had the lowest FEV1 percent predicted and FEV1/FVC ratio, and the highest AM symptoms. Furthermore, members of cluster 3 were mostly AD free and all had one or more positive SPT, moderate EOS and IgE levels, but lower lung function measures and higher AM symptoms compared to the other clusters. This is the low AD group with positive SPT and lower lung function.

Table 2 CAMP hierarchical clustering results
Variable | Cluster 1 (N = 98) | Cluster 2 (N = 171) | Cluster 3 (N = 544) | p-value
Age (years) | 7.8 (1.9) | 8.7 (2.1) | 9.2 (2.1) | < 0.0001
Gender No. (%) | | | | 0.0675
  Male | 50 (51.0) | 105 (61.4) | 345 (63.4) |
  Female | 48 (49.0) | 66 (38.6) | 199 (36.6) |
Race No. (%) | | | | 0.0153
  Caucasian | 82 (83.7) | 116 (67.8) | 359 (66.0) |
  African American | 9 (9.2) | 25 (14.6) | 73 (13.4) |
  Hispanic | 5 (5.1) | 12 (7.0) | 60 (11.0) |
  Other | 2 (2.0) | 18 (10.5) | 52 (9.6) |
BMIZ | 0.7 (1.0) | 0.6 (1.1) | 0.5 (1.0) | 0.0929
Age of onset (years) | 2.4 (2.2) | 2.8 (2.2) | 3.2 (2.5) | 0.0017
FEV1 PC20 meth (mg/ml) | 2.9 (2.8) | 1.8 (2.2) | 2.0 (2.4) | 0.0005
FEV1 percent predicted | 96.3 (14.5) | 95.0 (14.5) | 92.4 (13.9) | 0.0117
FVC percent predicted | 103.5 (14.3) | 104.1 (13.6) | 103.6 (12.7) | 0.895
FEV1/FVC ratio | 82.9 (7.0) | 80.6 (8.2) | 78.7 (8.3) | < 0.0001
Bronchodilator percent change | 7.3 (6.9) | 11.4 (10.1) | 11.1 (10.1) | 0.0012
Blood eosinophils (mm³) | 228.9 (197.9) | 579.7 (442.4) | 504 (408.8) | < 0.0001
IgE (ng/ml) | 200.5 (449.1) | 1579 (2624.2) | 1161 (2022.7) | < 0.0001
Average AM peak flow (L/min) | 230.9 (55) | 249.8 (67.5) | 254.8 (64.4) | 0.0040
Average AM symptoms | 0.52 (0.40) | 0.61 (0.46) | 0.63 (0.45) | 0.100
Environmental smoke No. (%) | 42 (42.9) | 60 (35.1) | 237 (44.1) | 0.1291
In utero smoke No. (%) | 19 (19.4) | 11 (6.4) | 77 (14.2) | 0.0046
Atopic dermatitis No. (%) | 15 (15.3) | 167 (97.7) | 17 (3.1) | < 0.0001
Positive SPT No. (%) | 1 (1) | 171 (100) | 544 (100) | < 0.0001
Mean and SD for continuous variables and No. (%) for categorical variables

Clustering on CARE cohort replicated the subphenotypes identified in CAMP
Three clusters were identified in CARE (Table 3). Members of cluster 1 had a moderate rate of AD (35%) and none of them had a positive SPT. This group also had the lowest bronchodilator percent change, EOS, IgE, AM peak flow, and AM symptoms. All these characteristics, except the last, were statistically different across the clusters at a significance level of 0.05. This is the moderate AD group with negative SPT and preserved lung function similarly identified in CAMP.

Members of cluster 2 had a high rate of AD (98.4%) and one or more positive SPT (95.3%). This group also had the highest EOS and IgE level among the 3 clusters. This is the high AD asthma group with positive SPT and airway hyperresponsiveness similarly identified in CAMP.

Members of cluster 3 had the highest age at baseline and age at onset of asthma, were mostly AD free (3.3%), mostly had one or more positive SPT (92.2%), and had moderate EOS and IgE levels but higher AM symptoms compared to the other clusters. This is the low AD group with positive SPT and lower lung function similarly identified in CAMP.

Table 3 CARE hierarchical clustering results
Variable | Cluster 1 (N = 60) | Cluster 2 (N = 129) | Cluster 3 (N = 209) | p-value
Age (years) | 10.1 (2.4) | 10.1 (2.5) | 11.0 (3.1) | 0.0124
Gender No. (%) | | | | 0.1185
  Male | 30 (50) | 77 (59.7) | 135 (64.6) |
  Female | 30 (50) | 52 (40.3) | 74 (35.4) |
Race No. (%) | | | | 0.4519
  Caucasian | 40 (66.7) | 67 (51.9) | 108 (51.7) |
  African American | 8 (13.3) | 24 (18.6) | 38 (18.2) |
  Hispanic | 8 (13.3) | 24 (18.6) | 46 (22.0) |
  Other | 4 (6.7) | 14 (10.9) | 17 (8.1) |
BMIZ | 0.9 (0.9) | 0.8 (1.0) | 0.8 (1.0) | 0.5920
Age of onset (years) | 3.6 (3.5) | 3.1 (2.6) | 4.1 (3.5) | 0.0215
FEV1 PC20 meth (mg/ml) | 3.3 (3.3) | 1.6 (2.4) | 2.3 (3.4) | 0.0031
FEV1 percent predicted | 97.2 (13.4) | 96.3 (13.1) | 97.6 (12.5) | 0.655
FVC percent predicted | 104.7 (10.7) | 107.2 (12.5) | 106.9 (12.3) | 0.378
FEV1/FVC ratio | 81.6 (8.5) | 79.0 (8.0) | 80.4 (7.9) | 0.101
Bronchodilator percent change | 6.7 (7.4) | 9.9 (7.4) | 9.8 (9.0) | 0.0271
Blood eosinophils (mm³) | 245.7 (211.5) | 444.4 (322.1) | 435.0 (330.2) | < 0.0001
IgE (ng/ml) | 63.5 (133.9) | 424.5 (537.1) | 347.4 (430.1) | < 0.0001
Average AM peak flow (L/min) | 255.4 (68.7) | 258.6 (81.0) | 283.3 (102.3) | 0.0209
Average AM symptoms | 0.43 (0.32) | 0.50 (0.40) | 0.53 (0.42) | 0.202
Environmental smoke No. (%) | 28 (46.7) | 62 (48.1) | 104 (49.8) | 0.8985
In utero smoke No. (%) | 1 (1.7) | 18 (14.0) | 35 (16.9) | 0.0121
Atopic dermatitis No. (%) | 21 (35) | 127 (98.4) | 7 (3.3) | < 0.0001
Positive SPT No. (%) | 0 (0) | 123 (95.3) | 189 (92.2) | < 0.0001
Mean and SD for continuous variables and No. (%) for categorical variables

Atopic dermatitis status and SPT distinguished the subphenotypes
Conditional inference tree analysis revealed that, in both CAMP and CARE data, AD and one or more positive SPT were the top two variables that best discriminated the individuals into the subphenotypes (Fig. 1, prediction accuracy 95.8%). Given the consistent findings across CAMP and CARE data, we combined the two datasets and grouped the three clusters individually identified in CAMP and CARE into three subphenotypes. One subphenotype was the moderate AD group with negative SPT and preserved lung function (subphenotype 1, n = 158), one was the high AD group with positive SPT and airway hyperresponsiveness (subphenotype 2, n = 300), and one was the low AD group with positive SPT and lower lung function (subphenotype 3, n = 753).

Fig. 1 Conditional inference tree analysis of the three subphenotypes. SPT and atopic dermatitis are the top two factors distinguishing the subphenotypes. The prediction accuracy is 95.8%

Subphenotypes were associated with clinical outcomes
Table 4 shows the association between the subphenotypes and clinical outcomes. In CAMP data, the incidence rate of prednisone bursts since last visit for subphenotype 2 is 2.63 (1.45, 2.70) times the incidence rate for subphenotype 1, and the incidence rate of prednisone bursts since last visit for subphenotype 3 is 2.04 (1.56, 2.70) times the incidence rate for subphenotype 1. Also in CAMP data, the odds of any ER visit or hospitalization since last visit for subphenotype 3 are 1.54 (1.01, 2.23) times the odds for subphenotype 1. For CARE data, the

Genetic ancestry proportion varied at asthma GWAS SNPs among asthma subphenotypes
The three subphenotypes had 15, 49, and 103 African American individuals, respectively. Global African ancestry proportion varied from 71.2 to 100% with mean 96.6% and standard deviation (SD) 7.2%. Higher global African ancestry was associated with AD (mean ± SD of African origin is 0.96 ± 0.08 for AD-free vs. 0.98 ± 0.06 for AD subjects, p-value = 0.0294), but not with other clinical phenotypes. Proportion of African ancestry at asthma GWAS SNPs was correlated with the subphenotypes (mean 64.9, 89.4 and 94.4% for subphenotypes 1, 2, and 3, respectively, p-value < 0.0001, Figs. 2 and 3(a)). The subphenotypes were associated with lung function: FEV1 percent predicted is 96.8 ± 14.1, 95.3 ± 13.9, and 93.9 ± 13.7 (p-value = 0.0083); and FEV1/FVC ratio is 81.9 ± 7.6, 80.5 ± 8.1, and 79.0 ± 8.2 (p-value < 0.0001) for subphenotypes 1, 2, and 3, respectively. Furthermore, African ancestry at asthma GWAS SNPs was inversely associated with AD (median 0.95 with IQR (0.93, 0.95) for AD-free vs. 0.92 (0.89, 0.94) for AD subjects, p-value < 0.0001, Fig. 3(b)).
One subphenotype was the moderate AD group try proportion varies from 71.2 to 100% with mean with negative SPT and preserved lung function (subphe- 96.6% and standard deviation (SD) 7.2%. Higher global notype 1, n = 158), one was the high AD group with po- African ancestry was associated with AD (mean ± SD of sitive SPT and airway hyperresponsiveness (subphenotype African origin is 0.96 ± 0.08 for AD free vs. 0.98 ± 0.06 2, n = 300), and one was the low AD group with positive for AD subjects, p-value = 0.0294), but not with other SPT and lower lung function (subphenotype 3, n =753). clinical phenotypes. Proportion of African ancestry at asthma GWAS SNPs was correlated with the subpheno- Subphenotypes were associated clinical outcomes types (mean 64.9, 89.4 and 94.4% for subphenotypes 1, Table 4 shows the association between the subpheno- 2, and 3, respectively, p-value < 0.0001, Figs. 2 and 3(a)). types and clinical outcomes. In CAMP data, the incident The subphenotypes were associated with lung function: rate of prednisone bursts since last visit for subpheno- FEV1 percent predicted is 96.8 ± 14.1, 95.3 ± 13.9, and type 2 is 2.63 (1.45, 2.70) times the incident rate for sub- 93.9 ± 13.7 (p-value = 0.0083); and FEV1/FVC ratio is phenotype 1, and the incident rate of prednisone bursts 81.9 ± 7.6, 80.5 ± 8.1, and 79.0 ± 8.2 (p-value < 0.0001) since last visit for subphenotype 3 is 2.04 (1.56, 2.70) for subphenotypes 1, 2, and 3, respectively. Furthermore, times the incident rate for subphenotype 1. Also in African ancestry at asthma GWAS SNPs was inversely CAMP data, the odds of any ER visit or hospitalizations associated with AD (median 0.95 with IQR (0.93, 0.95) since last visit for subphenotype 3 is 1.54 (1.01, 2.23) for AD free vs. 0.92 (0.89, 0.94) for AD subjects, p-value times the odds for subphenotype 1. For CARE data, the < 0.0001, Fig. 3(b)). 
Additionally, genetic ancestry at odds of any ER visit or hospitalizations since last visit asthma GWAS SNPs was associated with positive SPT for subphenotype 2 is 0.32 (0.13, 0.98) times the odds with median and interquartile range (IQR) 0.94 (0.92, for subphenotype 1, and the odds of any ER visit or hos- 0.95) for positive SPT individuals vs. 0.74 with IQR pitalizations since last visit for subphenotype 3 is 3.45 (0.59, 0.78) for negative SPT individuals (p-value < (1.47, 7.69) times the odds for subphenotype 2. 0.0001, Fig. 3(c)). The odds of one or more positive SPT Ding et al. BMC Medical Genomics (2018) 11:51 Page 7 of 11 Table 4 Association between subphenotypes and number of prednisone bursts and any ER visit or hospitalizations since last visit Number of prednisone bursts since last visit CAMP Predicted number of event Incident rate ratios Subphenotype Estimate (95% CI) p-value Subphenotypes IRR (95% CI) p-value 1 0.10 (0.08, 0.13) < 0.0001 2 vs. 1 2.63 (1.45, 2.70) < 0.0001 2 0.20 (0.16, 0.24) 3 vs. 1 2.04 (1.56, 2.70) < 0.0001 3 0.20 (0.18, 0.22) 3 vs. 2 1.02 (0.83, 1.27) 0.8153 CARE Predicted number of event Incident rate ratios Subphenotype Estimate (95% CI) p-value Subphenotypes IRR (95% CI) p-value 1 0.08 (0.05, 0.14) 0.3534 2 vs. 1 0.93 (0.57, 1.54) 0.7880 2 0.08 (0.05, 0.12) 3 vs. 1 1.19 (0.76, 1.89) 0.4420 3 0.10 (0.07, 0.14) 3 vs. 2 1.28 (0.90, 1.82) 0.1666 Any ER visit or hospitalizations since last visit CAMP Predicted probability Odds ratios Subphenotype Estimate (95% CI) p-value Subphenotypes OR (95% CI) p-value 1 0.03 (0.02, 0.04) 0.1232 2 vs. 1 1.52 (0.95, 2.44) 0.0776 2 0.04 (0.03, 0.05) 3 vs. 1 1.54 (1.01, 2.33) 0.0434 3 0.04 (0.03, 0.04) 3 vs. 2 1.01 (0.75, 1.37) 0.9474 CARE Predicted probability Odds ratios Subphenotype Estimate (95% CI) p-value Subphenotypes OR (95% CI) p-value 1 0.02 (0.01, 0.05) 0.0155 2 vs. 1 0.35 (0.13, 0.98) 0.0458 2 0.01 (0.004, 0.02) 3 vs. 1 1.20 (0.56, 2.63) 0.6296 3 0.03 (0.02, 0.04) 3 vs. 
2 3.45 (1.47, 7.69) 0.0039 Fig. 2 Population ancestry estimates of African American asthmatic individuals in CAMP and CARE at asthma GWAS SNPs by subphenotypes. Dashed lines indicate average proportions of African ancestry proportion at the asthma GWAS SNPs. Ibadan, Nigeria (YRI) and northern and western European (CEU) from the 1000 Genomes project were used as parental populations Ding et al. BMC Medical Genomics (2018) 11:51 Page 8 of 11 Fig. 3 Boxplots and scatterplot of proportions of African ancestry at the asthma GWAS SNPs by: a subphenotypes, b Atopic dermatitis status, c SPT, and d IgE levels was 6.3 higher (95% confidence interval: (3.4, 13.8), knowledge, our study is the first to show the association p-value < 0.0001) with each additional 10% of African between genetic ancestry at asthma GWAS SNPs and origin at asthma GWAS SNPs. African origin at asthma cluster-based subphenotypes in childhood asthma. Lever- GWAS SNPs was also associated with IgE levels (Spear- aging ancestry and cluster analyses to derive genetic and man correlation coefficient = 0.27, p-value = 0.0004) and phenotypic homogeneity subgroups in childhood asthma IgE was 1.5 fold higher with each additional 10% of Afri- demonstrates the utility of these approaches to can origin (Fig. 3(d)). characterize and understand the complexity of asthma to- wards individual based precision medicine strategies. Discussion This study demonstrates that genetic ancestry at Current clinical practice in childhood asthma treatment asthma GWAS SNPs is more strongly associated with tends to use average patient care strategies. Such a “one asthma subgroups sharing similar clinical characteristics size fits all” treatment approach faces major challenges compared to broadly defined asthma. The results suggest when it is becoming clearer that childhood asthma is that validation of genetic studies are more likely to be heterogeneous in pathogenesis. 
Our unbiased cluster successful for replication studies carried-out in more and genetic ancestry analyses pointed toward three dis- homogeneous asthma cohorts (sharing similar clinical char- tinct phenotypic clusters with differences in clinical acteristics) compared to the multifactorial case-control sta- characteristics, genetic ancestry, and clinical outcomes, tus. In addition, the results indicate that ancestry-specific underscoring the clinical and genetic heterogeneity of genetic lociof asthmaare likelytobefound by focusing on asthma [10, 13, 17, 31]. Previous studies have also identi- better defined asthma patients. Furthermore, genetic ances- fied clusters with atopic or non-atopic asthma, clusters try analysis in homogeneous asthma subgroups is suitable with preserved or lower lung function, and clusters with to refine the biological role of asthma susceptibility variants mild asthma [13, 14, 32]. It is reassuring that the two in- from GWAS studies in a given phenotype. For example, dependent studies replicated the clustering results and SNPs at STARD3/PGAP3 arestronglyassociatedwiththe there are similarities with previous clustering-based high atopic dermatitis subgroup suggesting that STARD3/ childhood asthma subphenotypes. PGAP3 may act on the allergic component of asthma [43]. We determined genetic ancestry [33] using genome-wide Another example is that ORMDL3/17q locus is associated SNPs and asthma GWAS SNPs for African-American with asthma in multiple studies in the European ancestry asthmatic individuals in CAMP and CARE data. Our esti- but not in African ancestry asthmatic individuals [44]. We mate of African global ancestry in asthmatic children is also investigated associations between asthma GWAS SNPs higher than what has been reported in different general with the identified subphenotypes in CAMP and CARE populations confirming the higher prevalence of asthma in data (methods and results in Additional file 1:Table S1). 
individuals with higher African ancestry than others. Our Several significant associations were identified at p =0.05, results showed that genetic ancestry at asthma GWAS but none after multiplicity adjustment, possibly due to SNPs differed between the childhood asthma subpheno- smallsamplesizeand limitedstatistical power. types and was associated with lung function, SPT, IgE Our study had several limitations. First, participants in levels, and AD. Previous studies have also showed associ- CAMP and CARE represent studies of childhood ation between genetic ancestry and asthma prevalence and asthma, thus the results herein may not be applicable to related clinical phenotypes [34–42]. To our best adulthood asthma. Second, although we identified Ding et al. BMC Medical Genomics (2018) 11:51 Page 9 of 11 clinically relevant subphenotypes of childhood asthma childhood asthma using cluster analysis. Further genetic using clinical phenotypes [45], the integration of this re- ancestry analysis showed correlations between African sult with molecular and physiologic phenotyping may help ancestry at asthma GWAS SNPs and childhood asthma to better understand childhood asthma pathogenesis for subphenotypes and related clinical outcomes. Our re- possibly more personalized therapeutic strategies. Further- sults demonstrated that cluster analyses on clinical phe- more, subgroup analyses of asthma may limit sample sizes notypes followed by ancestry analysis can enhance the and impair statistical power. However, given asthma is a understanding of the phenotypic and genetic heterogen- highly heterogeneous phenotype, studying homogeneous eity of childhood asthma. Our approach is distinct from subgroups of asthma patients not only recovers power previous efforts in that we developed cluster-based sub- limitation, but achieves more statistically significant re- phenotype and applied genetic ancestry analysis to de- sults. 
Classifying asthma patients in more homogenous fine subphenotype-ancestry relationships which can be groups may help us to identify new susceptibility or modi- subsequently used as the basis of genetic ancestry based fying subphenotype-specific genes. Our ability to better clinical risk prediction. Our findings suggest that defin- define subtypes might help to predict who may respond to ing asthma heterogeneous subgroups on the basis of treatment vs subjects who may not. Future studies need to clinical phenotypes and genetic ancestry proportion is elucidate the mechanisms that distinguish each ancestral an essential step to understand and refine patient sub- and clinical clusters to facilitate the development of tar- sets and enable more targeted therapy. geted therapies and providing personalized treatments. The present study has notable strengths. First, we were Additional file able to dissect childhood asthma heterogeneity into sub- Additional file 1: Table S1. Association between asthma GWAS SNPs phenotypes using cluster analysis of clinical phenotypes and subphenotypes. This file contains association results between asthma in one study and replicate the findings in an independ- GWAS SNPs with the identified subphenotypes in CAMP and CARE data. ent study. Second, we were able to show associations be- (DOCX 24 kb) tween the identified subphenotypes with asthma clinical outcomes. Third, analysis of genetic ancestry at asthma Abbreviations GWAS SNPs in childhood asthma clinical phenotypes AD: Atopic dermatitis; BMIZ: Body mass index z-sore; CAMP: the Childhood Asthma Management Program; CARE: the Childhood Asthma Research and provide biologically relevant subphenotype-specific re- Education network; CEU: Utah Residents with Northern and Western sults. 
Lastly, our study used a more accurate and direct Ancestry; dbGaP: The database of Genotypes and Phenotypes; EOS: Blood assessment of genetic ancestry instead of self-reported eosinophils; FEV1: Forced expiratory volume, the maximal amount of air one can forcefully exhale in 1 s converted to a percentage of normal based on race to determine the relationship between ancestry and one’s height, weight, body composition, and race; FVC: Forced vital capacity, childhood asthma subphenotypes and relevant clinical the amount of air a person can expire after a maximum inspiration second phenotypes. Studies have shown that people with the converted to a percentage of normal based on one’s height, weight, body composition, and race; GWAS: Genome-wide association study; same self-reported race could have drastically different HCA: Hierarchical cluster analysis; IgE: Serum total immunoglobulin; levels of genetic ancestry, and self-reported race may not IQR: Interquartile range; LD: Linkage disequilibrium; PC20: The dose of be as accurate as direct assessment of genetic ancestry methacholine that is required to decrease FEV1 by 20%; SD: Standard deviation; SHARe: The SNP Health Association Resource; SHARP: The SNP in predicting treatment outcomes [33]. Future studies to Health Association Resource Asthma Resource Project; SNP: Single nucleotide identify genetic ancestry-specific variants associated with polymorphism; SPT: Skin prick test; YRI: Yoruba in Ibadan, Nigeria a specific subphenotype are important as we move to- Funding wards applying precision medicine paradigm. 
The find- This work was supported by NIH Grant R01HL132344 and R03HL133713, ing indicates that African genetic ancestry at asthma Health Disparities Award of the Cincinnati Children’s Research Foundation, GWAS SNPs are differentially associated with the the National Institutes of Health (NIH) Clinical and Translational Science Award (CTSA) program, grant 1UL1TR001425–01, Methods grant from the asthma clinical subphenotypes. Unraveling the reasons Center for Clinical and Translational Science and Training, Cincinnati why individuals with higher African origin at asthma Children’s Hospital Medical Center. There is no role of the funding body in GWAS SNPs had higher IgE level or rate of positive the design of the study and collection, analysis, and interpretation of data and in writing the manuscript. SPT is necessary to determine the potential clinical ap- plications of our findings. In addition, genetic analysis Availability of data and materials based on more refined phenotypes may increase the stat- The datasets described in this manuscript were obtained from dbGaP through dbGaP accession number phs000166.v2.p1. istical power and allow for the detection of population structure-specific phenotype-genotype associations that Authors’ contributions are undetectable otherwise. LD conceptualized and designed the study, carried out and supervised the analyses, drafted the manuscript, and approved the final manuscript as submitted. DL carried out the initial analyses, reviewed and revised the Conclusions manuscript, and approved the final manuscript as submitted. MW carried out In conclusion, through our systematic clinical phenotype the analyses, reviewed and revised the manuscript, and approved the final analysis, we identified distinct subphenotypes for manuscript as submitted. MA supervised data analyses, critically reviewed the Ding et al. BMC Medical Genomics (2018) 11:51 Page 10 of 11 manuscript, and approved the final manuscript as submitted. TM conceptualized 17. 
Amat F, Saint-Pierre P, Bourrat E, Nemni A, Couderc R, Boutmy-Deslandes E, and designed the study, critically reviewed the manuscript, and approved the et al. Early-onset atopic dermatitis in children: which are the phenotypes at final manuscript as submitted. risk of asthma? Results from the ORCA cohort. PLoS One. 2015;10(6):e0131369. 18. Pillai SG, Tang Y, van den Oord E, Klotsman M, Barnes K, Carlsen K, et al. Ethics approval and consent to participate Factor analysis in the genetics of asthma international network family study Not applicable. identifies five major quantitative asthma phenotypes. Clin Exp Allergy. 2008;38(3):421–9. Competing interests 19. Weinmayr G, Keller F, Kleiner A, du Prel JB, Garcia-Marcos L, Batllés-Garrido J, The authors declare that they have no competing interests. et al. Asthma phenotypes identified by latent class analysis in the ISAAC phase II Spain study. Clin Exp Allergy. 2013;43(2):223–32. 20. Cherniack R, Adkinson NF, Strunk R, Szefler S, Tonascia J, Weiss S, et al. The Publisher’sNote childhood asthma management program (CAMP): design, rationale, and Springer Nature remains neutral with regard to jurisdictional claims in methods. Control Clin Trials. 1999;20(1):91–120. published maps and institutional affiliations. 21. Ding L, Abebe T, Beyene J, Wilke RA, Goldberg A, Woo JG, et al. Rank-based genome-wide analysis reveals the association of ryanodine receptor-2 gene Author details variants with childhood asthma among human populations. Hum Genomics. Division of Biostatistics and Epidemiology, Department of Pediatrics, 2013;7:16. Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA. 22. Gower JC. A general coefficient of similarity and some of its properties. Alzheimer’s Therapeutic Research Institute, Keck School of Medicine, Biometrics. 1971;27:857–74. University of Southern California, San Diego, CA, USA. Division of Asthma 23. Ward JH Jr. Hierarchical grouping to optimize an objective function. 
J Am Research, Department of Pediatrics, Cincinnati Children’s Hospital Medical Stat Assoc. 1963;58:236–44. Center, University of Cincinnati, 3333 Burnet Ave, Cincinnati, OH 45229, USA. 24. Milligan GW, Cooper MC. An examination of procedures for determining the number of clusters in a data set. Psychometrika. 1985;50(2):159–79. Received: 9 June 2017 Accepted: 15 May 2018 25. Cooper MC, Milligan GW. The effect of error on determining the number of clusters. Proceedings of the International Workshop on Data Analysis, Decision Support and Expert Knowledge Representation in Marketing and References Related Areas of Research; 1988. p. 319–28. 1. Borish L, Culp JA. Asthma: a syndrome composed of heterogeneous diseases. 26. Hothorn T, Hornik K, Zeileis A. Unbiased recursive partitioning: a conditional Ann Allergy Asthma Immunol. 2008;101(1):1–8. quiz -11, 50 inference framework. J Comput Graph Stat. 2006;15(3):651–74. 2. Siroux V, Garcia-Aymerich J. The investigation of asthma phenotypes. Curr 27. Team RDC. R: a language and environment for statistical computing. R Opin Allergy Clin Immunol. 2011;11(5):393–9. Foundation for Statistical Computing: Vienaa; 2010. 3. Yeatts K, Sly P, Shore S, Weiss S, Martinez F, Geller A, et al. A brief targeted 28. Alexander DH, Novembre J, Lange K. Fast model-based estimation of review of susceptibility factors, environmental exposures, asthma incidence, ancestry in unrelated individuals. Genome Res. 2009;19(9):1655–64. and recommendations for future asthma incidence research. Environ Health 29. Welter D, MacArthur J, Morales J, Burdett T, Hall P, Junkins H, et al. The Perspect. 2006;114(4):634–40. NHGRI GWAS catalog, a curated resource of SNP-trait associations. Nucleic 4. Guerra S, Martinez FD. Asthma genetics: from linear to multifactorial Acids Res. 2014;42(Database issue):D1001–6. approaches. Annu Rev Med. 2008;59:327–41. 30. Pritchard JK, Stephens M, Donnelly P. Inference of population structure 5. 
Lotvall J, Akdis CA, Bacharier LB, Bjermer L, Casale TB, Custovic A, et al. using multilocus genotype data. Genetics. 2000;155(2):945–59. Asthma endotypes: a new approach to classification of disease entities 31. Wenzel SE. Asthma phenotypes: the evolution from clinical to molecular within the asthma syndrome. J Allergy Clin Immun. 2011;127(2):355–60. approaches. Nat Med. 2012;18(5):716–25. 6. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, 32. Fitzpatrick AM, Teague WG, Meyers DA, Peters SP, Li XN, Li HS, et al. Heterogeneity et al. Finding the missing heritability of complex diseases. Nature. of severe asthma in childhood: confirmation by cluster analysis of children 2009;461(7265):747–53. in the National Institutes of Health/National Heart, Lung, and Blood Institute 7. Akinbami LJ, Schoendorf KC, Parker J. US childhood asthma prevalence severe asthma research program. J Allergy Clin Immun. 2011;127(2):382–U973. estimates: the impact of the 1997 National Health Interview Survey redesign. 33. Mersha TB, Abebe T. Self-reported race/ethnicity in the age of genomic Am J Epidemiol. 2003;158(2):99–104. research: its potential impact on understanding health disparities. Hum 8. Gamble C, Talbott E, Youk A, Holguin F, Pitt B, Silveira L, et al. Racial Genomics. 2015;9:1. differences in biologic predictors of severe asthma: data from the severe 34. Salam MT, Avoundjian T, Knight WM, Gilliland FD. Genetic ancestry and asthma research program. J Allergy Clin Immunol. 2010;126(6):1149–56. e1 asthma and rhinitis occurrence in Hispanic children: findings from the 9. Green RH, Brightling CE, Bradding P. The reclassification of asthma based on Southern California Children's health study. PLoS One. 2015;10(8):e0135384. subphenotypes. Curr Opin Allergy Clin Immunol. 2007;7(1):43–50. 35. Rumpel JA, Ahmedani BK, Peterson EL, Wells KE, Yang M, Levin AM, et al. 10. Haldar P, Pavord ID, Shaw DE, Berry MA, Thomas M, Brightling CE, et al. 
Genetic ancestry and its association with asthma exacerbations among African Cluster analysis and clinical asthma phenotypes. Am J Respir Crit Care Med. American subjects with asthma. J Allergy Clin Immunol. 2012;130(6):1302–6. 2008;178(3):218–24. 36. Pino-Yanes M, Thakur N, Gignoux CR, Galanter JM, Roth LA, Eng C, et al. 11. Just J, Gouvis-Echraghi R, Rouve S, Wanin S, Moreau D, Annesi-Maesano I. Genetic ancestry influences asthma susceptibility and lung function among Two novel, severe asthma phenotypes identified during childhood using a Latinos. J Allergy Clin Immunol. 2015;135(1):228–35. clustering approach. Eur Respir J. 2012;40(1):55–60. 37. Ortega VE, Kumar R. The effect of ancestry and genetic variation on lung 12. Kim TB, Jang AS, Kwon HS, Park JS, Chang YS, Cho SH, et al. Identification of function predictions: what is “normal” lung function in diverse human asthma clusters in two independent Korean adult asthma cohorts. Eur populations? Curr Allergy Asthma Rep. 2015;15(4):516. Respir J. 2013;41(6):1308–14. 38. Vergara C, Murray T, Rafaels N, Lewis R, Campbell M, Foster C, et al. African 13. Moore WC, Meyers DA, Wenzel SE, Teague WG, Li HS, Li XN, et al. Identification ancestry is a risk factor for asthma and high Total IgE levels in African of asthma phenotypes using cluster analysis in the severe asthma research admixed populations. Genet Epidemiol. 2013;37(4):393–401. program. Am J Respir Crit Care Med. 2010;181(4):315–23. 39. Menezes AM, Wehrmeister FC, Hartwig FP, Perez-Padilla R, Gigante DP, 14. Siroux V, Basagana X, Boudier A, Pin I, Garcia-Aymerich J, Vesin A, et al. Barros FC, et al. African ancestry, lung function and the effect of genetics. Identifying adult asthma phenotypes using a clustering approach. Eur Respir Eur Respir J. 2015;45(6):1582–9. J. 2011;38(2):310–7. 15. Wardlaw AJ, Silverman M, Siva R, Pavord ID, Green R. Multi-dimensional 40. Brehm JM, Acosta-Perez E, Klei L, Roeder K, Barmada MM, Boutaoui N, et al. 
phenotyping: towards a new taxonomy for airway disease. Clin Exp Allergy. African ancestry and lung function in Puerto Rican children. J Allergy Clin 2005;35(10):1254–62. Immunol. 2012;129(6):1484–90. e6 16. Weatherall M, Travers J, Shirtcliffe PM, Marsh SE, Williams MV, Nowitz MR, 41. Chen W, Brehm JM, Boutaoui N, Soto-Quiros M, Avila L, Celli BR, et al. Native et al. Distinct clinical phenotypes of airways disease defined by cluster American ancestry, lung function, and COPD in Costa Ricans. Chest. 2014; analysis. Eur Respir J. 2009;34(4):812–8. 145(4):704–10. Ding et al. BMC Medical Genomics (2018) 11:51 Page 11 of 11 42. Kumar R, Seibold MA, Aldrich MC, Williams LK, Reiner AP, Colangelo L, et al. Genetic ancestry in lung-function predictions. N Engl J Med. 2010;363(4):321–30. 43. Moffatt MF, Gut IG, Demenais F, Strachan DP, Bouzigon E, Heath S, et al. A large-scale, consortium-based genomewide association study of asthma. N Engl J Med. 2010;363(13):1211–21. 44. Sleiman PM, Annaiah K, Imielinski M, Bradfield JP, Kim CE, Frackelton EC, et al. ORMDL3 variants associated with asthma susceptibility in north Americans of European ancestry. J Allergy Clin Immunol. 2008;122(6):1225–7. 45. Howrylak JA, Fuhlbrigge AL, Strunk RC, Zeiger RS, Weiss ST, Raby BA, et al. Classification of childhood asthma phenotypes and long-term clinical responses to inhaled anti-inflammatory medications. J Allergy Clin Immunol. 2014;133(5):1289–300. 300 e1-12

Journal

BMC Medical Genomics (Springer Journals)

Published: May 31, 2018
