Humans experience higher rates of age-associated diseases than our closest living evolutionary relatives, chimpanzees. Environmental factors can explain many of these increases in disease risk, but species-speciﬁc genetic changes can also play a role. Alleles that confer increased disease susceptibility later in life can persist in a population in the absence of selective pressure if those changes confer positive adaptation early in life. One age-associated disease that dispropor- tionately affects humans compared with chimpanzees is epithelial cancer. Here, we explored genetic differences between humans and chimpanzees in a well-deﬁned experimental assay that mimics gene expression changes that happen during cancer progression: A ﬁbroblast serum challenge. We used this assay with ﬁbroblasts isolated from humans and chim- panzees to explore species-speciﬁc differences in gene expression and chromatin state with RNA-Seq and DNase-Seq. Our data reveal that human ﬁbroblasts increase expression of genes associated with wound healing and cancer pathways; in contrast, chimpanzee gene expression changes are not concentrated around particular functional categories. Chromatin accessibility dramatically increases in human ﬁbroblasts, yet decreases in chimpanzee cells during the serum response. Many regions of opening and closing chromatin are in close proximity to genes encoding transcription factors or genes involved in wound healing processes, further supporting the link between changes in activity of regulatory elements and changes in gene expression. Together, these expression and open chromatin data show that humans and chimpanzees have dramatically different responses to the same physiological stressor, and how a core physiological process can evolve quickly over relatively short evolutionary time scales. Key words: human evolution, late-onset disease, ﬁbroblast, wound healing, epithelial cancer. Introduction Medawar 1952; Williams 1957). In the absence of purifying Deleterious genetic changes affecting traits that manifest later selection, late-onset disease alleles can persist or accumulate in life tend to accumulate in long-lived species. Evolutionary in a population. Deleterious mutations affecting late-life traits theory of aging explains that selective forces are weaker on can, however, experience positive selection if they confer pos- traits that manifest later in life compared with those that af- itively adaptive changes earlier in life (Carter and Nguyen fect survival or fecundity earlier in life (Hamilton 1966; 2011; Crespi and Summers 2006; Williams 1957). This The Author(s) 2018. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. 826 Genome Biol. Evol. 10(3):826–839. doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Human and Chimpanzee GBE antagonistic pleiotropic theory of aging may explain why expression proﬁle predicts a greater risk for metastasis and humans, who have long life spans compared with other non- death for breast, lung, and stomach carcinomas (Chang et al. human primates, experience higher rates of diseases that 2005). Because humans and chimpanzees have signiﬁcantly manifest later in life compared with our closest living evolu- different cancer rates, we hypothesized that the serum re- tionary relatives (Crespi 2010; Finch 2010). sponse would be signiﬁcantly different between species. One of the most prominent diseases of aging affecting the In order to identify the genetic differences that may drive human population is epithelial cancer. In the United States, these important differences in disease phenotypes, we investi- 86% of cancers are diagnosed in people over 50 years of age gated global patterns of gene expression in serum-challenged (American Cancer Society 2016). There is also a striking dif- ﬁbroblasts from humans and chimpanzees using RNA-Seq. All ference in the frequency of epithelial cancer in humans com- of the genes tested in our assay were analyzed only if there paredwithchimpanzees (Varki and Varki 2015). In modern were orthologous genes in both species (Blekhman et al. 2010). human populations, epithelial cancers cause up to 20% of To measure dynamic changes in the chromatin landscape, we deaths, but in our nearest living relatives, chimpanzees, rates also sequenced open chromatin using DNase-Seq (Boyle 2008; of epithelial cancers are up to 10-fold lower (American Cancer Song and Crawford 2010) from the same cell population. As a Society 2016; Beniashvili 1989; Hedlund et al. 2007; McClure way to investigate biological implications of changes in gene 1973; Parker et al. 1997; Schmidt 1975; Scott 1992; Seibold expression and chromatin accessibility, we test for categorical and Wolf 1973). It is quite clear that environmental, lifestyle, enrichment within gene ontologies, KEGG pathways, and pre- and dietary factors drive cancer risk (Wu et al. 2016), but deﬁned gene sets that are based on knowledge about biolog- genomic differences in humans as compared with other pri- ical functions. This approach allows us to quantify enrichments mate species could also play a role. Previous studies suggest based on expression differences and use statistical methods to that genetic changes in genes associated with cancer are un- identify signiﬁcant changes within gene categories as com- der positive selection in humans and can increase aspects of pared with the background set of all genes measured in this ﬁtness early in life (Crespi and Summers 2006). While positive study (Huang et al. 2009b; Subramanian et al. 2005). In our selection in regulatory regions does not strictly inform expres- analysis, we found that human ﬁbroblasts undergo distinct sion differences between species on a gene-by-gene basis, physiological changes in response to a serum challenge, includ- there is a stronger correlation at the level of biological process ing activation of genes involved in homeostasis and cell death. ontology function (Babbitt et al. 2017), and the differentially Chimpanzee ﬁbroblasts, however, have a much less focused expressed genes measured there had an enrichment with response, where many genes show differential expression cancer-related genes. A similar pattern was found by without signiﬁcant gene ontology enrichment. We also see Nielsen et al. (2005) for genes showing evidence of positive that serum challenge elicits a general increase in chromatin selection in coding regions. Although cancer-related genes accessibility in human cells and decreased accessibility in chim- were identiﬁed among the top 50 genes with signs of positive panzee cells. Thus, by using this serum challenge assay in a selection, categorical enrichment for cancer genes within that comparative way, we have identiﬁed pathways that differ be- set was not tested. Increased cancer susceptibility, therefore, tween species and broad differences in the activation state of may be a traitthathas come as a tradeoffas biological pro- cells in response to serum-induced cell stress. This study shows cesses evolved in humans. Because of the close evolutionary that by using a comparative genomic approach in a model of relationship between humans and chimpanzees, understand- wound healing and epithelial cancer, we can gain valuable ing genetic differences that contribute to disease phenotypes insights into how recent genetic adaptations contribute to dif- such as cancer susceptibility can assist in understanding im- ferential disease phenotypes between closely related species. portant patterns of functional genetic changes that occurred relatively recently during human evolution. In this study, we harnessed the power of a well-established Materials and Methods experimental assay that models cancer gene expression pat- Serum Challenge terns, allowing us to test the responses of human and chim- panzee cells. When grown in culture, ﬁbroblasts exposed to Fibroblast cell lines were obtained from the Coriell Institute for serum undergo a coordinated pattern of gene expression Medical Research (Camden, NJ) from four male humans and changes that mimics the wound healing response (Iyer et al. chimpanzees (Pan troglodytes; supplementary table S1, 1999). Tumors have been likened to wounds that do not heal Supplementary Material online). Our cell culture methods ap- (Dvorak 1986), and these changes in challenged ﬁbroblast proximately followed those of Chang et al. (2004) and Iyer gene expression were then subsequently found to strongly et al. (1999). Brieﬂy, we seeded cells in media with FBS correlate with gene expression data from epithelial cancer tis- (Hyclone deﬁned FBS (-)HI, Fisher) at 50% conﬂuency and sue (Chang et al. 2004). Chang et al. (2005) identiﬁed a set of grew overnight. At 60% conﬂuency one set of plates were set core serum response (CSR) genes that are up- or downregu- aside and processed as described below (ﬁg. 1A,“Pre- lated independent of the cell-cycle, and the CSR gene challenge” time point). The remainder of the cells were Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 827 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Pizzollo et al. GBE CSR Genes in CSR Genes in Human T0 to T12 Chimpanzee T0 to T12 CSR-Down Genes Pre Time 0 Time 12 Time 24 CSR-Up Serum replace Genes Serum starve 48 hrs RNA-seq −4 −2 0 2 4 −4 −2 0 2 4 DNase-seq log fold-change log fold-change 2 2 C CSR Down GO Reg. Cell Adhesion KEGG Focal Adhesion KEGG Pathways/ Cancer -1.022 1.393 1.354 1.350 0.464 0.082 0.020 0.045 -1.492 0.926 0.704 0.748 0.092 0.682 0.917 1.000 Rank in Ordered Dataset Rank in Ordered Dataset Rank in Ordered Dataset Rank in Ordered Dataset FIG.1.—Characterization of the serum response in human and chimpanzee ﬁbroblasts. (A) Overview of experimental procedure. (B)Log fold-change in gene expression versus P-value of CSR genes between T0 and T12. Positive log fold-change indicates higher level of gene expression at T12. Solid blue line indicates P-value 0.1. Points are CSR downregulated (red) and upregulated (blue) genes. (C) Plots of enrichment scores and distribution of a priori gene sets within the expression set, rank-ordered by differential expression between T0 and T12. Red bars below plots indicate clusters of downregulated CSR genes at the bottom of the ranked list. Inset values are normalized enrichment score (black), and false discovery rate (red). then incubated in starvation media (0.1% FBS) for 48 h, after (Trapnell et al. 2009). Counts per gene were determined using which the growth media was replaced. Collections were then HT-Seq (Anders et al. 2015) for genes with clear orthologs in done at time 0, 12 h, and 24 h. All cells for the RNA-Seq and human and chimpanzee (Blekhman et al. 2010). The data DNase-Seq experiments were from the same batch and col- were normalized using edgeR (Robinson et al. 2010)with a lected at the same time. For the RNA-Seq assays, cells were GLM for multifactor experiments (McCarthy et al. 2012), so rinsed with Qiazol (Qiagen) and vortexed. The RNA was iso- that all time point expression was normalized under one lated using an RNeasy kit (Qiagen), with a DNaseI treatment. model, unless speciﬁc time points are mentioned. 199 clones For the DNase-Seq assays, 20 million cells were spun down representing 165 genes that were previously identiﬁed as be- and slowly frozen in freezing media (Gibco) to 80 C. ing “cell cycle” genes that change through the cell cycle re- gardless of the serum challenge were removed (Chang et al. 2004; Whitﬁeld et al. 2002). RNA-Seq Library Preparation NGS libraries were prepared using the NEB RNA-Seq library kit DNase-Seq Library Preparation for Illumina, and sequenced on a HiSeq at Duke University Genomics Core. Sequences were mapped to the species- Library preparation was performed as in Song and Crawford speciﬁc genome (hg19 and panTro3) using Tophat v.1.4.1 (2010), and sequencing was performed on a HiSeq at Duke 828 Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Chimpanzee Human Enrichmet Score Enrichmet Score −0.6 −0.4 −0.2 0.0 −0.4 −0.3 −0.2 −0.1 0.0 −0.1 0.0 0.1 0.2 0.3 0.1 0.2 0.3 0.4 0.5 p-value 0.0 0.2 0.4 0.6 0.8 1.0 −0.10 0.00 0.10 0.20 0.0 0.1 0.2 0.3 0.4 0.0 0.2 0.4 0.6 0.8 1.0 0.2 0.1 0.0 0.1 0.2 0.0 0.1 0.2 0.3 0.4 Human and Chimpanzee GBE University Genomics Core. Sequences were trimmed to analysis to test for enrichment of GO and KEGG gene sets 20mer lengths and barcodes removed. In order to compare associated with cancer pathways, wound healing, cell adhe- DHS sites between species, we Bowtie-mapped (Langmead sion, and for CSR genes. Data in the form of raw read counts et al. 2009) reads to appropriate genomes and brought chim- were input to test for enrichment between time points during panzee coordinates to human space using liftOver (Hinrichs the ﬁrst 12 or 24 h of the serum response. A rank-ordered list et al. 2006). We called peaks with MACS (Zhang et al. 2008) of genes that are statistically different between time points is using a lower P-value threshold of 1e-5 and found an average created, and the positions of a predeﬁned set of genes are of 150,000 peaks in human samples, and 116,000 peaks determined within this list. An enrichment score is calculated in chimpanzee samples. We counted sites as active if there based on a running-sum statistic that increases when a gene is was a DHS signal in at least 1 replicate. To compare chromatin present in the rank-ordered list and decreases when it is not. DNaseI sensitivity at corresponding locations between sam- ples, we deﬁned windows by intersecting DHS sites across Estimation of Differential Expression all species using BEDTools (Quinlan and Hall 2010). Our ap- We used edgeR (McCarthy et al. 2012; Robinson et al. 2010) proach to deﬁning DHS sites between replicates and between to perform differential expression analysis on RNA-seq and species by using shared “windows” is also graphically DNase-Seq data sets. Brieﬂy, edgeR ﬁts read counts to a neg- explained in the supplementary material (supplementary ﬁg. ative binomial generalized linear model and performs a likeli- S1, Supplementary Material online). This gave 379,723 win- hood ratio test to identify differences between groups. We dows between human and chimpanzee samples. We ﬁltered performed this analysis between time points within species to windows to exclude those <50 bp or >2,000 bp, which gave characterize genes that change expression and DHS sites that 264,091 sites. The windows from our study include DHS sites change activity during the serum response. To characterize that are shared between species, and those that are species- between species differences, we performed the analysis at speciﬁc. 125,411 sites were shared, 95,983 were human- each time point between human and chimpanzee. speciﬁc, and only 42,697 were chimpanzee-speciﬁc. In any given sample newly deﬁned windows may cover zero or mul- tiple DHS sites. To assign values to each new set of coordi- DAVID Enrichment Analysis nates in each sample, we chose the DHS site with the lowest We performed differential expression analysis between hu- P-value as representative of the activity within each window. man and chimpanzee at each of four time points in our ex- periment. Genes that were signiﬁcantly (FDR< 0.1) DHS Window Overlap with ENCODE Data upregulated in each species were selected for enrichment analysis. We sought to investigate enrichments within differ- Data were downloaded from ENCODE that were generated in entially expressed genes in comparison to a background set of DNase-Seq experiments in human ﬁbroblasts, cancer cell lines, genes expressed in ﬁbroblasts. Thus, we used the Database hepatocytes, pancreas, or cerebellum tissues (supplementary for Annotation, Visualization and Integrated Discovery DAVID table S2, Supplementary Material online). Additionally, (Huang et al. 2009a, 2009b), which tests for enrichment in DNase-Seq data from human ﬁbroblasts and LCL cells the context of a user-deﬁned background. DE gene IDs were (Shibata et al. 2012) were used in our ﬁbroblast comparison. supplied to DAVID and the set of all genes active in human Subsets of our DHS sites were selected to test for overlap with and chimpanzee ﬁbroblasts was used as the background. We external data. Windows were ﬁrst screened for size and only tested for enrichment using GO biological process annota- those between 50 bp and 2 kb were selected. Additionally, a tions for our DE genes and molecular function of subsets of set of “human windows” was selected by removing DHS sites differentially expressed genes within particular biological pro- from the 50 bp–2 kb set that did not show any DHS signal at cess categories. For analysis of human enrichments, catego- any time point in any human sample. Overlap between our ries with P-values< 0.1 were characterized, whereas for DHS sites and external data was measured using BEDTools chimpanzee enrichments, we explored categories with P-val- (Quinlan and Hall 2010). The exact command used to test for ues< 0.25. This less stringent threshold was used for chim- overlap was: Bedtools intersect -u -a OurWindows -b panzee due to the low numbers of categories with signiﬁcant ENCODE_XX_Windows> XX_overlap. enrichment values. Gene Set Enrichment Analysis Species-Speciﬁc and Serum-Response-Speciﬁc We tested for enrichment of 12 gene sets downloaded from Enrichments the Molecular Signatures Database (MSigDB) using the Gene Set Enrichment Analysis (GSEA) desktop graphical user inter- In order to determine which categories were enriched at every face (Subramanian et al. 2005). GSEA tests if a predeﬁned set time point for a particular species, we performed enrichment of genes has a statistically signiﬁcant association with one of analysis using DAVID (Huang et al. 2009a, 2009b). Category two biological states (time points in our assay). We used this lists were read into R and were intersected to identify Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 829 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Pizzollo et al. GBE categories that were present in all time points for either hu- sets prepared from log values of mean DHS activity among man or chimpanzee. To identify BP categories that were replicates for shared or species-speciﬁc DHS sites. The fuzziﬁer enriched during the serum response but not before, for for each expression set was selected with the mestimate both human and chimpanzee, we read category lists into R, function. Minimum centroid distances were calculated obtained union sets for the Pre and 0 h time points (early set), for a range of cluster numbers using Dmin, and an optimal and then found the set differences between the 12 or 24 h number of clusters was chosen to select the lowest cen- enriched categories and the early set. troid distance with the lowest number of clusters. For clus- ters that represent increases or decreases in DHS activity during the serum response, we selected DHS sites within Analysis of Positive Selection in Genes and Promoters each cluster with a minimum membership value of 0.6, We tested for signs of positive selection in protein coding and identiﬁed genes closest to each DHS site using UCSC regions of genes by comparing rates of nonsynonymous (Karolchik et al. 2004) gene coordinates for genes with and synonymous substitutions (dN/dS). Data were collected clear orthologs in human and chimpanzee (Blekhman from Ensembl for 12,865 human protein-coding genes and et al. 2010). We tested for biological process category en- their homologs in chimpanzee (Zerbino et al. 2018). We also richment from these genes with GOrilla (Eden et al. 2009), looked at dN/dS for 973 genes that are part of relevant which allows testing large data sets against a background. cancer-related categories (Core Serum Response, GO Cell The set of all genes active in human and chimpanzee ﬁbro- Matrix Adhesion, GO Extracellular Matrix, GO Regulation of blasts was used as the background. Cell Adhesion, GO Response to Wounding, GO Wound Healing, KEGG Basal Cell Carcinoma, KEGG, Cell Adhesion Molecules, KEGG Focal Adhesion, KEGG Pathways in Cancer, Results Mishra Carcinoma Associated Fibroblast UP). Gene lists were The Serum Response in Both Human and Chimpanzee collected from the MSigDB (Subramanian et al. 2005). Fibroblasts Is Similar to the Established CSR Additionally, we looked for signs of positive selection in human promoter sequences (5 kb regions upstream of CSR genes are upregulated or downregulated upon serum transcription start sites) using code from Haygood et al. challenge; Chang et al. (2004) used microarrays to identify (2007) available on GitHub (https://github.com/ofedrigo/ a set of 512 genes they deﬁned as part of the CSR. To deﬁne TestForPositiveSelection) for 5,137 genes with clear orthol- this response they used the assay described above, where ogy in chimpanzee and rhesus macaque (Macaca mulatta), ﬁbroblasts are grown in vitro, are starved of serum for 48 h, which we use as an outgroup in this analysis. Brieﬂy, this and are subsequently re-exposed to normal levels of serum in code runs using HyPhy software (Pond et al. 2005), and the culture medium. The 512 CSR genes are now part of a calculates nucleotide substitution rates in promoter curated data set in the MSigDB (Subramanian et al. 2005), sequences and compares this to neutral substitution rates which includes 212 upregulated and 209 downregulated in nearby intronic regions (ﬁrst intron of a genes were ex- genes. As a ﬁrst analysis of our data, we wanted to see a cluded). P-valueswere used toidentify promoters with replication of this response in our cells and gene expression signiﬁcantly higher rates of substitution on the human platform. Although we compare data compared across plat- branch. forms, we expect to ﬁnd similar patterns of expression. The same sets of genes were tested in both studies, and previous comparisons of RNA-Seq and microarray data show signiﬁ- Identifying Putative Transcription Factor Binding Sites cant correlation of expression proﬁles between platforms Using JASPAR (t Hoen et al. 2008; Zhao et al. 2014). In order to identify known transcription factor binding motifs We measured the serum response in ﬁbroblasts from that are contained within our DHS sites, we collected hg19 humans (four biological replicates) and chimpanzees (three sequences corresponding to our DHS coordinates and biological replicates), examining gene expression by RNA- scanned these sequences for known motifs that are described Seq at four time points (ﬁg. 1A). Of 421 CSR genes, 324 in the JASPAR database (Sandelin et al. 2004). We searched overlapped genes in our results. Part of the reason we test a for all Homo sapiens motifs on both þ and – strands and subset of these genes is because throughout our analysis we speciﬁed a minimum score of 100% using R packages only compared genes that have clear orthologs in both TFBSTools (Tan and Lenhard 2016) and JASPAR2016 humans and chimpanzees. In both species, the majority of (Mathelier et al. 2016). CSR genes increase or decrease expression levels as expected based on the work of Chang et al. (2004) (ﬁg. 1B and sup- Fuzzy Clustering Analysis plementary ﬁg. S2, Supplementary Material online). In human ﬁbroblasts, 84% and 89% of CSR-Up genes are upregulated We performed soft clustering of DHS data using mFuzz at 12 and 24 h, respectively, and in chimpanzee ﬁbroblasts, (Futschik and Carlisle 2005), using standardized expression 830 Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Human and Chimpanzee GBE Table 1 73% and 87% are upregulated at 12 and 24 h. Similarly, Differential Gene Expression between Species at Each Time Point (FDR 88% of CSR-Down genes are downregulated in human ﬁbro- 10%) blasts, and in chimpanzee ﬁbroblasts, 90% and 84% are Number of DE Genes (FDR 10%) downregulated at 12 and 24 h. Of the genes that increase or decrease expression as expected, there are comparable num- Pre T0 T12 T24 bers of signiﬁcant (FDR 10%) differences over time in both Higher in human 910 925 983 856 species. These changes in expression through the time-course Higher in chimpanzee 1,460 2,203 1,431 2,350 of the assay are much as expected for a core biological pro- Total DE genes 2,370 3,128 2,414 3,206 cess. Beyond these CSR genes, however, we see between- Ratio chimpanzee DE:human DE 1.60 2.38 1.46 2.75 species differences in functional categories of genes that are involved in important aspects of physiology. Wound Healing and Cancer Pathway Genes Increase homeostasis or cell signaling and protein modiﬁcation pro- Expression in Human, but Not Chimpanzee, Fibroblasts cesses. As well, several processes related to cell death, devel- during the Serum Response opment or morphogenesis, and response to stimuli are enriched. At 12 and 24 h in chimpanzee ﬁbroblasts, there In order to explore the biology of the serum response across were only 2 and 37 categories, respectively, with P-val- species, we tested for gene ontology enrichment categories ues 0.05. The enriched processes at 12 h are cell prolifera- between time points that characterize the serum response tion and proximal/distal pattern formation, and at 24 h the pathways using GSEA (supplementary table S3, majority of processes relate to cell cycle and metabolic pro- Supplementary Material online; Subramanian et al. 2005). cesses. The processes identiﬁed in humans describe a broad Although CSR upregulated genes are signiﬁcantly response to stress and stimuli and show that ﬁbroblasts initi- (FDR< 0.1) enriched at 12 h in both human and chimpanzee, ate mechanisms to cope with a changing external environ- the enrichment for CSR downregulated genes is only signiﬁ- ment due to the presence of serum. On the other hand, gene cant in chimpanzee (ﬁg. 1C and supplementary table S3, expression in chimpanzee ﬁbroblasts is not enriched for any Supplementary Material online). In human ﬁbroblasts, how- one aspect of physiology related to the serum response; in- ever, GO cell adhesion, Kyoto Encyclopedia of Genes and stead it is much more diffuse over biological processes. These Genomes (KEGG) focal adhesion, and KEGG cancer pathway results suggest that the chimpanzee cells are reacting in a less genes are signiﬁcantly enriched at the 12-h time point in hu- coordinated way to regulate cell state during times of stress. man, but not in chimpanzee. These results suggest that al- though the CSR is similar between species, there are important wound healing and cancer-related pathways that Genes with Higher Expression in Human Fibroblasts Are are upregulated in human ﬁbroblasts but not in chimpanzee. Enriched for Development, Adhesion, and Angiogenesis Categories Human Fibroblasts Have a Coordinated Homeostasis and We were interested if differences in gene expression could Cell Signaling Response to Serum inform about fundamental differences in physiology between In order to understand how human and chimpanzee ﬁbro- the two species. To explore this, we looked for BP categories blasts respond to serum on a gene-by-gene level, we next that were enriched at all time points for human or for chim- performed differential expression analysis between species panzee. These common categories may represent broad bio- at each time point (McCarthy et al. 2012; Robinson et al. logical processes that are uniquely elevated in one species. 2010). This analysis shows that there are more DE genes Genes that were more highly expressed in human at Pre, (FDR 10%) with higher expression levels in chimpanzee T0, T12, and T24 shared enrichment (P 0.1) for 18 catego- than human in every comparison, with an average of about ries across time points (ﬁg. 2). These include 11 human BP twice as many DE genes with higher expression in chimpan- categories that represent development, morphogenesis, or zee (table 1). differentiation, four categories related to locomotion or ad- To investigate processes that are unique to the serum re- hesion, and three categories related to angiogenesis and sponse, we measured enrichment for BP categories (Huang blood vessel development. et al. 2009a, 2009b) from differentially upregulated genes Molecular function enrichment of the genes in the 11 within each species, and identiﬁed the categories that were development categories indicates that transcription factor enriched during the serum response (T12 and T24), but not at activity is highly enriched at each time point, with P-val- earlier time points (Pre and T0). At 12 and 24 h in human ues 1.1 10 and at least 3.3-fold enrichment. ﬁbroblasts, there were 23 and 27 BP categories, respectively, Transcription factors within this set of 22 genes have nor- with P-values 0.05 (supplementary table S4, Supplementary mal roles in embryogenesis and angiogenesis during Material online). Most of these processes describe wound healing, but are often aberrantly regulated in Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 831 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Pizzollo et al. GBE FIG.2.—Common GO terms across time points. GO BP categories signiﬁcantly higher in human and chimpanzee at all time points. Bars represent mean values across 4 time points for ln P-value (below axis) and fold-enrichment (above axis). Colors group similar biological processes and vertical lines represent 1 standard deviation from the mean. cancers (Abate-Shen 2002; Gilkes et al. 2014). nonetheless gain insight into possible biological characteristics Angiogenesis is an essential process in normal wound heal- common to chimpanzee ﬁbroblasts. ing that allows for the delivery of oxygen and nutrients to the site of injury, and plays an important role in the for- Genes with Signs of Positively Selected Changes in Protein mation of new tissue (Tonnesen et al. 2000). In a similar Coding Regions Are Not Enriched among Cancer-Related fashion, development of new vasculature is essential for Genes the continued growth of tumors and metastasis (Yadav et al. 2015). Together, these data show that genes with While differences in levels of gene expression play an impor- higher expression in human ﬁbroblasts enrich for critical tant role in controlling phenotypes (Wray et al. 2003), nucle- parts of the wound healing process at all time points. otide level differences in protein coding genes within relevant Using the same between-species differential expression disease pathways may also contribute to differential disease and biological process enrichment data set we looked for susceptibility (Puente et al. 2006). To explore genetic differ- processes that were shared in chimpanzee between all time ences between species, we looked at nucleotide level differ- points. Here, we found no common BP categories with sig- ences in human and chimpanzee orthologs from publicly niﬁcant enrichment P-values (P 0.1). However, in order to available genome sequence (Zerbino et al. 2018). explore some categories that may be enriched in chimpan- We ﬁrst looked at rates of nonsynonymous substitution zees, we relaxed this requirement to categories with P-val- (dN) in 973 genes that are important in wound healing and ue 0.25 and found eight categories that were shared cancer pathways and are part of well-deﬁned gene sets in the between time points (ﬁg. 2). Similar to our analysis of pro- MSigDB. We made estimates for each gene by selecting iso- cesses unique to the serum response, the lack of categories forms with the highest dN values. The average dN values for that are enriched across time points indicates that elevated cancer-related genes and the full set of 12,865 homologues gene expression in chimpanzee ﬁbroblasts is distributed across quantiﬁed in this study are comparable at 0.008 and 0.0086, many processes, as opposed to the stronger enrichments we respectively indicating that there is no signiﬁcant enrichment see in the human gene expression data. As an exploratory for nonsynonymous substitutions in cancer-related genes analysis, we also note a few categories that stand under the (two-sided Fisher’s exact test P-value¼ 1). less stringent P-value threshold applied to chimpanzee enrich- In order to see if there are signs of positive selection in ments. Here, DNA catabolism and fragmentation, and apo- coding regions of cancer-related genes, we looked at ratios ptotic nuclear change categories have a high fold-enrichment, of nonsynonymous to synonymous substitutions (dN/dS)for between2.9 and6.3, but P-values are between 0.02 and 0.2. all genes tested in our study (supplementary table S5, At the 24-h time point, cellular component organization has a Supplementary Material online). Only 5.14% (50 of 973) of very low P-value (4 10 ; which accounts for the high stan- cancer genes have dN/dS> 1, whereas this rate is 7.55% (971 dard deviation), but does not have a high fold-enrichment. of 12,865) for all genes in our study (two-sided Fisher’s exact Thus, while the error rates associated with these enrichments test P-value¼ 0.00851). This lower rate suggests that there is increase with the less stringent P-value applied to these data, no enrichment for positively selected protein coding changes in the absence of highly signiﬁcant enrichments, we can in cancer-related genes as a whole. Nonetheless, because 832 Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Human and Chimpanzee GBE positively selected changes in individual genes could contrib- extracellular matrix remodeling, NAALAD2, a peptidase that ute to differential phenotypes, we looked more closely at hydrolyses N-acetyl-aspartyl glutamate and glutamate and a which genes have the greatest signals of positive selection. marker of prostatic carcinomas, and ACVRL1, a receptor for We looked in the set of cancer-related genes with dN/ TGF-b family of ligands. Outside of those that contribute to dS> 1 and overlapped that with genes that have signiﬁcantly enriched processes are genes that are elevated in human higher expression in humans. In the gene coding for CXCL6, a ﬁbroblasts at all time points. These include ALS2CL, RGS20, ligand for chemotaxis and angiogenesis, dN¼ 0.0167, which and SNX16 involved in cell signaling, DPT, which has a role in is in the 90th percentile of dN scores of all 12,865 genes in our cellular adhesion, and PFKFB3 involved in the control of gly- study. The reported dS value is 0, preventing an exact calcu- colysis. These results suggest that while there are individual lation of dN/dS, but highlighting the excess nonsynonymous genes that are signiﬁcantly upregulated in human ﬁbroblasts substitution rate. The genes encoding MXI1 and NKX3-1, that have signs of positive selection, these are not focused on both tumor suppressors, also have signatures of positive se- any one biological process. Selection around these genes, lection where dN¼ 0.0034 and 0.0053, respectively, and however, does show that there are changes in processes im- dS for each¼ 0. We further looked at genes with the highest portant for ﬁbroblast function, wound healing, and cancer dN/dS ratios to explore biological processes where we see progression. Because chronic wound healing processes are strong signals of positive selection. Here, we see that genes co-opted by developing tumors, adaptation for higher expres- that have roles in spermatogenesis (DNAL1, TMEM225, sion within these processes in humans could help explain in- TRIM69, TMCO5A), metabolism (DHRS12 and NUDT17), im- creased disease susceptibility. mune responses (IL15RA and TRAFD1), or apoptosis (DFFA) To look more closely at how positive selection may be show the highest signals of positive selection (supplementary shaping gene expression in humans, we used methods table S5, Supplementary Material online). These data align adapted from Haygood et al. (2007) to test for signs of selec- with previous reports that these biological processes contain tion in promoter regions of genes in our study. We used a set an excess number of genes with positively selected changes of 5,137 genes that have orthologs in human, chimpanzee, (Bustamante et al. 2005), andit ispossible that genes with and rhesus macaque and compared substitution rates in pro- such changes could contribute to overall differences in disease moters to intronic sequence, which serves as a measure of susceptibility between species. While our data do show that neutral substitution rates. Here, we see 3.5% of promoters there are some changes in genes that have possible roles in have signs of positive selection indicating that these events are disease susceptibility, the impact of these individual changes relatively rare in humans (likelihood ratio test P-value 0.01). on overall disease incidences is not clear. It is possible that To ﬁnd out where these selection events are happening, we there could be some protein coding changes that contribute performed an enrichment analysis for GO biological pro- to disease processes, but there is not a statistical enrichment cesses. Among the most enriched processes are those related for positively selected changes in cancer-related genes as a to neural function including anion transport, sensory percep- whole. tion of light stimulus, and visual perception, which is in agree- ment with the Haygood et al. (2007) ﬁndings that neural genes have experienced recent positive selection in proximal Genes Upregulated in Human Fibroblasts Show Signatures regulatory regions (supplementary table S6, Supplementary of Positive Selection Material online). Genes contributing to these enrichment in- Global analysis of cis-regulatory sequences within promoter clude GLRA1, which mediates central nervous system post- regions has revealed evidence of positive selection in humans synaptic inhibition, FAM161A, involved in retinal progenitor for genes involved in neural development and glucose metab- cell proliferation, NDP, involved in retinal vascularization, and olism (Haygood et al. 2007). Because we see particular bio- CNGA1, which is involved in phototransduction. Also included logical processes enriched in humans at all time points and are transcriptional regulators MAP2K6 that regulates stress speciﬁcally during the serum response, we wondered if reg- induced cell cycle arrest, transcription activation, and apopto- ulatory regions near genes that contribute to these enrich- sis, and POU6F2 which is a tumor suppressor involved in ments show evidence of positive selection. In humans, we nephroblastoma predisposition (Di Renzo et al. 2006). While see that enriched biological processes are important for signs of positive selection do not fully explain differential gene wound healing and cancer progression, and selection in non- expression in humans, our results here do agree with previous coding regions around these genes could suggest adaptation reports that neural-related processes and control of transcrip- in the form of changing gene regulation. We compared genes tion are enriched for signs of positive selection. While we are identiﬁed by Haygood et al. (2007) that show signs of positive beginning understand how individual genes can show posi- selection in regulatory regions with those that contribute tively adaptive function and also contribute to disease pro- to biological processes that are upregulated in human ﬁbro- cesses (Crespi and Summers 2006), we understand less about blasts and found some overlap between the two data sets. how adaptation and antagonism occur broadly across func- These include MMP8, a metallopeptidase that contributes to tional biological process categories. Nonetheless, signs of Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 833 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Pizzollo et al. GBE AB C Differential Chromatin DNaseI Hypersensitivity Percent Active DHS Sites Number of Active DHS Sites Human-Specific Sites Chimpanzee-Specific Sites Human-Shared Sites Chimpanzee-Shared Sites Pre T0 T12 T24 Pre T0 T12 T24 Pre T0 T12 T24 Time Time Time FIG.3.—Differential chromatin accessibility in human and chimpanzee ﬁbroblasts. (A) Differential chromatin DNaseI hypersensitivity was determined using a 5% FDR from a likelihood ratio test. (B)and (C) Active DHS sites are locations in which a DNaseI signal with a P-value of at least 1e-5 is present in at least one replicate. Red lines with ﬁlled markers represent activity of human-speciﬁc DHS sites. Blue lines with ﬁlled markers represent activity of chimpanzee- speciﬁc DHS sites. Red lines with empty markers represent activity of shared DHS sites in human ﬁbroblasts. Blue lines with empty markers represent activity of shared DHS sites in chimpanzee ﬁbroblasts. adaptation around individual genes can offer some insight between species increases accessibility in human ﬁbroblasts into how physiological responses change over evolutionary and decreases in chimpanzee. This, along with the larger time. number of human-speciﬁc sites indicates greater chromatin accessibility in general (Pre and T0), and in response to stress (T12 and T24) in human ﬁbroblasts. Human Chromatin Has Signiﬁcantly More Open Not all of the DHS sites identiﬁed are active at all time Chromatin than Chimpanzee points. In terms of percentage of active sites at any given Layering in a second data set, we examined global changes in time point, more shared sites are active than species-speciﬁc chromatin accessibility over the challenge time points using sites (ﬁg. 3B). Interestingly, even though humans have twice DNase-Seq (Boyle 2008). Regions of open chromatin are sus- as many active species-speciﬁc sites as chimpanzee (ﬁg. 3C), ceptible to DNaseI cleavage (Keene et al. 1981)and sites hy- the percentage of these sites that are active through the assay persensitive to DNaseI mark many types of regulatory is comparable (ﬁg. 3B), suggesting that technical differences elements (Gross and Garrard 1988). As a preliminary charac- in genome annotation are not responsible for higher levels of terization, we looked to see if the locations of our DHS sites species-speciﬁc DHS sites in human. While the total number have been identiﬁed in previous studies that used DNase-Seq, of active DHS sites in human ﬁbroblasts remains relatively we scanned for the presence of transcription factor binding stable, chimpanzee chromatin shows a decrease in accessibil- motifs, and looked at how openness of DHS sites changes ity during the ﬁrst 12 h of the serum challenge, particularly at relative to proximity to transcription start sites (supplementary shared DHS sites (ﬁg. 3C). These data suggest that chimpan- text, Supplementary Material online). Together, these charac- zee chromatin responds to the stress of the serum challenge teristics suggest that the DHS sites we identiﬁed contain func- with a general decrease in accessibility. tional elements and help to validate our DNase-Seq data set. In order to compare the activity of DHS sites between spe- DHS Sites That Cluster in Patterns of Opening and Closing cies, we used a 5% FDR to identify signiﬁcant differences at Chromatin Reﬂect Functional Control of Transcription and each time point. There are 9,000–10,000 sites with signif- Adhesion Processes icantly different DHS signals at every time (ﬁg. 3A). Importantly, not all DHS sites are present in both human In order to explore temporal patterns of chromatin state, we and chimpanzee ﬁbroblasts. To more deeply investigate performed fuzzy clustering (Futschik and Carlisle 2005)of the how chromatin changes during our assay, we identiﬁed sites mean -10log P-values of DHS sites within each species dur- that were shared between species, and those that are species- ing the serum challenge. Most clusters have a bimodal shape, speciﬁc. Many of the signiﬁcant differences in DHS signal are, which generally describes a site as open or closed at a given appropriately, in species-speciﬁc DHS sites. However, of the time point. Interestingly, the top three cluster shapes with the sites that are shared between species, there are signiﬁcantly highest membership values are the same in both human and (Fisher’s exact test P-value 2.196 10 )more sites with chimp (supplementary ﬁg. S5, Supplementary Material on- higher DNaseI sensitivity in human at all time points. During line). There are particular clusters that are interesting in terms the serum challenge, chromatin containing DHS sites shared of activity, speciﬁcally in response to the serum challenge. 834 Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Number of Sites 0 1000 2000 3000 4000 5000 6000 Percent Active 0 20406080 100 Number Active (thousands) 020 40 60 80 100 120 Human and Chimpanzee GBE A B Pre T0 T12 T24 Pre T0 T12 T24 Pre T0 T12 T24 Pre T0 T12 T24 Pre T0 T12 T24 Pre T0 T12 T24 Human Chimpanzee Human Chimpanzee CD Wound Healing Metabolism Transcription Signaling Apoptosis & Proliferation Adhesion & Migration Development & Differentiation 40 0 40 40 0 40 Number of Categories Number of Categories FIG.4.—Fuzzy clustering and ontology enrichments. Clusters represent either an (A) increase or (B) decrease in chromatin accessibility following serum replacement. (C)and (D) Common GO terms generated from genes nearest DHS sites belonging to clusters that indicate (C)increasing or (D) decreasing chromatin accessibility. These represent DHS sites in which changes in chromatin Table 2 DHS Sites and Genes Associated with Clusters Representing Opening and openness occur: At the beginning of the challenge and per- Closing Chromatin during the Serum Challenge sist, at the end of the challenge, or at the beginning of the challenge that revert (ﬁg. 4A and B). In general, there are Chromatin Opening Chromatin Closing multiple DHS sites in proximity to each gene. While there Human Chimpanzee Human Chimpanzee appear to be comparable numbers of DHS sites opening Number DHSs 19,974 13,282 4,462 13,350 and closing during the serum challenge in chimpanzee, there Number genes 7,286 5,910 3,163 6,501 are 4.5 as many (19,974/4,462) DHS sites in human that ratio DHSs/gene 2.74 2.25 1.41 2.05 ﬁt in clusters representing chromatin opening compared with clusters representing chromatin closing (table 2). These 4,462 DHS sites that represent chromatin closing are found in prox- imity to 3,163 genes. Thus, when DHS sites are closing in that represent chromatin opening or closing during the serum human ﬁbroblasts, there is frequently only about one site challenge. Among the enriched categories, particular pro- that closes per gene, whereas in chimpanzee there are 2 cesses were common between species. We computationally sites that close per gene. When DHS sites are opening in hu- grouped these processes into representative categories using man there are 2.74 sites per gene, and in chimpanzee, there key terms (ﬁg. 4C and D). For example, the “Development” are 2.25 sites per gene. Not only are there a larger absolute category represents biological processes of development, number of sites of opening chromatin in human, these sites morphogenesis, and differentiation, and the “Growth” cate- are more concentrated around genes than they are in chim- gory represents processes of growth, proliferation, death, and panzee, and the change to a closed state in DHS sites in hu- apoptosis. All processes were enriched with P-values< 10 . man is less concentrated around genes than in chimpanzee. Grouping these categories shows common themes among As a way to investigate the biological processes that may the enrichments and shows similar themes to gene expression be controlled by these regions of opening or closing chroma- enrichments. Development categories are among the most tin, we performed gene ontology enrichment analysis (Eden prevalent, and similar to gene expression enrichments, tran- et al. 2009) of genes closest to DHS sites belonging to clusters scription factor activity is one of the most highly enriched Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 835 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Expression Change −1.0 0.0 0.5 1.0 −1.0 0.0 0.5 1.0 1.5 −1.0 0.0 0.5 1.0 1.5 Expression Change −1.0 −0.5 0.0 0.5 1.0 −1.5 −0.5 0.0 0.5 1.0 −1.5 −0.5 0.0 0.5 1.0 Pizzollo et al. GBE molecular functions from these genes. In agreement with some of these may be explained by updated gene models and gene expression enrichments, motility, adhesion, migration, differences in experimental platforms between the studies, and growth, death, apoptosis, and proliferation are enriched changing from microarrays to RNA-Seq. By using this assay in up- and downregulated clusters. Although there are a small in two closely related species with prominent phenotypic dif- number of categories related to wound healing, it seems this ferences in wound healing and cancer rates, we are able to response is more highly enriched in chimpanzee than human. investigate genetic differences during the serum response in this physiologically relevant cell type that have evolved over a relatively short timescale (5–7 million years; Chen and Li Positive Correlations Exist between DHS Sites and Levels of 2001; Langergraber et al. 2012). Gene Expression Focusing on gene expression, our RNA-Seq data suggest that the CSR is similar between species; yet, there are impor- Next, we wanted to bring our two data sets (RNA-Seq and tant wound healing and cancer-related pathways that are DNase-Seq) together to examine how DHS activity and distri- upregulated in human ﬁbroblasts but not in chimpanzee. bution compares with expression. Linking gene regulation While chimpanzees have more genes with higher levels of and gene expression at a whole-genome level is notoriously expression than humans, these are unfocused and not difﬁcult because regulatory elements can act at large distan- enriched for speciﬁc biological processes. Humans, on the ces from, and independent of orientation to, target genes. In other hand, have fewer genes with higher levels of expression the absence of annotated relationships between regulatory than chimpanzee, but these are contained within particular elements and target genes, linking putative regulatory ele- biological processes and pathways. Genes encoding transcrip- ments and genes based on proximity is the most feasible route tion factors enrich process of development, morphogenesis, to explore these relationships on a genome-wide scale. Boyle or differentiation in human ﬁbroblasts. Cell adhesion pro- (2008) have shown that there is a low correlation when di- cesses are also enriched in humans at all time points. This rectly comparing expression and the degree of hypersensitiv- difference has been identiﬁed previously in gene expression ity, but there is a signiﬁcant difference between DNaseI and cellular focal adhesion comparisons in human and chim- sensitivity at the TSS of low or no expression compared with panzee ﬁbroblasts (Advani et al. 2016). These molecules play genes with moderate or high expression. Additionally, they critical roles in the function of ﬁbroblasts by mediating cell– show that many of the most active DHS sites are in promoters cell and cell–matrix interactions and have important roles in and within the ﬁrst exon. cell responses to external stimuli (Calvo et al. 2013; Clayton To look at relationships between DHS signal and gene ex- et al. 1998). pression, we calculated the total DHS signal for all DHS sites To look at how genetic differences between species could closest to all TSSs, and performed a Spearman’s correlation be affecting phenotypes, we looked at how protein coding test against gene expression values for these genes. In both regions and gene promoters differ between species. We human and chimpanzee at each time point, Spearman’s rho found signs of positive selection in protein coding genes was between 0.24 and 0.25, indicating that there is a weak that are part of cancer-related pathways, but no enrichment but positive correlation between DHS activity and gene ex- among those processes. Some genes that are differentially pression (supplementary ﬁg. S6, Supplementary Material on- expressed in humans have signs of positive selection and line). There is a stronger correlation, however, between log are part of cancer pathways, but a direct relationship between fold-change in gene expression between species and the ratio genetic changes and species phenotypes is difﬁcult to make. of active DHS sites per gene. At each time point, the Likewise, our analysis of positive selection in promoter regions Spearman’s rho is between 0.327 and 0.487 and P-value found enrichments for neural-related processes, but these is< 2.2e-16. These data show that positive relationships exist events are rare, only occurring in 3.5% of the promoters between DHS activity and gene expression based on proximity tested. Here, we found that promoters with signs of positive of DHS sites and TSSs, but individual metrics describing DHS selection have roles in neural function including visual system activity are not strong predictors of gene expression at the development and differentiation, and anion transport. These closest TSS. positively selected changes agree with known differences in species biology, but do not fully explain how selection in up- Discussion stream regulatory regions contributes to gene expression. In response to a serum challenge, ﬁbroblasts undergo a de- The expression level of any gene is dictated by the activity of ﬁned transcription activation proﬁle that mimics the wound regulatory elements that promote or repress transcription. healing response (Chang et al. 2004) and is similar to the However, the relationship between number, location, and ac- expression proﬁle found in tumors (Chang et al. 2005). In tivity of cis-regulatory elements and associated genes globally is our experiments, we found a gene expression proﬁle in not clear. Because we know that DHS sites mark regulatory both species that mimics the CSR described by Chang et al. elements, we can still, however, identify differences in avail- (2005). There are differences in the response; however, and ability of putative regulatory elements available to cells at each 836 Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Human and Chimpanzee GBE time point. Our DNase-Seq data show that human cells have genetic defects and predisposition (American Cancer Society higher levels of chromatin accessibility at all time points. The 2016; Lichtenstein et al. 2000; Stearns and Medzhitov increased chromatin openness is somewhat counterintuitive 2016). Certain environmental factors play a large role in considering the higher levels of gene expression seen in chim- human exposure (American Cancer Society 2016) but not panzee. One would expect that increased chromatin accessi- in chimpanzees, while exposure to carcinogens through bility in human ﬁbroblasts would result in higher levels of gene other environmental factors may be more similar between expression, but this is not the case. This may indicate that hu- species (Varki and Varki 2015). While part of the difference man cells exist in a more poised, or transcription-ready, state in cancer rates between our species is due to these external than chimpanzee cells. Open chromatin data showing a higher factors, genetic differences likely play a role as well. level of chromatin accessibility, and expression data showing Comparative genomics allows for investigation of global dif- signiﬁcant changes in transcription factor activity, together ferences in gene expression and chromatin responses be- suggest that human cells maintain a transcription-ready state, tween humans and chimpanzees. This approach can which could allow for a faster transcriptional response. identify genetic changes that occurred as humans diverged Understanding how these regions of open chromatin from the most recent human–chimpanzee ancestor, which might be driving changes in gene expression remains a chal- may be responsible for particular phenotypes (Olson and lenge. Because regulatory elements can act at large distances Varki 2003), and which may be driven by differential gene relative to their target genes, linking gene expression and expression rather than changes in protein coding regions regulation is difﬁcult. Between our open chromatin and (reviewed in Carroll 2005; Wray et al. 2003). Changes in gene expression data sets, we only found weakly positive the activity of regulatory elements can offer a mechanistic correlations based on sequence proximity. To identify speciﬁc explanation for differential expression between species. links between regulatory elements and a speciﬁc gene expres- These changes can occur rapidly over evolutionary time, sion level will require more targeted experiments, such as lu- such as the 5–7 mya divergence measured here, possibly ciferase assays. driven by positive selection giving rise to new phenotypes. Comparative approaches to studying genomic differences Along with positive adaptations, however, can come side between species make extensions of our knowledge of biol- effects that manifest later in life-history as unfavorable phe- ogy across species. Although some assumptions are made notypes such as disease susceptibilities. These unintended about functional conservation, genetic differences between changes might not be readily visible to selection and may species correlate with known differences in species biology. In be propagated over evolutionary time. The use of evolution- taking a comparative approach to investigating genomic ary comparisons to better understand shifting rates of dis- responses to a well-deﬁned experimental assay in two closely ease between humans and nonhuman primates can be used related species, we begin to explore how genetic changes as a valuable tool for studying genetic factors that confer functionally contribute to differences in a core physiological uniquely human characteristics and disease susceptibilities. process over a relatively short evolutionary time scale. Here, we see that humans and chimpanzees have very different Supplementary Material responses to the same physiological stressor. The human re- Supplementary data are available at Genome Biology and sponse is generally rapid and robust with focused changes in Evolution online. gene expression and chromatin openness around functional groups of genes important for wound healing. This response may be part of a genetic adaptation that allows for quick Acknowledgments mobilization of transcriptional programs to cope with chang- ing extracellular state. The ability to quickly engage robust This work was supported by a National Science Foundation genetic responses to a wound healing stimulus or respond Grant (NSF-BCS-08-27552 HOMINID) to G.A.W. (https:// to other stimuli could have important adaptive function (de www.nsf.gov). The authors thank all the members of our re- Nadal et al. 2011; Lopez-Maury et al. 2008). However, a search groups for valuable discussions of these studies. We strong or prolonged wound healing response in the context thank Lisa Pfefferle for her extensive insight into the experi- of a cancerous lesion could be deleterious. While this exper- mental design, and Trisha Zintel and Norman A. Johnson for imental assay does not describe mechanisms that initiate dis- helpful suggestions and advice during data analysis. ease, it does serve as a way to explore how genetic programs that signiﬁcantly differ between humans and chimpanzees Author Contributions could contribute to increased disease susceptibility in humans. Striking differences in the rates of epithelial cancers exist C.C.B., J.P., and G.A.W. were responsible for conceptualizing between humans and chimpanzees. The lifetime risk for de- the study. W.J.N., Y.S., and A.S. performed the molecular lab velopment of cancer depends on the effects of acute and work. J.P. performed the formal analyses. G.E.C., G.A.W., cumulative exposure to environmental factors, but also on and C.C.B. provided resources and supervised the work. J.P. Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 837 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Pizzollo et al. GBE Finch CE. 2010. Evolution in health and medicine Sackler colloquium: wrote the paper. C.C.B., Y.S., and G.A.W. reviewed and evolution of the human lifespan and diseases of aging: roles of infec- edited the manuscript. All authors gave approval for tion, inﬂammation, and nutrition. Proc Natl Acad Sci U S A. 107(Suppl publication. 1):1718–1724. Futschik ME, Carlisle B. 2005. Noise-robust soft clustering of gene expres- sion time-course data. J Bioinform Comput Biol. 3(4):965–988. Literature Cited Gilkes DM, Semenza GL, Wirtz D. 2014. Hypoxia and the extracellular Abate-Shen C. 2002. Deregulated homeobox gene expression in cancer: matrix: drivers of tumour metastasis. Nat Rev Cancer. 14(6): cause or consequence? Nat Rev Cancer. 2(10):777–785. 430–439. Advani AS, Chen AY, Babbitt CC. 2016. Human ﬁbroblasts display a dif- Gross DS, Garrard WT. 1988. Nuclease hypersensitive sites in chromatin. ferential focal adhesion phenotype relative to chimpanzee. Evol Med Annu Rev Biochem. 57:159–197. Public Health. 2016(1):110–116. Hamilton WD. 1966. The moulding of senescence by natural selection. American Cancer Society. 2016. Cancer Facts & Figures 2016. Atlanta, J Theor Biol. 12(1):12–45. GA: American Cancer Society; 2016. Haygood R, Fedrigo O, Hanson B, Yokoyama KD, Wray GA. 2007. Anders S, Pyl PT, Huber W. 2015. HTSeq – a Python framework to work Promoter regions of many neural- and nutrition-related genes have with high-throughput sequencing data. Bioinformatics 31(2): experienced positive selection during human evolution. Nat Genet. 166–169. 39(9):1140–1144. Babbitt CC, Haygood R, Nielsen WJ, Wray GA. 2017. Gene expression and Hedlund M, et al. 2007. N-glycolylneuraminic acid deﬁciency in mice: adaptive noncoding changes during human evolution. BMC implications for human biology and evolution. Mol Cell Biol. Genomics. 18(1):435. 27(12):4340–4346. Beniashvili DS. 1989. An overview of the world literature on spontaneous Hinrichs AS, et al. 2006. The UCSC genome browser database: update tumors in nonhuman-primates. J Med Primatol. 18(6):423–437. 2006. Nucleic Acids Res. 34(Database issue):D590–D598. Blekhman R, Marioni JC, Zumbo P, Stephens M, Gilad Y. 2010. Sex-spe- Huang DW, Sherman BT, Lempicki RA. 2009a. Bioinformatics enrichment ciﬁc and lineage-speciﬁc alternative splicing in primates. Genome Res. tools: paths toward the comprehensive functional analysis of large 20(2):180–189. gene lists. Nucleic Acids Res. 37(1):1–13. Boyle AP. 2008. High-resolution mapping and characterization of open Huang DW, Sherman BT, Lempicki RA. 2009b. Systematic and integrative chromatin across the genome. Cell 132(2):311–322. analysis of large gene lists using DAVID bioinformatics resources. Nat Bustamante CD, et al. 2005. Natural selection on protein-coding genes in Protoc. 4(1):44–57. the human genome. Nature 437(7062):1153–1157. Iyer VR, et al. 1999. The transcriptional program in the response of human Calvo F, et al. 2013. Mechanotransduction and YAP-dependent matrix ﬁbroblasts to serum. Science 283(5398):83–87. remodelling is required for the generation and maintenance of Karolchik D, et al. 2004. The UCSC Table Browser data retrieval tool. cancer-associated ﬁbroblasts. Nat Cell Biol. 15(6):637–646. Nucleic Acids Res. 32(Database issue):D493–D496. Carroll SB. 2005. Evolution at two levels: on genes and form. PLoS Biol. Keene MA, Corces V, Lowenhaupt K, Elgin SCR. 1981. Dnase-I hypersen- 3(7):e245–1166. sitive sites in drosophila chromatin occur at the 5 ends of regions of Carter AJ, Nguyen AQ. 2011. Antagonistic pleiotropy as a widespread transcription. Proc Natl Acad Sci U S A-Biol Sci. 78(1):143–146. mechanism for the maintenance of polymorphic disease alleles. Langergraber KE, et al. 2012. Generation times in wild chimpanzees and BMC Med Genet. 12:160. gorillas suggest earlier divergence times in great ape and human evo- Chang HY, et al. 2004. Gene expression signature of ﬁbroblast serum lution. Proc Natl Acad Sci U S A. 109(39):15716–15721. response predicts human cancer progression: similarities between Langmead B, Trapnell C, Pop M, Salzberg SL. 2009. Ultrafast and memory- tumors and wounds. PLoS Biol. 2(2):E7. efﬁcient alignment of short DNA sequences to the human genome. Chang HY, et al. 2005. Robustness, scalability, and integration of a Genome Biol. 10(3):R25. wound-response gene expression signature in predicting breast cancer Lichtenstein P, et al. 2000. Environmental and heritable factors in the survival. Proc Natl Acad Sci U S A. 102(10):3738–3743. causation of cancer – analyses of cohorts of twins from Sweden, Chen FC, Li WH. 2001. Genomic divergences between humans and other Denmark, and Finland. N Engl J Med. 343(2):78–85. hominoids and the effective population size of the common ancestor Lopez-Maury L, Marguerat S, Bahler J. 2008. Tuning gene expression to of humans and chimpanzees. Am J Hum Genet. 68(2):444–456. changing environments: from rapid responses to evolutionary adap- Clayton A, et al. 1998. Cellular activation through the ligation of intercel- tation. Nat Rev Genet. 9(8):583–593. lular adhesion molecule-1. J Cell Sci. 111 (Pt 4):443–453. Mathelier A, et al. 2016. JASPAR 2016: a major expansion and update of Crespi BJ. 2010. The origins and evolution of genetic disease risk in mod- the open-access database of transcription factor binding proﬁles. ern humans. Ann N Y Acad Sci. 1206:80–109. Nucleic Acids Res. 44(D1):D110–D115. Crespi BJ, Summers K. 2006. Positive selection in the evolution of cancer. McCarthy DJ, Chen YS, Smyth GK. 2012. Differential expression analysis Biol Rev Camb Philos Soc. 81(3):407–424. of multifactor RNA-Seq experiments with respect to biological varia- de Nadal E, Ammerer G, Posas F. 2011. Controlling gene expression in tion. Nucleic Acids Res. 40(10):4288–4297. response to stress. Nat Rev Genet. 12(12):833–845. McClure HM. 1973. Tumors in nonhuman primates: observations during a Di Renzo F, et al. 2006. The murine Pou6f2 gene is temporally and spatially six-year period in the Yerkes primate center colony. Am J Phys regulated during kidney embryogenesis and its human homolog is Anthropol. 38(2):425–429. overexpressed in a subset of Wilms tumors. J Pediatr Hematol Medawar PB. 1952. An unsolved problem of biology. London: Published Oncol. 28(12):791–797. for the College by H.K.Lewis. Dvorak HF. 1986. Tumors: wounds that do not heal. Similarities between Nielsen R, et al. 2005. A scan for positively selected genes in the genomes tumor stroma generation and wound healing. N Engl J Med. of humans and chimpanzees. PLoS Biol. 3(6):e170. 315(26):1650–1659. Olson MV, Varki A. 2003. Sequencing the chimpanzee genome: insights Eden E, Navon R, Steinfeld I, Lipson D, Yakhini Z. 2009. GOrilla: a tool for into human evolution and disease. Nat Rev Genet. 4(1):20–28. discovery and visualization of enriched GO terms in ranked gene lists. Parker SL, Tong T, Bolden S, Wingo PA. 1997. Cancer statistics, 1997. CA- BMC Bioinformatics. 10:48. Cancer J Clin. 47(1):5–27. 838 Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018 Human and Chimpanzee GBE Pond SL, Frost SD, Muse SV. 2005. HyPhy: hypothesis testing using phy- Tan G, Lenhard B. 2016. TFBSTools: an R/bioconductor package for tran- logenies. Bioinformatics 21(5):676–679. scription factor binding site analysis. Bioinformatics Puente XS, et al. 2006. Comparative analysis of cancer genes in the human 32(10):1555–1556. and chimpanzee genomes. BMC Genomics. 7:15. Tonnesen MG, Feng X, Clark RA. 2000. Angiogenesis in wound healing. Quinlan AR, Hall IM. 2010. BEDTools: a ﬂexible suite of utilities for com- J Investig Dermatol Symp Proc. 5(1):40–46. paring genomic features. Bioinformatics 26(6):841–842. Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junc- Robinson MD, McCarthy DJ, Smyth GK. 2010. edgeR: a bioconductor tions with RNA-Seq. Bioinformatics 25(9):1105–1111. package for differential expression analysis of digital gene expression Varki NM, Varki A. 2015. On the apparent rarity of epithelial cancers in data. Bioinformatics 26(1):139–140. captive chimpanzees. Philos Trans R Soc B-Biol Sci. Sandelin A, Alkema W, Engstro ¨ m P, Wasserman WW, Lenhard B. 2004. 370(1673):20140225. JASPAR: an open-access database for eukaryotic transcription factor Whitﬁeld ML, et al. 2002. Identiﬁcation of genes periodically expressed in binding proﬁles. Nucleic Acids Res. 32(Database issue):D91–D94. the human cell cycle and their expression in tumors. Molec Biol Cell. Schmidt RE. 1975. Systemic pathology of chimpanzees. J Med Primatol. 13(6):1977–2000. 7(5):274–318. Williams GC. 1957. Pleiotropy, natural-selection, and the evolution of se- Scott GBD. 1992. Comparative primate pathology. Oxford, New York: nescence. Evolution 11(4):398–411. Oxford University Press. Wray GA, et al. 2003. The evolution of transcriptional regulation in eukar- Seibold HR, Wolf RH. 1973. Neoplasms and proliferative lesions in 1065 yotes. Mol Biol Evol. 20(9):1377–1419. nonhuman primate necropsies. Lab Anim Sci. 23(4):533–539. Wu S, Powers S, Zhu W, Hannun YA. 2016. Substantial contribution of Shibata Y, et al. 2012. Extensive evolutionary changes in regulatory ele- extrinsic risk factors to cancer development. Nature 529(7584):43–47. ment activity during human origins are associated with altered gene Yadav L, Puri N, Rastogi V, Satpute P, Sharma V. 2015. Tumour angio- expression and positive selection. PLoS Genet. 8(6):e1002789. genesis and angiogenic inhibitors: a review. J Clin Diagn Res. Song L, Crawford GE. 2010. DNase-seq: a high-resolution technique for 9(6):XE01–XE05. mapping active gene regulatory elements across the genome from Zerbino DR, et al. 2018. Ensembl 2018. Nucleic Acids Res. mammalian cells. Cold Spring Harb Protoc. 2010(2):pdb prot5384. 46(D1):D754–D761. Stearns SC, Medzhitov R. 2016. Evolutionary medicine. Sunderland (MA): Zhang Y, et al. 2008. Model-based analysis of ChIP-Seq (MACS). Genome Sinauer Associates, Inc., Publishers. Biol. 9(9):R137. Subramanian A, et al. 2005. Gene set enrichment analysis: a knowledge- Zhao S, Fung-Leung WP, Bittner A, Ngo K, Liu X. 2014. Comparison of based approach for interpreting genome-wide expression proﬁles. RNA-Seq and microarray in transcriptome proﬁling of activated T cells. Proc Natl Acad Sci U S A. 102(43):15545–15550. PLoS ONE. 9(1):e78644. t Hoen PA, et al. 2008. Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over ﬁve microarray platforms. Nucleic Acids Res. 36(21):e141. Associate editor: Naruya Saitou Genome Biol. Evol. 10(3):826–839 doi:10.1093/gbe/evy041 Advance Access publication March 5, 2018 839 Downloaded from https://academic.oup.com/gbe/article-abstract/10/3/826/4920862 by Ed 'DeepDyve' Gillespie user on 16 March 2018
Genome Biology and Evolution – Oxford University Press
Published: Mar 1, 2018
It’s your single place to instantly
discover and read the research
that matters to you.
Enjoy affordable access to
over 18 million articles from more than
15,000 peer-reviewed journals.
All for just $49/month
Query the DeepDyve database, plus search all of PubMed and Google Scholar seamlessly
Save any article or search result from DeepDyve, PubMed, and Google Scholar... all in one place.
All the latest content is available, no embargo periods.
“Whoa! It’s like Spotify but for academic articles.”@Phil_Robichaud