Evaluation of functionality for serine and threonine phosphorylation with different evolutionary ages in human and mouse

Abstract

Background: Rapid evolution of phosphorylation sites could provide raw material for natural selection by rewiring the regulation of signaling pathways. However, a large fraction of phosphorylation sites has been suggested to be non-functional. Although newly arising phosphorylation sites with little functional implication prevail in fungi, the evolutionary behavior of vertebrate phosphorylation sites has remained elusive.

Results: In this study, we evaluated the functionality of human and mouse phosphorylation sites by dividing them into old, median and young age groups based on the phylogeny of vertebrates. We found that sites in the old group were more likely to be functional and involved in signaling pathways than those in the young group. A smaller proportion of sites in the young group originated from aspartate/glutamate, which could restore the ancestral functions. In addition, both the phosphorylation level and breadth increased with evolutionary age. Similar to the situation in fungi, these results implied that newly emerged phosphorylation sites in vertebrates were also more likely to be non-functional, especially for serine and threonine phosphorylation in disordered regions.

Conclusions: This study provided not only insights into the dynamics of phosphorylation evolution in vertebrates, but also new clues for identifying functional phosphorylation sites in massive noisy data.

Keywords: Phosphorylation, Evolution, Function

* Correspondence: yxli@sibs.ac.cn; zwang01@sibs.ac.cn
Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People’s Republic of China
Full list of author information is available at the end of the article

Miao et al. BMC Genomics (2018) 19:431

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Genetic variations are the primary source of new phenotypes in evolution [1–3]. With the great advances in high-throughput genome sequencing, substantial genomic variation across species has been characterized [4–7]. However, how these variations positively influence phenotypic outcomes remains largely unexplored. Changes in protein phosphorylation, the most ubiquitous post-translational modification conducting cellular signals, have drawn much attention [8–10]. Although phosphorylation sites (phosphosites) are on average more conserved than their non-phosphorylated counterparts, they diverge rapidly among species because of their structural preference for disordered regions [11–15]. This high diversity of phosphosites could rewire cellular signal transduction and the response to the environment, providing potential material upon which natural selection could act. Based on the observation that genetic interactions between kinases and substrates were altered at a higher rate than those of average genes among three yeast species, it was postulated that the evolution of phosphorylation regulation contributes crucially to phenotypic fitness, just as transcriptional regulation does [2].

Nonetheless, another pervasive consequence of the rapid turnover of phosphosites would be that a large percentage of phosphosites are non-functional [16, 17]. Although there is little empirical evidence, this is possible in principle because of the limited specificity of kinases. The hypothesis was primarily supported by the fact that a large proportion (about 65%) of phosphosites with no characterized function evolved at a rate similar to that of non-phosphosites in disordered regions [16, 17]. Moreover, the evolutionary conservation of phosphosites was positively associated with phosphorylation stoichiometry but negatively associated with protein abundance in yeast, which could be explained by more accidental phosphorylation in abundant proteins [18]. A recent study presented a more comprehensive landscape of phosphorylation across 18 fungal species [19]. By tracking the phylogeny of the phosphosites, that study revealed that only about 2% of the phosphosites had been preserved for longer than 700 million years, whereas 69% had been gained within the last 18 million years. In particular, relative to the recently acquired phosphosites, the ancient ones were more likely to be functionally important, which implies an enrichment of noisy phosphorylation among the recently acquired sites.

The prevalence of young phosphosites without apparent function was observed in fungi, but the evolutionary pattern in the more complex vertebrate species has remained elusive. After all, the sophisticated developmental programs and the communication among numerous cells in vertebrates depend strongly on signaling pathways mediated by phosphorylation [10, 20–24]. In fact, phosphosites involved in vertebrate-specific functional modules (VFMs) were shown to be even more conserved than those in basic functional modules (BFMs) [25]. In this study, using large-scale data on human and mouse phosphosites, we aimed to evaluate the functionality of phosphosites with different evolutionary ages across vertebrates.

Results

Three evolutionary age groups of phosphosites

Most phosphorylation data on vertebrates were generated from model organisms such as human and mouse [15, 26]. Because phosphorylation has been studied unevenly across species, it was infeasible to directly compare phosphorylation status across species [19]. However, since the presence of a phospho-acceptor amino acid (serine, threonine or tyrosine) is essential for phosphorylation [8, 27], it was reasonable to assume that a later appearance of the acceptor amino acid in evolution represents a younger phosphorylation status. In this study, we separately collected 196,093 human and 62,274 mouse phosphosites from the PhosphoSitePlus database (version: 2015.12) [26]; 5747 phosphosites were shared by the two species (Additional file 1: Figure S1). We reconstructed the ancestral sequences of the phosphorylated proteins across 8 vertebrates (see Methods). Based on the first appearance time of the phospho-acceptor amino acids in the vertebrate species tree, we divided the phosphosites into three age groups: old, median and young (Fig. 1a). Phosphosites in the old group were conserved across all the vertebrates and emerged earlier than 435 million years ago (MYA), whereas phosphosites in the young group were generated during the divergence of the mammalian species (about 96 MYA) (Fig. 1a). Phosphosites with an evolutionary age between those of the old and young groups were assigned to the median group.

The basic characteristics of phosphosites in the three age groups were analyzed. Among the three phospho-acceptor amino acids, serine and threonine showed a relatively balanced distribution among the age groups, while tyrosine was much more enriched in the old group (Fig. 1b). Given the different biochemical properties of serine/threonine and tyrosine, and the fact that they are phosphorylated by different types of kinases, our results indicated that they have different evolutionary dynamics. In each age group, more phosphosites were located in disordered regions than in ordered regions, consistent with the knowledge that such sites are more likely to be recognized by kinases on the protein surface [14, 15, 28, 29] (Fig. 1b). In the following analyses, we mainly focused on serine/threonine phosphosites in disordered regions, from which most of the evolutionary diversity across species resulted.

Fig. 1 Three evolutionary age groups of phosphosites. (a) The phosphosites were divided into old, median and young age groups based on the phylogeny of 8 vertebrate species. (b) The distributions of the three kinds of phosphosites and of local structures in the age groups. pS: phosphoserine; pT: phosphothreonine; pY: phosphotyrosine

Functional annotations of phosphosites in different evolutionary age groups

To evaluate the functional potential of phosphosites in different age groups, we took several types of annotation into consideration. Firstly, we separately gathered human and mouse known-functional phosphosites from the PhosphoSitePlus database and polymorphic phosphosites in populations from dbSNP [26, 30] (see Methods). We compared the proportions of known-functional and polymorphic phosphosites among the three age groups. For both human and mouse, the proportion of known-functional phosphosites rose significantly with evolutionary age (human: p-value < 0.01, mouse: p-value < 0.01, Chi-squared test), while the proportion of polymorphic phosphosites decreased significantly with evolutionary age (human: p-value < 0.01, mouse: p-value < 0.01, Chi-squared test, Fig. 2a). The same trends were also significant for human tyrosine phosphosites and for serine/threonine phosphosites in ordered regions (Additional file 1: Figure S2). These results agreed well with the findings in yeasts and suggested that functional phosphosites in vertebrates tend to be evolutionarily older. Meanwhile, as polymorphic sites are more likely to be evolutionarily neutral, our results implied that a higher proportion of young phosphosites might be non-functional.

Next, because the functional annotation of phosphosites is limited, we took into consideration the proteins in which the phosphosites reside. Wang et al. divided phosphorylated proteins into two broad modules: basic functional modules (BFMs), which are shared by vertebrates and non-vertebrates, and vertebrate-specific functional modules (VFMs) [25]. In this study, phosphosites in VFMs showed higher evolutionary conservation than those in BFMs in both human and mouse (p-value = 1.50e-09 and p-value < 2.2e-16, respectively, Wilcoxon rank-sum test, Additional file 1: Figure S3A). In particular, a larger percentage of young phosphosites was distributed in the BFMs than in the VFMs (human: p-value = 0.035, mouse: p-value = 9.17e-10, Chi-squared test, Fig. 2b). In addition, a smaller proportion of young phosphosites in the BFMs were known-functional compared with those in the VFMs (human: p-value = 5.63e-07, mouse: p-value = 1.87e-07, Fisher's exact test, Additional file 1: Figure S3B). As the VFMs contain many signaling-pathway-related proteins, in which functional phosphosites are more likely to be involved, these results also implied that phosphosites in the young group were less functionally important than those in the old group.

Fig. 2 Functional annotations of phosphosites across age groups. (a) The known-functional and polymorphic sites in disordered regions were differently distributed among the three age groups for human and mouse. The functional phosphosites were more likely to be older and the polymorphic sites were more likely to be younger (p-value < 0.01, Chi-squared test). (b) The proportion of age groups in disordered regions varied between BFMs and VFMs. In both human and mouse, young phosphosites occupied a larger proportion in the BFMs than in the VFMs (p-value = 0.035 and p-value < 0.01, respectively, Chi-squared test). BFMs: basic functional modules; VFMs: vertebrate-specific functional modules

Enrichment analysis of ancestral state for the phosphosites

It has been reported that significantly more phosphosites than corresponding non-phosphosites evolved from aspartate/glutamate (Asp/Glu) residues [31, 32]. Because the ancestral state cannot be inferred for the old group, which is conserved across all the vertebrates, we considered only phosphosites in the median and young groups for this analysis. Following Pearlman et al. [31], we performed an enrichment analysis of the ancestral state of the phosphosites, taking the corresponding non-phosphorylated residues in the same age groups as the control (see Methods). In both the human and mouse datasets, we observed significantly more phosphoserines with an evolutionary transition from ancestral Asp/Glu residues in the disordered regions (Fig. 3a). However, the enrichment was not significant for phosphothreonine, phosphotyrosine or sites in ordered regions, probably because of the small sample sizes (Additional file 1: Figures S4 and S5). In addition, we discovered that lysine (Lys), a positively charged amino acid, was enriched in the transition to phosphoserine, probably because phosphorylation can also act upon Lys [33–35].

Asp/Glu, the negatively charged amino acids, can sometimes mimic the function of phosphorylated residues with their negative charge [31, 36, 37]. Thus, it was assumed that phosphorylation of residues that evolved from Asp/Glu could activate the ancestral function of proteins. Consistent with this assumption, we found a significantly higher proportion of phosphosites derived from Asp/Glu in the median group than in the young group (human: p-value < 2.2e-16, mouse: p-value < 2.2e-16, Chi-squared test, Fig. 3b). We observed similar results for the phosphosites derived from Lys (human: p-value < 2.2e-16, mouse: p-value < 2.2e-16, Chi-squared test). These results provided further evidence that younger serine phosphosites in disordered regions were more likely to be non-functional.

Fig. 3 Analysis of ancestral state of phosphosites. (a) Enrichment analysis of amino acids in the transition to phosphosites in disordered regions for human and mouse. The x-axis labels are amino acid abbreviations. Compared with the control data, three kinds of amino acids (D, E and K) were enriched in the transition to phosphosites in both human and mouse. (b) The distribution of phosphosites that evolved from different amino acids in the median and young groups. In both human and mouse, more phosphosites evolved from D/E in the median group than in the young group (p-value < 0.01, Chi-squared test). D: Aspartate; E: Glutamate; K: Lysine

Phosphorylation level and breadth among tissues

In yeasts, phosphosites of higher stoichiometry were reported to be more conserved and functional [18, 38]. However, the relationship between quantitative features and evolutionary conservation had not been systematically studied across the tissues of vertebrates. In this study, we collected 14,954 mouse phosphosites with phosphorylation levels across nine different tissues [39]. We then compared the maximum phosphorylation level and the phosphorylation breadth (i.e., the number of tissues in which the phosphorylation was detected) among the three age groups (see Methods). As the quantitative features of phosphosites might be confounded by those of the whole protein, we also excluded the bias resulting from protein abundance and breadth.

We found that the maximum phosphorylation level was lower in the young group than in the old group (p-value = 1.87e-10, Wilcoxon rank-sum test, Fig. 4a), whereas protein abundance showed no difference between the two groups (p-value = 0.41, Wilcoxon rank-sum test, Additional file 1: Figure S6A). We also found that the phosphorylation breadth was lower in the young group than in the old group (p-value = 7.52e-12, Wilcoxon rank-sum test, Fig. 4b). Although the protein expression breadth differed between the groups (p-value < 2.2e-16, Wilcoxon rank-sum test, Additional file 1: Figure S6B), ANOVA suggested that the contribution of phosphorylation breadth could not be accounted for by protein expression breadth (p-value < 0.01) when the two factors were considered simultaneously. Furthermore, a larger proportion of phosphosites had both a low phosphorylation level and a low breadth in the young group than in the old group (p-value = 5.1e-04, Chi-squared test, Additional file 1: Figure S7). In a stepwise regression, the contribution of phosphorylation level could be accounted for by the breadth, reflecting the correlation between phosphorylation level and breadth (r = 0.46, Pearson’s correlation). These results indicated that both the phosphorylation level and breadth tended to increase with evolutionary age, consistent with older sites being more likely to carry essential functions.

Interestingly, phosphosites that evolved from Asp/Glu displayed a higher phosphorylation level and breadth than those that originated from other amino acids (p-value = 1.89e-08 and p-value = 1.12e-05, respectively, Wilcoxon rank-sum test, Fig. 4c and d). This supported our argument that this type of phosphosite was more likely to be functional. However, the trend was not observed for the phosphosites that originated from Lys (p-value = 0.31 and p-value = 0.24, respectively, Wilcoxon rank-sum test). Thus, quantitative features would also be helpful for prioritizing functional phosphosites.

Fig. 4 Phosphorylation level and breadth in mouse. (a) The distribution of the maximum phosphorylation level for phosphosites in the three age groups. The phosphorylation level was lower in the young group than in the old group (p-value = 1.87e-10, Wilcoxon rank-sum test). (b) The phosphorylation breadth in the three age groups. A higher proportion of phosphosites in the young group was detected in fewer tissues compared with the old group (p-value = 7.52e-12, Wilcoxon rank-sum test). (c) The phosphorylation breadth of phosphosites evolving from different amino acids. A larger share of phosphosites evolving from D/E was detected in more tissues than other kinds of phosphosites. (d) The maximum phosphorylation level for phosphosites originating from different amino acids. More phosphosites originating from D/E had a high phosphorylation level than other kinds of phosphosites

Discussion

As with genetic variants, rapid changes in protein phosphorylation would play important roles in the evolution of new phenotypes. However, a large fraction of phosphosites has been speculated to be non-functional because of the non-specific recognition of kinases [16, 17]. Computational methods integrating evolutionary conservation, structural preference and kinase recognition have been proposed to identify functional phosphosites in large-scale data [40–42]. A recent study in fungi showed that young phosphosites prevail and contribute most of the noise in phosphorylation [19]. In this study, we investigated whether the evolutionary age of phosphosites was also associated with their functionality in vertebrates by dividing the phosphosites into old, median and young groups based on their emergence time in the species tree.

We found several lines of evidence that the functional potential of phosphosites increased with their evolutionary age, especially for serine and threonine in disordered regions, and this was confirmed separately in the human and mouse data. Firstly, the old group harbored a higher proportion of known-functional phosphosites and a lower proportion of polymorphic phosphosites than the young group. Secondly, in VFMs, where phosphorylation is more essential, the fraction of young phosphosites was lower than in BFMs. Thirdly, fewer of the phosphosites originating from Asp/Glu, which could mimic their ancestral functions, were evolutionarily young. Finally, the phosphorylation level and breadth among different tissues increased with evolutionary age, and the phosphorylation level was correlated with the breadth.

Consistent with the study in fungi, our results provided a comprehensive scenario for the evolution of phosphorylation. The high turnover rate in disordered regions facilitated the birth of new phosphosites. However, most of them would be non-functional and would be eliminated owing to the lack of functional constraints. This rapid trial-and-error process provided potential material for the innovation of phenotypic fitness, and only those phosphosites that acquired indispensable functions could be retained over long evolutionary time. The scenario suggests a new approach for screening out the functional sites from massive phosphosite data by combining several factors, including evolutionary age, ancestral state, phosphorylation level and breadth.

It is important to note that the insights into tyrosine phosphorylation and phosphorylation in ordered regions were limited in our study. This was because the evolutionary rates of these two types were much lower than that of serine and threonine phosphorylation in disordered regions, so that only a small amount of data was usable. However, we also observed that for both tyrosine phosphosites and serine/threonine phosphosites in ordered regions, fewer known-functional and more polymorphic phosphosites were distributed in the young group, indicating that they followed a similar pattern. Another possible concern was that we identified the ages based on the phylogeny of the phospho-acceptor, which is an upper bound on the emergence of the phosphorylation status. For example, some phosphosites in the old group might have become phosphorylated only recently. Therefore, more comprehensive phosphorylation datasets covering different species are necessary to infer the phosphorylation age more precisely.

Conclusions

In summary, the current study evaluated the functionality of human and mouse phosphosites and indicated that newly emerged vertebrate phosphosites were more likely to be non-functional, especially for serine and threonine phosphorylation in disordered regions. Our study also provided a new evolutionary scenario of phosphorylation: new phosphosites arose through the high turnover rate in disordered regions, and only those with indispensable functions could result in the innovation of phenotypic fitness over long evolutionary time. We also provided useful insights for screening out the functional sites from massive phosphosite data by combining evolutionary age, ancestral state, phosphorylation level and breadth.
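As a rough illustration of the age grouping on which the analyses above rest, the following is a minimal sketch, not the authors' implementation. It assumes that each site is represented by the reconstructed residues along the lineage from the extant species back to the vertebrate root, and that the first appearance time is the age of the oldest node in the unbroken run of phospho-acceptor residues. The function name `classify_site`, the lineage representation, the intermediate node ages in the demo, and the handling of sites whose age equals a threshold exactly are all assumptions made for illustration; only the 435 MYA and 96 MYA boundaries come from the text.

```python
from typing import Optional, Sequence, Tuple

ACCEPTORS = {"S", "T", "Y"}   # serine, threonine, tyrosine
OLD_MYA = 435.0               # from the text: old sites emerged earlier than 435 MYA
YOUNG_MYA = 96.0              # from the text: young sites arose within mammals (~96 MYA)


def classify_site(
    lineage: Sequence[Tuple[float, str]],
) -> Tuple[str, Optional[str]]:
    """Assign an age group to one site.

    lineage: (node age in MYA, reconstructed residue) pairs, ordered from the
    extant species (age 0) back to the root. Hypothetical input format.

    Returns (age group, ancestral residue), where the ancestral residue is the
    residue at the first node that is not a phospho-acceptor, or None if the
    acceptor is present all the way back to the root.
    """
    if not lineage or lineage[0][1] not in ACCEPTORS:
        raise ValueError("the extant residue must be a phospho-acceptor")

    first_age = lineage[0][0]
    ancestral = None
    for node_age, residue in lineage:
        if residue not in ACCEPTORS:
            ancestral = residue      # state just before the acceptor appeared
            break
        first_age = node_age         # acceptor still present at this node

    # Boundary handling is an assumption of this sketch.
    if first_age >= OLD_MYA:
        return "old", ancestral
    if first_age <= YOUNG_MYA:
        return "young", ancestral
    return "median", ancestral


if __name__ == "__main__":
    # Intermediate node ages (312, 352) are illustrative placeholders only.
    demo = [(0.0, "S"), (96.0, "S"), (312.0, "D"), (352.0, "D"), (435.0, "D")]
    print(classify_site(demo))
```

The ancestral residue is returned alongside the group because the ancestral-state analysis needs it only for the median and young groups; for old sites it is `None`, mirroring the statement that the ancestral state of fully conserved sites cannot be inferred.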
Methods

Classification of phosphosites

A total of 196,093 human phosphosites from 16,031 proteins and 62,274 mouse phosphosites from 9600 proteins were gathered separately from the PhosphoSitePlus database (version: v2015.12) [26]. To classify the phosphosites by evolutionary age, 8 vertebrate species were selected: H. sapiens, P. troglodytes, M. musculus, R. norvegicus, B. taurus, G. gallus, X. tropicalis and D. rerio. We built the species tree and obtained the divergence times from TimeTree.org [43]. The orthologous proteins of these species were obtained from InParanoid8, and multiple sequence alignments were performed with Clustal Omega [44, 45]. Then, the FastML.v3.1 software was used to reconstruct the ancestral sequences of the phosphorylated proteins [46, 47]. The phosphosites were filtered by the following rule: the number of orthologous sequences had to be more than 4 for the ancestral state to be inferred. Finally, we obtained 115,780 human phosphosites and 42,244 mouse phosphosites (Additional file 1: Table S1).

Local structure

The VSL2 software was used to identify whether phosphosites lay in ordered or disordered regions [48]. For a given protein sequence, it calculates a disorder score (between 0 and 1) for each amino acid based on 26 amino acid-based features. Phosphosites with a score greater than 0.5 were assigned to disordered regions; otherwise, the phosphosites were assigned to ordered regions.

Functional annotation

The human and mouse functional phosphosites were collected from the file ‘Regulatory_sites.gz’ in the PhosphoSitePlus database (version: v2015.12) [26]. We obtained human and mouse polymorphic phosphosites by overlapping the phosphosites with human and mouse SNPs from dbSNP138 annotated using ANNOVAR (version: 20150619, GRCh38 and GRCm38) [49]. The definition of BFMs and VFMs for phosphorylated proteins followed Wang et al. [25]. Rate4Site [50] was used to calculate the evolutionary rate of phosphosites, and the rates were compared between BFMs and VFMs.

Enrichment of ancestral state

We explored whether phosphosites in the median and young groups originated from particular amino acid residues significantly more often than the corresponding non-phosphosites. The ancestral state of the sites was retrieved from the FastML.v3.1 results. For the control dataset, we collected serine, threonine and tyrosine sites that were not known to be phosphorylated, from the same proteins and within the same age groups. The control dataset was filtered by the following criteria: 1) sites overlapping with the verified and predicted phosphosites from dbPTM were excluded; 2) potential phosphosites predicted by NetworKIN with a score greater than 2 were excluded. Following the method of Pearlman et al., we calculated the ratio and p-value of enrichment for each ancestral amino acid using 1000 bootstrap samples. For every sample, the distribution of sites in disordered regions was balanced between the phosphosite and control datasets.

Phosphorylation quantification

A total of 14,954 mouse phosphosites with quantification data in nine tissues were gathered [39]. After overlapping them with the age groups in disordered regions, we obtained 2947, 2359 and 2310 sites in the old, median and young groups, respectively. Among the age groups, we compared the maximum phosphorylation level across the tissues and the number of tissues in which the phosphosites were detected. To exclude the influence of protein abundance and breadth, ANOVA was performed taking both the phosphorylation and the protein quantification into account.

Additional files

Additional file 1: Figure S1. The shared phosphosites between human and mouse. Figure S2. The functional annotations for human tyrosine phosphosites and serine/threonine phosphosites in ordered regions. Figure S3. The difference of conservation and functional annotations between BFMs and VFMs. Figure S4. Enrichment analysis of amino acids in the transition to phosphosites for human. Figure S5. Enrichment analysis of amino acids in the transition to phosphosites for mouse. Figure S6. The distribution of protein abundance and breadth in mouse. Figure S7. The phosphosites with both low phosphorylation level and breadth significantly enriched in young group compared with old group. Table S1. The number distribution of human and mouse phosphosites. (DOCX 226 kb)

Abbreviations

phosphosites: phosphorylation sites; VFMs: vertebrate-specific functional modules; BFMs: basic functional modules; MYA: million years ago; Asp: aspartate; Glu: glutamate; Lys: lysine.

Acknowledgements

The authors thank the members of the YL lab for helpful discussion.

Funding

This work was supported by the National Key R&D Program of China (2017YFA0505500, 2016YFC0901704), the National Natural Science Foundation of China (31301032), the Youth Innovation Promotion Association CAS (2017325) and the Shanghai Postdoctoral Scientific Program (13R21417300). The funding bodies had no role in the design of the study and collection, analysis and interpretation of data and in writing the manuscript.

Availability of data and materials

The phosphorylation datasets analyzed during the current study are available in the PhosphoSitePlus database (https://www.phosphosite.org/homeAction.action) [26].

Authors’ contributions

BM was responsible for the evolutionary analysis of the phosphorylation data. QX carried out the analysis of polymorphic phosphorylation sites. WC carried out the data collection and interpretation. ZW conceived this research and drafted the manuscript. YL conceived the research and revised the paper. All authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Author details

Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People’s Republic of China.
University of Chinese Academy of Sciences, Beijing, People’s Republic of China.
School of Life Science and Technology, Tongji University, Shanghai, People’s Republic of China.
Shanghai Center for Bioinformation Technology, Shanghai Industrial Technology Institute, Shanghai, People’s Republic of China.
Collaborative Innovation Center for Genetics and Development, Fudan University, Shanghai, People’s Republic of China.

Received: 22 December 2017 Accepted: 12 April 2018

References

1. Wapinski I, Pfeffer A, Friedman N, Regev A. Natural history and evolutionary principles of gene duplication in fungi. Nature. 2007;449(7158):54–61.
2. Beltrao P, Trinidad JC, Fiedler D, Roguev A, Lim WA, Shokat KM, Burlingame AL, Krogan NJ. Evolution of phosphoregulation: comparison of phosphorylation patterns across yeast species. PLoS Biol. 2009;7(6):e1000134.
3. Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. Genetic properties influencing the evolvability of gene expression. Science. 2007;317(5834):118.
4. Soon WW, Hariharan M, Snyder MP. High-throughput sequencing for biology and medicine. 2013.
5. Kircher M, Kelso J. High-throughput DNA sequencing–concepts and limitations. Bioessays News & Reviews in Molecular Cellular & Developmental Biology. 2010;32(6):524–36.
6. Purcell S, Neale B, Toddbrown K, Thomas L, Ferreira MA, Bender D, Maller J, Sklar P, de Bakker PI, Daly MJ. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.
7. Durbin R. 1000 genome project: a map of human genome variation from population scale sequencing. 2010.
8. Cohen P. The origins of protein phosphorylation. Nat Cell Biol. 2002;4(5):E127.
9. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell. 2006;127(3):635.
10. Johnson LN. The regulation of protein phosphorylation. Biochem Soc Trans. 2009;37(4):627–41.
11. Malik R, Nigg EA, Körner R. Comparative conservation analysis of the human mitotic phosphoproteome. Bioinformatics. 2008;24(12):1426.
12. Chen SC, Chen FC, Li WH. Phosphorylated and nonphosphorylated serine and threonine residues evolve at different rates in mammals. Molecular Biology & Evolution. 2010;27(11):2548–54.
13. Boekhorst J, Breukelen BV, Heck AJ, Snel B. Comparative phosphoproteomics reveals evolutionary and functional conservation of phosphorylation across eukaryotes. Genome Biol. 2008;9(10):R144.
14. Jiménez JL, Hegemann B, Hutchins JR, Peters JM, Durbin R. A systematic comparative and structural analysis of protein phosphorylation sites based on the mtcPTM database. Genome Biol. 2007;8(5):1–20.
15. Gnad F, Ren S, Cox J, Olsen JV, Macek B, Oroshi M, Mann M. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol. 2007;8(11):R250.
16. Lienhard GE. Non-functional phosphorylations? Trends Biochem Sci. 2008;33(8):351.
17. Landry CR, Levy ED, Michnick SW. Weak functional constraints on phosphoproteomes. Trends in Genetics Tig. 2009;25(5):193.
18. Tan CS, Bader GD. Phosphorylation sites of higher stoichiometry are more conserved. Nat Methods. 2012;9(4):317; author reply 318.
19. Romain A, RAR-M S, Haas KM, Hsu JI, Viéitez C, Solé C, Swaney DL, Stanford LB, Liachko I, Böttcher R, Dunham MJ, de Nadal E, Posas F, Beltrao P, Villén J. Evolution of protein phosphorylation across 18 fungal species. Science. 2016;354(6309):229–32.
20. Ubersax JA, Ferrell JE Jr. Mechanisms of specificity in protein phosphorylation. Nat Rev Mol Cell Biol. 2007;8(7):530–41.
21. Pawson T, Scott JD. Protein phosphorylation in signaling–50 years and counting. Trends Biochem Sci. 2005;30(6):286.
22. Karin M. Signal transduction from the cell surface to the nucleus through the phosphorylation of transcription factors. Curr Opin Cell Biol. 1994;6(3):
23. Johnson LN, Barford D. The effects of phosphorylation on the structure and function of proteins. Annual Review of Biophysics & Biomolecular Structure. 1993;22(22):199.
24. Hardie DG. AMP-activated/SNF1 protein kinases: conserved guardians of cellular energy. Nat Rev Mol Cell Biol. 2007;8(10):774–85.
25. Wang Z, Ding G, Geistlinger L, Li H, Liu L, Zeng R, Tateno Y, Li Y. Evolution of protein phosphorylation for distinct functional modules in vertebrate genomes. Mol Biol Evol. 2011;28(3):1131–40.
26. Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, Latham V, Sullivan M. PhosphoSitePlus: a comprehensive resource for investigating the structure and function of experimentally determined post-translational modifications in man and mouse. Nucleic Acids Res. 2012;40(Database issue):261–70.
27. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science. 2002;298(5600):1912.
28. Iakoucheva LM, Radivojac P, Brown CJ, O'Connor TR, Sikes JG, Obradovic Z, Dunker AK. The importance of intrinsic disorder for protein phosphorylation. Nucleic Acids Res. 2004;32(3):1037.
29. Davis FP. Phosphorylation at the Interface. Structure. 2011;19(12):1726.
30. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2000;29(1):308.
31. Pearlman SM, Serber Z, Ferrell JE Jr. A mechanism for the evolution of phosphorylation sites. Cell. 2011;147(4):934–46.
32. Kurmangaliyev YZ, Goland A, Gelfand MS. Evolutionary patterns of phosphorylated serines. Biol Direct. 2011;6(1):1–7.
33. Besant P, Attwood P, Piggott M. Focus on Phosphoarginine and Phospholysine. Curr Protein Pept Sci. 2009;10(6):536–50.
34. Bertran-Vicente J, Serwa RA, Schumann M, Schmieder P, Krause E, Hackenberger CP. Site-specifically phosphorylated lysine peptides. J Am Chem Soc. 2014;136(39):13622–8.
35. Cieśla J, Frączyk T, Rode W. Phosphorylation of basic amino acid residues in proteins: important but easily missed. Acta Biochim Pol. 2011;58(2):137.
36. Thorsness PE, Koshland DE. Inactivation of isocitrate dehydrogenase by phosphorylation is mediated by the negative charge of the phosphate. J Biol Chem. 1987;262(22):10422–5.
37. Cooper JA, Sefton BM, Hunter T. [42] Detection and quantification of phosphotyrosine in proteins. Methods Enzymol. 1983;99(99):387.
38. Levy ED, Michnick SW, Landry CR. Protein abundance is key to distinguish promiscuous from functional phosphorylation based on evolutionary information. Philos Trans R Soc Lond Ser B Biol Sci. 2012;367(1602):2594–
39. Huttlin EL, Jedrychowski MP, Elias JE, Goswami T, Rad R, Beausoleil SA, Villén J, Haas W, Sowa ME, Gygi SP. A tissue-specific atlas of mouse protein phosphorylation and expression. Cell. 2010;143(7):1174.
40. Xiao Q, Miao B, Jie B, Zhen W, Li Y. Corrigendum: prioritizing functional phosphorylation sites based on multiple feature integration. Sci Rep. 2016;6:24735.
41. Shen N, Wang Z, Ge D, Zhang G, Li Y. Prediction of functional phosphorylation sites by incorporating evolutionary information. Protein & Cell. 2012;3(9):675–90.
42. Beltrao P, Albanèse V, Kenner LR, Swaney DL, Burlingame A, Villén J, Lim WA, Fraser JS, Frydman J, Krogan NJ.
Systematic functional prioritization of protein posttranslational modifications. Cell. 2012;150(2):413.
43. Hedges SB, Marin J, Suleski M, Paymer M, Kumar S. Tree of life reveals clock-like speciation and diversification. Molecular Biology & Evolution. 2015;32(4):835–45.
44. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, Mcwilliam H, Remmert M, Söding J. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol Syst Biol. 2011;7(7):539.
45. Sonnhammer EL, G Ö. InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucleic Acids Res. 2015;43(Database issue):234–239.
46. Pupko T, Pe'Er I, Shamir R, Graur D. A fast algorithm for joint reconstruction of ancestral amino acid sequences. Molecular Biology & Evolution. 2000;17(6):890.
47. Haim A, Osnat P, Adi DF, Ofir C, Gina C, Oren Z, Tal P. FastML: a web server for probabilistic reconstruction of ancestral sequences. Nucleic Acids Res. 2012;40(Web Server issue):580–4.
48. Peng K, Radivojac P, Vucetic S, Dunker AK, Obradovic Z. Length-dependent prediction of protein intrinsic disorder. BMC Bioinformatics. 2006;7(1):1–17.
49. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164.
50. Tal Pupko REB, Mayrose I, Glaser F, Ben-Tal N. Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics. 2002;18:S71–7.

BMC Genomics, Springer Journals

Evaluation of functionality for serine and threonine phosphorylation with different evolutionary ages in human and mouse

Publisher
BioMed Central
Copyright
Copyright © 2018 by The Author(s).
Subject
Life Sciences; Life Sciences, general; Microarrays; Proteomics; Animal Genetics and Genomics; Microbial Genetics and Genomics; Plant Genetics and Genomics
eISSN
1471-2164
D.O.I.
10.1186/s12864-018-4661-6

Abstract

Background: Rapid evolution of phosphorylation sites could provide raw material for natural selection by rewiring the regulation of signaling pathways. However, a large fraction of phosphorylation sites has been suggested to be non-functional. Although newly arising phosphorylation sites with little functional implication prevail in fungi, the evolutionary behavior of vertebrate phosphorylation sites has remained elusive.

Results: In this study, we evaluated the functionality of human and mouse phosphorylation sites by dividing them into old, median and young age groups based on the phylogeny of vertebrates. We found that sites in the old group were more likely to be functional and involved in signaling pathways than those in the young group. A smaller proportion of sites in the young group originated from aspartate/glutamate; phosphorylation of such sites could restore the ancestral function. In addition, both the phosphorylation level and the phosphorylation breadth increased with evolutionary age. Similar to the situation in fungi, these results imply that newly emerged phosphorylation sites in vertebrates are also more likely to be non-functional, especially for serine and threonine phosphorylation in disordered regions.

Conclusions: This study provides not only insights into the dynamics of phosphorylation evolution in vertebrates, but also new clues for identifying functional phosphorylation sites in massive noisy data.

Keywords: Phosphorylation, Evolution, Function

* Correspondence: yxli@sibs.ac.cn; zwang01@sibs.ac.cn
Key Lab of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China. Full list of author information is available at the end of the article.

© The Author(s). 2018 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Miao et al. BMC Genomics (2018) 19:431

Background

Genetic variations are the primary sources that contribute to the evolution of new phenotypes [1–3]. With the great advances in high-throughput genome sequencing, substantial genomic variation across different species has been characterized [4–7]. However, how these variations positively influence phenotypic outcomes remains largely unexplored. Changes in protein phosphorylation, the most ubiquitous post-translational modification conducting cellular signals, have drawn much attention [8–10]. Although phosphorylation sites (phosphosites) are on average more conserved in evolution than their non-phosphorylated counterparts, they exhibit rapid divergence among species because of their structural preference for disordered regions [11–15]. The high diversity of phosphosites could rewire cellular signal transduction and the response to the environment, providing potential material that natural selection could act upon. Based on the observation that genetic interactions between kinases and substrates were altered at a higher rate than those of average genes among three species of yeast, it was postulated that the evolution of phosphorylation regulation makes a crucial contribution to phenotypic fitness, just as transcriptional regulation does [2].

Nonetheless, another pervasive consequence of the rapid turnover of phosphosites would be that a large percentage of phosphosites are non-functional [16, 17]. Though there is little empirical evidence, this is possible in principle because of the limited specificity of kinases. The hypothesis was primarily supported by the fact that a large proportion (about 65%) of phosphosites with no characterized function evolved at a rate similar to that of non-phosphosites in disordered regions [16, 17]. Moreover, the evolutionary conservation of phosphosites was positively associated with phosphorylation stoichiometry but negatively associated with protein abundance in yeast, which could be explained by more accidental phosphorylation in abundant proteins [18]. A recent study presented a more comprehensive landscape of phosphorylation across 18 fungal species [19]. By tracking the phylogeny of the phosphosites, that study revealed that while only about 2% of the phosphosites had been preserved for longer than 700 million years, 69% had been gained within the last 18 million years. In particular, ancient phosphosites were more likely to be functionally important than recently acquired ones, which implies an enrichment of noisy phosphorylation among the latter.

The prevalence of young phosphosites with silent functions was observed in fungi, but the evolutionary pattern in the more complex vertebrate species was elusive. After all, the sophisticated developmental program and the communication among numerous cells in vertebrate species strongly depend on signaling pathways mediated by phosphorylation [10, 20–24]. In fact, it was shown that phosphosites involved in vertebrate-specific functional modules (VFMs) were even more conserved than those in basic functional modules (BFMs) [25]. In this study, using large-scale data on human and mouse phosphosites, we aimed to evaluate the functionality of phosphosites with different evolutionary ages across vertebrates.

Results

Three evolutionary age groups of phosphosites

Most phosphorylation data on vertebrates were generated from model organisms, such as human and mouse [15, 26]. Because phosphorylation studies are unevenly distributed among species, it was infeasible to directly compare the phosphorylation status across different species [19]. However, because the presence of a phospho-acceptor amino acid (serine, threonine or tyrosine) is essential for phosphorylation [8, 27], it is reasonable to assume that a later appearance of those amino acids in evolution represents a younger phosphorylation status. In this study, we separately collected 196,093 human and 62,274 mouse phosphosites from the PhosphoSitePlus database (version: 2015.12) [26]; 5747 phosphosites were shared by the two species (Additional file 1: Figure S1). We reconstructed the ancestral sequences of the phosphorylated proteins across 8 vertebrates (see Methods). Based on the first appearance time of the phospho-acceptor amino acids in the vertebrate species tree, we divided the phosphosites into three age groups: old, median and young (Fig. 1a). Phosphosites in the old group were conserved across all the vertebrates and emerged earlier than 435 million years ago (MYA), whereas phosphosites in the young group were generated during the divergence of the mammal species (about 96 MYA) (Fig. 1a). Phosphosites with an evolutionary age between those of the old and young groups were assigned to the median group.

We then analyzed the basic characteristics of the phosphosites in the three age groups. Among the three phospho-acceptor amino acids, serine and threonine showed a relatively balanced distribution among the age groups, while tyrosine was much more enriched in the old group (Fig. 1b). Given that serine/threonine and tyrosine differ in biochemical properties and are phosphorylated by different types of kinases, these results indicate that they follow different evolutionary dynamics. In each age group, more phosphosites were located in disordered regions than in ordered regions, consistent with the knowledge that such sites are more likely to be recognized by kinases on the surface of the protein [14, 15, 28, 29] (Fig. 1b). In the following analyses, we mainly focused on serine/threonine phosphosites in disordered regions, from which most of the evolutionary diversity across species resulted.

Fig. 1 Three evolutionary age groups of phosphosites. (a) The phosphosites were divided into old, median and young age groups based on the phylogeny of 8 vertebrate species. (b) The distributions of the three kinds of phosphosites and of local structures in the age groups. pS: phospho-serine; pT: phospho-threonine; pY: phospho-tyrosine

Functional annotations of phosphosites in different evolutionary age groups

To evaluate the functional potential of phosphosites in different age groups, we took several types of annotation into consideration. Firstly, we separately gathered human and mouse known-functional phosphosites from the PhosphoSitePlus database and polymorphic phosphosites in populations from dbSNP [26, 30] (see Methods). We compared the proportions of known-functional and of polymorphic phosphosites among the three age groups. For both human and mouse, the proportion of known-functional phosphosites rose significantly with increasing evolutionary age (human: p-value < 0.01, mouse: p-value < 0.01, Chi-squared test), while the proportion of polymorphic phosphosites decreased significantly with increasing evolutionary age (human: p-value < 0.01, mouse: p-value < 0.01, Chi-squared test, Fig. 2a). The same significant trends were observed for both human tyrosine phosphosites and serine/threonine phosphosites in ordered regions (Additional file 1: Figure S2). These results agree well with the observations in yeasts and suggest that functional phosphosites in vertebrates tend to be evolutionarily older. Meanwhile, as polymorphic sites are more likely to be evolutionarily neutral, our results imply that a higher proportion of young phosphosites might be non-functional.

Next, because functional annotation of individual phosphosites is limited, we took the proteins in which the phosphosites resided into consideration. Wang et al. divided phosphorylated proteins into two broad modules: basic functional modules (BFMs), which are shared by both vertebrates and non-vertebrates, and vertebrate-specific functional modules (VFMs) [25]. In the present study, phosphosites in VFMs showed higher evolutionary conservation than those in BFMs in both human and mouse (p-value = 1.50e-09 and p-value < 2.2e-16, respectively, Wilcoxon rank-sum test, Additional file 1: Figure S3A). In particular, a larger percentage of young phosphosites was distributed in the BFMs than in the VFMs (human: p-value = 0.035, mouse: p-value = 9.17e-10, Chi-squared test, Fig. 2b). In addition, a smaller proportion of young phosphosites in the BFMs were known-functional compared with those in the VFMs (human: p-value = 5.63e-07, mouse: p-value = 1.87e-07, Fisher's exact test, Additional file 1: Figure S3B). As the VFMs contain many signaling-pathway-related proteins in which functional phosphosites are more likely to be involved, these results also imply that phosphosites in the young group are less functionally important than those in the old group.

Fig. 2 Functional annotations of phosphosites across age groups. (a) The known-functional and polymorphic sites in disordered regions were differently distributed among the three age groups for human and mouse. The functional phosphosites were more likely to be older and the polymorphic sites were more likely to be younger (p-value < 0.01, Chi-squared test). (b) The proportion of age groups in disordered regions varied between BFMs and VFMs. In both human and mouse, young phosphosites occupied a larger proportion in the BFMs than in the VFMs (p-value = 0.035 and p-value < 0.01, respectively, Chi-squared test). BFMs: basic functional modules; VFMs: vertebrate-specific functional modules

Enrichment analysis of ancestral state for the phosphosites

It has been reported that significantly more phosphosites than corresponding non-phosphosites evolved from aspartate/glutamate (Asp/Glu) residues [31, 32]. Because the ancestral state cannot be inferred for sites in the old group, which are conserved across all the vertebrates, we only considered phosphosites in the median and young groups for this analysis. Following Pearlman et al. [31], we performed an enrichment analysis of the ancestral state of the phosphosites, taking the corresponding non-phosphorylated residues in the same age groups as the control (see Methods). In both the human and the mouse dataset, we observed significantly more phospho-serine sites with an evolutionary transition from ancestral Asp/Glu residues in the disordered regions (Fig. 3a). However, the enrichment was not significant for phospho-threonine, phospho-tyrosine and sites in ordered regions, which was probably due to the small sample size (Additional file 1: Figures S4 and S5). In addition, we discovered that lysine (Lys), a positively charged amino acid, was enriched in the transition to phospho-serine, probably because phosphorylation can also act upon Lys [33–35].

Asp/Glu, the negatively charged amino acids, can sometimes mimic the function of phosphorylated residues, which also carry a negative charge [31, 36, 37]. Thus, it was assumed that phosphorylation of residues that evolved from Asp/Glu could activate the ancestral function of the protein. Consistent with this assumption, we found a significantly higher proportion of phosphosites derived from Asp/Glu in the median group than in the young group (human: p-value < 2.2e-16, mouse: p-value < 2.2e-16, Chi-squared test, Fig. 3b). We observed similar results for the phosphosites derived from Lys (human: p-value < 2.2e-16, mouse: p-value < 2.2e-16, Chi-squared test). This provides further evidence that younger serine phosphosites in disordered regions are more likely to be non-functional.

Fig. 3 Analysis of ancestral state of phosphosites. (a) Enrichment analysis of amino acids in the transition to phosphosites in disordered regions for human and mouse. The labels of the x-axis are the abbreviations of the amino acids. Compared with the control data, three kinds of amino acids (D, E and K) were enriched in the transition to phosphosites in both human and mouse. (b) The distribution of phosphosites that evolved from different amino acids in the median and young groups. In both human and mouse, more phosphosites evolved from D/E in the median group than in the young group (p-value < 0.01, Chi-squared test). D: Aspartate; E: Glutamate; K: Lysine

Phosphorylation level and breadth among tissues

In yeasts, phosphosites of higher stoichiometry were reported to be more conserved and functional [18, 38]. However, the relationship between quantitative features and evolutionary conservation had not been systematically studied across the tissues of vertebrates. In this study, we collected 14,954 mouse phosphosites with phosphorylation levels across nine different tissues [39]. We then compared the maximum phosphorylation level and the phosphorylation breadth (i.e., the number of tissues in which the phosphorylation was expressed) among the three age groups (see Methods). Meanwhile, as the quantitative features of phosphosites might be confounded by those of the whole protein, we excluded the bias resulting from protein abundance and breadth.

The maximum phosphorylation level in the young group was lower than in the old group (p-value = 1.87e-10, Wilcoxon rank-sum test, Fig. 4a), whereas the protein abundance showed no difference between the two groups (p-value = 0.41, Wilcoxon rank-sum test, Additional file 1: Figure S6A). The phosphorylation breadth was also lower in the young group than in the old group (p-value = 7.52e-12, Wilcoxon rank-sum test, Fig. 4b). Although the protein expression breadth differed between the groups (p-value < 2.2e-16, Wilcoxon rank-sum test, Additional file 1: Figure S6B), an ANOVA that took both factors into account simultaneously indicated that the effect of the phosphorylation breadth could not be explained by the protein expression breadth (p-value < 0.01). Furthermore, a larger proportion of phosphosites had both a low phosphorylation level and a low breadth in the young group than in the old group (p-value = 5.1e-04, Chi-squared test, Additional file 1: Figure S7). In a stepwise regression, the contribution of the phosphorylation level was accounted for by the breadth, reflecting the correlation between phosphorylation level and breadth (r = 0.46, Pearson's correlation). These results indicate that both the phosphorylation level and the breadth tend to increase with evolutionary age, consistent with older sites being more likely to carry essential functions.

Interestingly, phosphosites that evolved from Asp/Glu displayed a higher phosphorylation level and breadth than those originating from other amino acids (p-value = 1.89e-08 and p-value = 1.12e-05, respectively, Wilcoxon rank-sum test, Fig. 4c and d). This supports our argument that this type of phosphosite is more likely to be functional. However, the trend was not observed for the phosphosites originating from Lys (p-value = 0.31 and p-value = 0.24, respectively, Wilcoxon rank-sum test). Thus, quantitative features would also be helpful for prioritizing functional phosphosites.

Fig. 4 Phosphorylation level and breadth in mouse. (a) The distribution of the maximum level for phosphosites in the three age groups. The phosphorylation level was lower in the young group than in the old group (p-value = 1.87e-10, Wilcoxon rank-sum test). (b) The phosphorylation breadth in the three age groups. A higher proportion of phosphosites in the young group was expressed in fewer tissues compared with the old group (p-value = 7.52e-12, Wilcoxon rank-sum test). (c) The phosphorylation breadth of phosphosites evolving from different amino acids. A larger share of the phosphosites evolving from D/E was expressed in more tissues than other kinds of phosphosites. (d) The maximum phosphorylation level for phosphosites originating from different amino acids. More phosphosites originating from D/E had a high phosphorylation level than other kinds of phosphosites

Discussion

As with genetic variants, rapid changes in protein phosphorylation would play important roles in the evolution of new phenotypes. However, a large fraction of phosphosites was speculated to be non-functional because of the non-specific recognition of kinases [16, 17]. Computational methods integrating evolutionary conservation, structural preference and kinase recognition were proposed to identify functional phosphosites in large-scale data [40–42]. A recent study in fungi showed that young phosphosites prevailed and contributed the majority of the phosphorylation noise [19]. In this study, we investigated whether the evolutionary age of phosphosites is also associated with their functionality in vertebrates by dividing the phosphosites into old, median and young groups based on their emergence time in the species tree.

We found several lines of evidence that the functional potential of phosphosites increases with their evolutionary age, especially for serine and threonine in disordered regions, and each was confirmed separately in the human and mouse data. Firstly, the old group harbored a higher proportion of known-functional phosphosites and a lower proportion of polymorphic phosphosites than the young group. Secondly, in VFMs, where phosphorylation is more essential, there was a lower fraction of young phosphosites than in BFMs. Thirdly, phosphosites originating from Asp/Glu, which could mimic their ancestral functions, were less often evolutionarily young. Finally, the phosphorylation level and the breadth among different tissues increased with evolutionary age, and the phosphorylation level was correlated with the breadth.

Consistent with the study in fungi, our results provide a comprehensive scenario for the evolution of phosphorylation. The high turnover rate in disordered regions facilitates the birth of new phosphosites. However, most of them would be non-functional and would be eliminated because of the lack of functional constraints. This rapid trial-and-error process provides potential material for the innovation of phenotypic fitness, and only those phosphosites that acquire indispensable functions are retained over long evolutionary time. The scenario suggests a new approach for screening functional phosphosites out of massive phosphosite data by combining several factors, including evolutionary age, ancestral state, phosphorylation level and breadth.

It is important to note that our study offers only limited insight into tyrosine phosphorylation and phosphorylation in ordered regions. The evolutionary rates of these two types were much lower than that of serine and threonine phosphorylation in disordered regions, so that only a small amount of data was usable. Nevertheless, for both tyrosine phosphosites and serine/threonine phosphosites in ordered regions, fewer known-functional and more polymorphic phosphosites were found in the young group, indicating that they follow a similar pattern. Another possible concern is that we identified the ages based on the phylogeny of the phospho-acceptor residue, which gives only an upper bound for the emergence of the phosphorylation status. For example, some phosphosites in the old group might have become phosphorylated only recently. Therefore, more comprehensive phosphorylation datasets covering different species are necessary to infer the phosphorylation age more precisely.

Conclusions

In summary, the current study evaluated the functionality of human and mouse phosphosites and indicated that newly emerged vertebrate phosphosites are more likely to be non-functional, especially for serine and threonine phosphorylation in disordered regions. Our study also provides a new evolutionary scenario of phosphorylation: new phosphosites arise through the high turnover rate in disordered regions, and only those acquiring indispensable functions are retained over long evolutionary time and can contribute to the innovation of phenotypic fitness. Finally, we provide useful insights for screening functional phosphosites out of massive phosphosite data by combining evolutionary age, ancestral state, phosphorylation level and breadth.
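To make the age grouping and the proportion comparison reported in the Results concrete, the following minimal Python sketch assigns a site to the old, median or young group from the residues reconstructed along its lineage, and then runs a Pearson chi-squared test on a 2 x 3 table of counts. It is an illustration under stated assumptions, not the pipeline used in this study: the function names, the intermediate node ages (352, 312 and 90 MYA) and all counts are hypothetical; only the 435 MYA and 96 MYA cut-offs are taken from the text.

```python
import math

# Phospho-acceptor residues: serine, threonine, tyrosine.
ACCEPTORS = {"S", "T", "Y"}


def first_appearance(lineage):
    """Age (MYA) of the oldest node from which an acceptor residue is kept
    without interruption down to the present-day site.

    `lineage` is a list of (node_age_mya, residue) ordered from the root to
    the present-day sequence (age 0). Returns None if the present-day residue
    is not a phospho-acceptor.
    """
    age = None
    for node_age, residue in reversed(lineage):  # walk from present to root
        if residue not in ACCEPTORS:
            break
        age = node_age
    return age


def age_group(first_mya, old_cutoff=435.0, young_cutoff=96.0):
    """Map a first-appearance time to 'old', 'median' or 'young'."""
    if first_mya >= old_cutoff:
        return "old"
    if first_mya <= young_cutoff:
        return "young"
    return "median"


def chi2_2x3(row_a, row_b):
    """Pearson chi-squared statistic and p-value for a 2 x 3 count table.

    With 2 degrees of freedom the chi-squared survival function has the
    closed form exp(-x / 2), so no external statistics library is needed.
    """
    if len(row_a) != 3 or len(row_b) != 3:
        raise ValueError("both rows must contain exactly three counts")
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n = sum(col_totals)
    stat = 0.0
    for row in (row_a, row_b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            stat += (observed - expected) ** 2 / expected
    return stat, math.exp(-stat / 2.0)


# Hypothetical site: Asp at the two deepest nodes, Ser from 312 MYA onwards.
example_lineage = [(435.0, "D"), (352.0, "D"), (312.0, "S"),
                   (96.0, "S"), (90.0, "S"), (0.0, "S")]
print(age_group(first_appearance(example_lineage)))

# Hypothetical counts per age group (old, median, young).
functional = [30, 20, 10]
other = [70, 80, 90]
print(chi2_2x3(functional, other))
```

Because the phospho-acceptor residue can predate the phosphorylation itself, the age returned by such a procedure is an upper bound on the age of the phosphorylation event, as noted in the Discussion.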
Methods

Classification of phosphosites

We gathered 196,093 human phosphosites from 16,031 proteins and 62,274 mouse phosphosites from 9600 proteins separately from the PhosphoSitePlus database (version: v2015.12) [26]. To classify the phosphosites by evolutionary age, 8 vertebrate species were selected: H. sapiens, P. troglodytes, M. musculus, R. norvegicus, B. taurus, G. gallus, X. tropicalis and D. rerio. We built the species tree and obtained the divergence times from TimeTree.org [43]. The orthologous proteins of these species were retrieved from InParanoid8, and multiple sequence alignments were performed with Clustal Omega [44, 45]. The FastML.v3.1 software was then used to reconstruct the ancestral sequences of the phosphorylated proteins [46, 47]. The phosphosites were filtered by the following rule: more than 4 orthologous sequences were required to infer the ancestral state. Finally, we obtained 115,780 human phosphosites and 42,244 mouse phosphosites (Additional file 1: Table S1).

Local structure

The VSL2 software was used to assign the phosphosites to ordered or disordered regions [48]. For a given protein sequence, it calculates a disorder score (between 0 and 1) for each amino acid based on 26 amino acid-based features. Phosphosites with a score greater than 0.5 were assigned to disordered regions; otherwise, they were assigned to ordered regions.

Functional annotation

The human and mouse functional phosphosites were collected from the file 'Regulatory_sites.gz' in the PhosphoSitePlus database (file name: Regulatory_sites, version: v2015.12) [26]. We obtained the human and mouse polymorphic phosphosites by overlapping the phosphosites with human and mouse SNPs from dbSNP138 annotated using ANNOVAR (version: 20150619, GRCh38 and GRCm38) [49]. The definition of BFMs and VFMs for phosphorylated proteins followed Wang et al. [25]. Rate4Site [50] was used to calculate the evolutionary rate of the phosphosites, and the rates were compared between BFMs and VFMs.

Enrichment of ancestral state

We explored whether phosphosites in the median and young groups originated from particular amino acid residues significantly more often than the corresponding non-phosphosites. The ancestral state of those sites was retrieved from the FastML.v3.1 results. For the control dataset, we collected serine, threonine and tyrosine sites that were not known to be phosphorylated from the same proteins within the age groups. The control dataset was filtered by the following criteria: 1) sites overlapping with the verified and predicted phosphosites from dbPTM were excluded; 2) potential phosphosites predicted by NetworKIN with a score greater than 2 were excluded. Following the method of Pearlman et al., we calculated the ratio and p-value of enrichment for each ancestral amino acid by bootstrap sampling with 1000 replicates. For every sampling, the distribution of sites in disordered regions was balanced between the phosphosite and control datasets.

Phosphorylation quantification

We gathered 14,954 mouse phosphosites with quantification data in nine tissues [39]. After overlapping them with the age groups in disordered regions, we obtained 2947, 2359 and 2310 sites in the old, median and young groups, respectively. Among the age groups, we compared the maximum phosphorylation level across the tissues and the number of tissues in which the phosphosites were expressed. To exclude the influence of protein abundance and breadth, an ANOVA was performed taking both the phosphorylation and the protein quantification into account.

Additional files

Additional file 1: Figure S1. The shared phosphosites between human and mouse. Figure S2. The functional annotations for human tyrosine phosphosites and serine/threonine phosphosites in ordered regions. Figure S3. The difference of conservation and functional annotations between BFMs and VFMs. Figure S4. Enrichment analysis of amino acids in the transition to phosphosites for human. Figure S5. Enrichment analysis of amino acids in the transition to phosphosites for mouse. Figure S6. The distribution of protein abundance and breadth in mouse. Figure S7. The phosphosites with both low phosphorylation level and breadth significantly enriched in the young group compared with the old group. Table S1. The number distribution of human and mouse phosphosites. (DOCX 226 kb)

Abbreviations

phosphosites: phosphorylation sites; VFMs: vertebrate-specific functional modules; BFMs: basic functional modules; MYA: million years ago; Asp: aspartate; Glu: glutamate; Lys: lysine.

Acknowledgements

The authors thank the members of the YL lab for helpful discussion.

Funding

This work was supported by the National Key R&D Program of China (2017YFA0505500, 2016YFC0901704), the National Natural Science Foundation of China (31301032), the Youth Innovation Promotion Association CAS (2017325) and the Shanghai Postdoctoral Scientific Program (13R21417300). The funding bodies had no role in the design of the study and collection, analysis and interpretation of data and in writing the manuscript.

Availability of data and materials

The phosphorylation datasets analyzed during the current study are available in the PhosphoSitePlus database (https://www.phosphosite.org/homeAction.action) [26].

14. Jiménez JL, Hegemann B, Hutchins JR, Peters JM, Durbin R. A systematic comparative and structural analysis of protein phosphorylation sites based on the mtcPTM database. Genome Biol. 2007;8(5):1–20.
15.
Gnad F, Ren S, Cox J, Olsen JV, Macek B, Oroshi M, Mann M. PHOSIDA (phosphorylation site database): management, structural and evolutionary investigation, and prediction of phosphosites. Genome Biol. 2007;8(11):R250. Authors’ contributions 16. Lienhard GE. Non-functional phosphorylations? Trends Biochem Sci. 2008; BM was responsible for evolutionary analysis of the phosphorylation data. QX 33(8):351. carried out the polymorphic phosphorylation sites. WC carried out the data 17. Landry CR, Levy ED, Michnick SW. Weak functional constraints on collection and interpretation. ZW conceived this research and drafted the phosphoproteomes. Trends in Genetics Tig. 2009;25(5):193. manuscript. YL conceived the research and revised the paper. All authors 18. Tan CS, Bader GD. Phosphorylation sites of higher stoichiometry are more read and approved the final manuscript. conserved. Nat Methods. 2012;9(4):317. author reply 318 19. Romain A, RAR-M S, Haas KM, Hsu JI, Viéitez C, Solé C, Swaney DL, Stanford Ethics approval and consent to participate LB, Liachko I, Böttcher R, Dunham MJ, de Nadal E, Posas F, Beltrao P, Villén J. Not applicable. Evolution of protein phosphorylation across 18 fungal species. Science. 2016;354(6309):229–32. Competing interests 20. Ubersax JA, Jr FJ. Mechanisms of specificity in protein phosphorylation. Nat The authors declare that they have no competing interests. Rev Mol Cell Biol. 2007;8(7):530–41. 21. Pawson T, Scott JD. Protein phosphorylation in signaling–50 years and counting. Trends Biochem Sci. 2005;30(6):286. Publisher’sNote 22. Karin M. Signal transduction from the cell surface to the nucleus through Springer Nature remains neutral with regard to jurisdictional claims in the phosphorylation of transcription factors. Curr Opin Cell Biol. 1994;6(3): published maps and institutional affiliations. 23. Johnson LN, Barford D. The effects of phosphorylation on the structure and Author details function of proteins. 
Annual Review of Biophysics & Biomolecular Structure. Key Lab of Computational Biology, CAS-MPG Partner Institute for 1993;22(22):199. Computational Biology, Shanghai Institutes for Biological Sciences, Chinese 24. Hardie DG. AMP-activated/SNF1 protein kinases: conserved guardians of Academy of Sciences, Shanghai, People’s Republic of China. University of cellular energy. Nat Rev Mol Cell Biol. 2007;8(10):774–85. Chinese Academy of Sciences, Beijing, People’s Republic of China. School of 25. Wang Z, Ding G, Geistlinger L, Li H, Liu L, Zeng R, Tateno Y, Li Y. Evolution Life Science and Technology, Tongji University, Shanghai, People’s Republic of protein phosphorylation for distinct functional modules in vertebrate of China. Shanghai Center for Bioinformation Technology, Shanghai genomes. Mol Biol Evol. 2011;28(3):1131–40. Industrial Technology Institute, Shanghai, People’s Republic of China. 26. Hornbeck PV, Kornhauser JM, Tkachev S, Zhang B, Skrzypek E, Murray B, Collaborative Innovation Center for Genetics and Development, Fudan Latham V, Sullivan M. PhosphoSitePlus: a comprehensive resource for University, Shanghai, People’s Republic of China. investigating the structure and function of experimentally determined post- translational modifications in man and mouse. Nucleic Acids Res. 2012;40: Received: 22 December 2017 Accepted: 12 April 2018 Database issue):261–70. 27. Manning G, Whyte DB, Martinez R, Hunter T, Sudarsanam S. The protein kinase complement of the human genome. Science. 2002;298(5600):1912. References 28. Iakoucheva LM, Radivojac P, Brown CJ, O'Connor TR, Sikes JG, Obradovic Z, 1. Wapinski I, Pfeffer A, Friedman N, Regev A. Natural history and evolutionary Dunker AK. The importance of intrinsic disorder for protein phosphorylation. principles of gene duplication in fungi. Nature. 2007;449(7158):54–61. Nucleic Acids Res. 2004;32(3):1037. 2. Beltrao P, Trinidad JC, Fiedler D, Roguev A, Lim WA, Shokat KM, Burlingame 29. Davis FP. 
Phosphorylation at the Interface. Structure. 2011;19(12):1726. AL, Krogan NJ. Evolution of phosphoregulation: comparison of 30. Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, Sirotkin K. phosphorylation patterns across yeast species. PLoS Biol. 2009;7(6):e1000134. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2000; 3. Landry CR, Lemos B, Rifkin SA, Dickinson WJ, Hartl DL. Genetic properties 29(1):308. influencing the evolvability of gene expression. Science. 2007;317(5834):118. 31. Pearlman SM, Serber Z, Ferrell JE Jr. A mechanism for the evolution of 4. Soon WW. Hariharan M. High-throughput sequencing for biology and phosphorylation sites. Cell. 2011;147(4):934–46. medicine: Snyder MP; 2013. 32. Kurmangaliyev YZ, Goland A, Gelfand MS. Evolutionary patterns of 5. Kircher M, Kelso J. High-throughput DNA sequencing–concepts and phosphorylated serines. Biol Direct. 2011;6(1):1–7. limitations. Bioessays News & Reviews in Molecular Cellular & 33. Besant P, Attwood P, Piggott M. Focus on Phosphoarginine and Developmental Biology. 2010;32(6):524–36. Phospholysine. Curr Protein Pept Sci. 2009;10(6):536–50. 6. Purcell S, Neale B, Toddbrown K, Thomas L, Ferreira MA, Bender D, Maller J, 34. Bertran-Vicente J, Serwa RA, Schumann M, Schmieder P, Krause E, Sklar P, de Bakker PI, Daly MJ. PLINK: a tool set for whole-genome Hackenberger CP. Site-specifically phosphorylated lysine peptides. J Am association and population-based linkage analyses. Am J Hum Genet. 2007; Chem Soc. 2014;136(39):13622–8. 81(3):559–75. 35. Cieśla J, Frączyk T, Rode W. Phosphorylation of basic amino acid residues in 7. Durbin R: 1000 genome project: a map of human genome variation from proteins: important but easily missed. Acta Biochim Pol. 2011;58(2):137. population scale sequencing. 2010. 36. Thorsness PE, Koshland DE. Inactivation of isocitrate dehydrogenase by 8. Cohen P. The origins of protein phosphorylation. Nat Cell Biol. 
2002;4(5): phosphorylation is mediated by the negative charge of the phosphate. J E127. Biol Chem. 1987;262(22):10422–5. 9. Olsen JV, Blagoev B, Gnad F, Macek B, Kumar C, Mortensen P, Mann M. 37. Cooper JA, Sefton BM, Hunter T: [42] Detection and quantification of Global, in vivo, and site-specific phosphorylation dynamics in signaling phosphotyrosine in proteins. Methods Enzymol 1983, 99(99):387. networks. Cell. 2006;127(3):635. 10. Johnson LN. The regulation of protein phosphorylation. Biochem Soc Trans. 38. Levy ED, Michnick SW, Landry CR. Protein abundance is key to distinguish 2009;37(4):627–41. promiscuous from functional phosphorylation based on evolutionary information. Philos Trans R Soc Lond Ser B Biol Sci. 2012;367(1602):2594– 11. Malik R, Nigg EA, Körner R. Comparative conservation analysis of the human mitotic phosphoproteome. Bioinformatics. 2008;24(12):1426. 12. Chen SC, Chen FC, Li WH. Phosphorylated and nonphosphorylated serine 39. Huttlin EL, Jedrychowski MP, Elias JE, Goswami T, Rad R, Beausoleil SA, Villén and threonine residues evolve at different rates in mammals. Molecular J, Haas W, Sowa ME, Gygi SP. A tissue-specific atlas of mouse protein Biology & Evolution. 2010;27(11):2548–54. phosphorylation and expression. Cell. 2010;143(7):1174. 13. Boekhorst J, Breukelen BV, Heck AJ, Snel B. Comparative 40. Xiao Q, Miao B, Jie B, Zhen W, Li Y. Corrigendum: prioritizing functional phosphoproteomics reveals evolutionary and functional conservation of phosphorylation sites based on multiple feature integration. Sci Rep. phosphorylation across eukaryotes. Genome Biol. 2008;9(10):R144. 2016;6:24735. Miao et al. BMC Genomics (2018) 19:431 Page 9 of 9 41. Shen N, Wang Z, Ge D, Zhang G, Li Y. Prediction of functional phosphorylation sites by incorporating evolutionary information. Protein & Cell. 2012;3(9):675–90. 42. Beltrao P, Albanèse V, Kenner LR, Swaney DL, Burlingame A, Villén J, Lim WA, Fraser JS, Frydman J, Krogan NJ. 
Systematic functional prioritization of protein posttranslational modifications. Cell. 2012;150(2):413. 43. Hedges SB, Marin J, Suleski M, Paymer M, Kumar S. Tree of life reveals clock-like speciation and diversification. Molecular Biology & Evolution. 2015;32(4):835–45. 44. Sievers F, Wilm A, Dineen D, Gibson TJ, Karplus K, Li W, Lopez R, Mcwilliam H, Remmert M, Söding J. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal omega. Mol Syst Biol. 2011;7(7):539. 45. Sonnhammer EL, G Ö: InParanoid 8: orthology analysis between 273 proteomes, mostly eukaryotic. Nucleic Acids Res 2015, 43(Database issue): 234–239. 46. Pupko T, Pe'Er I, Shamir R, Graur D. A fast algorithm for joint reconstruction of ancestral amino acid sequences. Molecular Biology & Evolution. 2000; 17(6):890. 47. Haim A, Osnat P, Adi DF, Ofir C, Gina C, Oren Z, Tal P. FastML: a web server for probabilistic reconstruction of ancestral sequences. Nucleic Acids Res. 2012;40:Web Server issue):580–4. 48. Peng K, Radivojac P, Vucetic S, Dunker AK, Obradovic Z. Length-dependent prediction of protein intrinsic disorder. Bmc Bioinformatics. 2006;7(1):1–17. 49. Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010; 38(16):e164. 50. Tal Pupko REB, Mayrose I, Glaser F, Ben-Tal N. Rate4Site: an algorithmic tool for the identification of functional regions in proteins by surface mapping of evolutionary determinants within their homologues. Bioinformatics. 2002; 18:S71–7.
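
Illustrative sketch of the bootstrap enrichment test
The bootstrap procedure described under "Enrichment of ancestral state" in the Methods can be illustrated with a minimal Python sketch. This is not the authors' code. The function name `bootstrap_enrichment`, the input layout (lists of `(ancestral_residue, is_disordered)` tuples), the way the disordered fraction is balanced (resampling controls separately from disordered and ordered sites), and the one-sided empirical p-value are all assumptions made for illustration.

```python
import random


def bootstrap_enrichment(phospho, control, residue, n_boot=1000, seed=0):
    """Estimate enrichment of one ancestral residue among phosphosites.

    phospho, control: lists of (ancestral_residue, is_disordered) tuples
    (hypothetical input layout, not taken from the paper).
    Returns (ratio, p_value).
    """
    rng = random.Random(seed)

    # Split the control sites by structural context so that each bootstrap
    # sample can match the disordered/ordered composition of the phosphosites.
    ctrl_dis = [r for r, d in control if d]
    ctrl_ord = [r for r, d in control if not d]

    n = len(phospho)
    n_dis = sum(1 for _, d in phospho if d)
    n_ord = n - n_dis
    if n == 0:
        raise ValueError("no phosphosites given")
    if (n_dis and not ctrl_dis) or (n_ord and not ctrl_ord):
        raise ValueError("control set cannot match the disorder composition")

    # Observed fraction of phosphosites whose ancestral state is `residue`.
    observed = sum(1 for r, _ in phospho if r == residue) / n

    ctrl_fracs = []
    for _ in range(n_boot):
        # Resample controls with replacement, balanced for disorder.
        sample = rng.choices(ctrl_dis, k=n_dis) if n_dis else []
        sample += rng.choices(ctrl_ord, k=n_ord) if n_ord else []
        ctrl_fracs.append(sum(1 for r in sample if r == residue) / n)

    mean_ctrl = sum(ctrl_fracs) / n_boot
    ratio = observed / mean_ctrl if mean_ctrl > 0 else float("inf")

    # One-sided empirical p-value with a +1 correction (an assumption here).
    p_value = (1 + sum(f >= observed for f in ctrl_fracs)) / (n_boot + 1)
    return ratio, p_value
```

With this layout, a call such as `bootstrap_enrichment(young_sites, young_controls, "D")` would return the enrichment ratio and empirical p-value for aspartate as the ancestral state, where `young_sites` and `young_controls` are hypothetical lists built from the ancestral reconstruction and the disorder prediction.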

Journal

BMC Genomics, Springer Journals

Published: Jun 4, 2018
