Background: Mitogen-activated protein kinase (MAPK) cascades play critical roles in almost every aspect of plant growth and development and regulate many physiological and biochemical processes. MAPK kinases (MAPKKs, or MKKs) are the middle nodal point of these cascades. Although the evolution of MKKs has been analyzed in individual plant families, their evolutionary history across plants as a whole is still unclear. Results: To better understand the evolution and function of plant MKKs, we performed a systematic molecular evolutionary analysis of the MAPKK gene family and surveyed gene organization, sequence features and expression patterns in the different subfamilies. Phylogenetic analysis showed that plant MAPKKs fall into five groups (Groups A–E). In Groups B, C and D, most orthology groups were single- or low-copy genes in all plant species analyzed, whereas group A MKKs underwent several duplication events that generated multiple gene copies. Further analysis showed that these duplication events resulted from whole genome duplications (WGDs) in plants and that the duplicated genes may have undergone functional divergence. We also found that group E MKKs, which carry a mutation changing one serine or threonine that might lead to inactivity, originated through ancient tandem duplications in monocots. Moreover, we identified MKK3 proteins with an integrated NTF2 domain that might have gradually lost its cytoplasmic-nuclear trafficking activity, which suggests that gene function became increasingly sophisticated during evolution. Expression analyses further indicated that plant MKK genes probably play roles in UV-B signaling. Conclusion: In general, ancient gene and genome duplications contributed significantly to the expansion of the plant MKK gene family. Our study reveals two distinct evolutionary patterns for plant MKK proteins and sheds new light on the functional evolution of this gene family.
Keywords: MAPK kinase, Gene duplication, Evolution, Gene expression, NTF2 domain, Docking site

Jiang and Chu BMC Genomics (2018) 19:407

* Correspondence: email@example.com
Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, Shanghai, China
Shanghai Chenshan Plant Science Research Center, Chinese Academy of Sciences, Shanghai, China

© The Author(s). 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Background

Plants have evolved intricate mechanisms to respond to various perturbations or stresses from the external environment in order to maintain normal growth and development. Plants cannot move to avoid adverse environments and can only adapt to abiotic or biotic stresses in situ. Therefore, plants have developed a set of precise regulatory mechanisms to perceive cues from the environment, to transduce and amplify the signal, and to respond to stress at the molecular, cellular and physiological levels. Protein modification, such as phosphorylation, is a primary mode of signal transduction. Protein kinases execute diverse signaling cascades at the post-translational level by catalyzing the addition of phosphate groups to serine and threonine/tyrosine residues in their target proteins, which switches downstream stress-responsive genes on or off in both prokaryotic and eukaryotic cells. A tripartite mitogen-activated protein kinase (MAPK) signaling cascade is a sophisticated multienzyme complex that is highly conserved in evolution, and a fundamental signal transduction pathway that plays important roles in stress resistance, hormonal responses and developmental regulation [3–5]. A canonical MAPK cascade is composed of three sequentially acting protein kinases. The MAPKK kinase (MAPKKK or MEKK) phosphorylates and activates the MAPK kinase (MAPKK, MEK or MKK) on the S/T-X(3–5)-S/T motif located in its phosphorylation region. Subsequently, the activated MKK phosphorylates the MAPK (also called MPK) at the threonine and tyrosine residues of the TXY motif located in the MAPK activation loop [3, 6]. Moreover, scaffold proteins, shared docking domains and adaptor or anchoring proteins can mediate the formation and integrity of a specific MAPK cascade [7–9]. Numerous reports have indicated that MAPK cascades are connected with signaling pathways activated by many critical environmental factors, such as salinity, low temperature, drought, heavy metals, high pH, biotic stresses and plant hormone responses [5, 10].

With the completion of genome sequencing in a variety of plants, many members of MAPK cascades have been annotated. MAPK protein was originally identified in the model plant Arabidopsis thaliana. Subsequent comprehensive bioinformatics analyses have identified these protein families in plant genomes, including A. thaliana, Oryza sativa, Brassica napus, apple, Brachypodium distachyon, and so on. However, these studies have been limited to one or a few plants, and this situation has not changed. To date, knowledge about the function of MAPK cascades, especially of MKKs, the middle nodal point of the cascade, is rather limited in several species, including Arabidopsis. MKKs form the smallest family in the cascade, with only about half as many members as MAPKs, so an MKK may be able to activate more than one MAPK protein, which indicates a large number of combinatorial possibilities for forming specific MAPK cascades. The 10 MKKs encoded in the Arabidopsis genome, one of which (MKK10) has a mutation in the conserved S/T-X5-S/T motif, can be classified into four groups (A to D) based on their evolutionary relationships. MKK1 and MKK2, which belong to group A in Arabidopsis, can activate downstream MPK4 in response to cold and salinity and also mediate innate immunity responses. Moreover, MKK1 also mediates H2O2 metabolism via MPK6-coupled signaling in Arabidopsis. MKK6 was found to activate MPK13 in mutant yeasts, and NQK1, an MKK6 ortholog in tobacco, is involved in the formation of the cell plate at late M-phase of cytokinesis. MKK3, the only member of group B, has a nuclear transport factor 2 (NTF2) domain in its C-terminal extension, a domain that mediates nuclear import of RanGDP, and plays a pivotal role in the jasmonate (JA)-mediated signal transduction pathway and in blue light-mediated signaling. MKK4 and MKK5, the group C members, have been shown to act in stress responses and ABA responses. With the exception of MKK7 and MKK9, which have been reported to be related to polar auxin transport and ethylene signaling pathways, respectively [26, 27], there is little information about the function of other group D MKKs, which may be due to their low transcript levels. Previously, it was suggested that MKK10 has no biological function because of the mutation in its active site. However, a recent report suggested that OsMKK10–2 plays a crucial role in the salicylic acid (SA) signal-mediated pathogen defense response.

So far, studies of the evolutionary history of MAPK cascades in plants have used a limited number of plant species. Some studies using MAPK sequences from several representative plants revealed a phylogeny covering > 800 million years of evolution and showed novel activation loop variants. Gene duplication is a basic source of new genes in evolution, providing novel opportunities for evolutionary success, and whole genome duplication (WGD) is one of its important evolutionary features. Evidence for at least one WGD has been detected in most plant taxa, and A. thaliana has undergone at least two rounds of WGD, which are assumed to have taken place approximately 60–70 and 23–43 Myr ago [32, 33]. A WGD doubles all genes [31, 32]. After a WGD event, plants remove most redundant gene copies over a long evolutionary process, but some duplicated copies are retained depending on their contribution to environmental adaptation [32, 34], which causes the number of members of multigene families to vary between plant species.

In contrast to MAPKs, the evolution of MKKs across different plant species has not been reported. Fortunately, the availability of plant genome sequences allowed us to investigate the evolutionary history of MKKs from different plant species. In this study, we identified MAPKK genes from major plant lineages, examined their structural information, and performed phylogenetic analyses. Our analyses provide a comprehensive overview of MKK phylogenetics and show two distinct evolutionary patterns in different MKK groups, as well as a functional evolution of MKK3 that differs distinctly from that of the other groups. Expression patterns of these genes are analyzed in five taxa with various tissues and treatments, and the potential function of these genes is discussed.

Results

Identification of MAPKK genes in 51 plant species based on the conserved S/T-X5-S/T domain

In order to investigate the diversity and evolution of MAPKK protein architectures across plants, the MAPKK gene family was identified in 51 plants with publicly available genomes, chosen to cover the maximum number of species across the plant lineages, from unicellular eukaryotic algae to the angiosperm A. thaliana (Table 1). We found that the number of MAPKKs varied from species to species across the entire plant kingdom. In total, 365 non-redundant MKK sequences were retrieved (Table 1, Additional file 1). The tetraploid soybean genome contained the largest number of MAPKK genes, whereas the lower alga Volvox carteri contained only one. In addition, B. distachyon (12), Capsella rubella (11), Gossypium raimondii (11), Manihot esculenta (11), Populus trichocarpa (11) and Setaria italica (12) contained higher numbers of MKKs (Table 1, Additional file 1).

Table 1 Genome size of different plant species and number of MAPKK genes present per genome (species)
Group | Order | Species | Abbr. | Genome size (Mb) | Copy number | Database*
Angiosperm | Brassicales | Arabidopsis thaliana | At | 135 | 10 | PLAZA
 | | Brassica napus | Bn | 630 | 8 | NCBI
 | | Brassica oleracea | Bo | 630 | 10 | Oil Crops Genomics Database
 | | Brassica rapa | Br | 283.8 | 14 | PLAZA
 | | Capsella rubella | Cru | 134.8 | 11 | PLAZA
 | | Carica papaya | Cp | 135 | 9 | PLAZA
 | | Thellungiella parvula | Tp | 140 | 9 | PLAZA
 | Malvales | Gossypium raimondii | Gr | 761.4 | 11 | PLAZA
 | | Theobroma cacao | Tc | 330.8 | 8 | PLAZA
 | Sapindales | Citrus sinensis | Cs | 319 | 7 | PLAZA
 | Myrtales | Eucalyptus grandis | Eg | 691 | 6 | PLAZA
 | Malpighiales | Jatropha curcas | Jc | 410 | 10 | Jatropha Genome Database
 | | Manihot esculenta | Me | 760 | 11 | PLAZA
 | | Populus trichocarpa | Pt | 422.9 | 11 | PLAZA
 | | Ricinus communis | Rc | 400 | 6 | PLAZA
 | Fabales | Glycine max | Gm | 975 | 14 | PLAZA
 | | Lotus japonicus | Lj | 472 | 7 | PLAZA
 | | Medicago truncatula | Mt | 257.6 | 6 | PLAZA
 | | Phaseolus vulgaris | Pv | 521.1 | 9 | Phytozome
 | Cucurbitales | Citrullus lanatus | Cl | 425 | 6 | PLAZA
 | | Cucumis sativus | Csa | 203 | 6 | Cucurbit Genomics Database
 | | Cucumis melo | Cm | 375 | 6 | PLAZA
 | Rosales | Fragaria vesca | Fv | 240 | 5 | PLAZA
 | | Malus domestica | Md | 881.3 | 9 | PLAZA
 | | Prunus persica | Ppe | 451.9 | 10 | PLAZA
 | Vitales | Vitis vinifera | Vv | 487 | 5 | PLAZA
 | Solanales | Capsicum annuum | Ca | 3480 | 5 | Pepper Genome Database
 | | Solanum lycopersicum | Sl | 900 | 5 | PLAZA
 | | Solanum melongena | Sme | 1093 | 4 | Eggplant Genome DataBase
 | | Solanum tuberosum | St | 800 | 5 | PLAZA
 | Ericales | Actinidia chinensis | Ac | 616.1 | 9 | Kiwifruit genome sequence
 | Gentianales | Coffea canephora | Cc | 710 | 8 | Coffee Genome Hub
 | Caryophyllales | Beta vulgaris | Bv | 566.6 | 5 | PLAZA
 | | Dianthus caryophyllus | Dc | 622 | 8 | Carnation DB
 | Poales | Brachypodium distachyon | Bd | 272 | 12 | PLAZA
 | | Hordeum vulgare | Hv | 5100 | 6 | PLAZA
 | | Musa acuminata | Ma | 472 | 9 | PLAZA
 | | Oryza sativa | Os | 372 | 8 | PLAZA
 | | Setaria italica | Si | 405.7 | 12 | PLAZA
 | | Sorghum bicolor | Sb | 697.5 | 7 | PLAZA
 | | Zea mays | Zm | 2500 | 8 | PLAZA
 | Amborellales | Amborella trichopoda | Atr | 748 | 6 | PLAZA
Gymnosperm | Coniferales | Picea abies | Pa | 1960 | 7 | Spruce genome project
Pteridophyte | Selaginellales | Selaginella moellendorffii | Sm | 212.5 | 3 | PLAZA
Bryophyte | Funariales | Physcomitrella patens | Pp | 480 | 7 | PLAZA
Algae | Chlamydomonadales | Chlamydomonas reinhardtii | Cr | 130 | 2 | PLAZA
 | | Volvox carteri | Vc | 125.4 | 1 | PLAZA
 | Mamiellales | Micromonas pusilla | Mp | 22 | 2 | PLAZA
 | | Ostreococcus lucimarinus | Ol | 13.2 | 1 | PLAZA
 | Naviculales | Phaeodactylum tricornutum | Pti | 27.4 | 1 | PLAZA
 | Laminariales | Saccharina japonica | Sj | 537 | 2 | Saccharina Genome Project
Vertebrates | Primates | Homo sapiens | Hs | 2996.43 | 7 | UCSC Human Gene
Fungi | Saccharomycetales | Saccharomyces cerevisiae | Sc | 12.1249 | 4 | SGD
 | Schizosaccharomycetales | Schizosaccharomyces pombe | Sp | 12.5913 | 3 | PomBase
*Database websites: PLAZA, http://bioinformatics.psb.ugent.be/plaza/; NCBI, https://www.ncbi.nlm.nih.gov/; Phytozome, https://phytozome.jgi.doe.gov/pz/; Pepper Genome Database, http://peppersequence.genomics.cn/page/species/index.jsp; Jatropha Genome Database, http://www.kazusa.or.jp/jatropha/; Cucurbit Genomics Database, http://cucurbitgenomics.org/; BRAD, http://brassicadb.org/brad/index.php; Kiwifruit genome Database, http://bioinfo.bti.cornell.edu/cgi-bin/kiwi/home.cgi; Eggplant Genome DataBase, http://eggplant.kazusa.or.jp/; Coffee Genome Hub, http://coffee-genome.org/; Carnation DB, http://carnation.kazusa.or.jp/; Spruce genome project, http://congenie.org/citation; SGD, http://www.yeastgenome.org/; UCSC Human Gene, http://genome.ucsc.edu/; PomBase, http://www.pombase.org/

Phylogenetic classification of MAPKKs into five subfamilies

To elucidate the evolutionary history of plant MAPKK proteins, a comprehensive phylogenetic analysis of all 365 protein sequences identified from the 51 sequenced plant species, plus the MKKs of Homo sapiens and two yeasts, was performed using maximum likelihood (ML) methods. According to the phylogenetic analyses and subcellular localization prediction results, the plant MAPKKs can be divided into five main clades, designated A, B, C, D and E (Fig. 1; Additional file 1), which is inconsistent with what was reported for Arabidopsis and rice MKKs. Groups A, B and C contain MKK1/2/6, MKK3 and MKK4/5, respectively. It is noteworthy that the MKK7/8/9/10 sequences were previously reported to belong to group D, whereas our phylogenetic analyses, together with the OrthoMCL analyses, suggest that they should be divided into group D (MKK7/8/9; OG5_150257) and group E (MKK10; OG5_211998) (Additional file 2). Moreover, six MKKs of lower algae (CrMKK2, MpMKK6, OlMKK6, PtiMKK1, SjMKK1 and VcMKK6) were on the same evolutionary branch as the human and yeast MKKs and the group B MKKs, suggesting that ancient MKKs existed before the split of animals and plants (Fig. 1). Although the six algal MKKs were not on the same evolutionary branch as the group A MKKs, they were highly homologous to their corresponding Arabidopsis MKKs. We therefore still named them after their Arabidopsis orthologs, for example CrMAPKK2. We defined 31 orthology groups in the five subfamilies (6, 5, 7, 7 and 6 in groups A, B, C, D and E, respectively). Interestingly, two groups (A and B) had genes starting from chlorophytes, indicating that more than one MAPKK gene might have arisen prior to the split of green algae and land plants. In contrast, groups C and D had genes starting from gymnosperms and mosses, respectively. Group D had members from land plants including mosses, ferns, gymnosperms and eudicots but not monocots, which suggests that it was lost in monocots after the divergence between monocots and dicots. Group C contained members only from gymnosperms and angiosperms, implying that it originated in the common ancestor of spermatophytes. Furthermore, most orthology groups in group B MKKs were maintained as single-copy genes in each species.

Fig. 1 Maximum likelihood phylogenetic tree of all plant MAPKKs. Phylogenetic analysis was carried out with the protein sequences of 365 MAPKK proteins from the 51 plant species identified in this study

Common conserved domain compositions and genomic analysis within plant MAPKK groups

To get a better overview of the characteristics of the different plant MAPKKs, we further analyzed their sequence features. As expected, our motif organization analyses using the InterProScan database revealed that putative plant MKK proteins in the same group contained common motifs, suggestive of functional similarities within each group. As illustrated in Fig. 2a, all groups contain an ATP binding site (IPR017441), a protein kinase domain (IPR002290) and an active site (IPR008271). Interestingly, group B MKKs have a Nuclear transport factor 2 domain (NTF2; IPR018222) near the carboxyl terminus, which is involved in the nuclear import of cargo proteins. These data provide clues to the potential function of MKK proteins, including the idea that the group B (MKK3) protein might participate in cytoplasmic-nuclear trafficking.

Fig. 2 Gene structure and sequence features of conserved MAPKK genes. (a) Gene structure and protein motifs. The structure of an A. thaliana gene (indicated on the left) is shown as an example for each group (in parentheses on the left). Protein motifs are shown as colored boxes, whereas introns of different phases are shown as colored vertical lines. Protein motif architectures of the full-length proteins were drawn based on an InterPro search. IPR002290 indicates the protein kinase motif; IPR017441 and IPR008271 indicate the ATP binding site and active site, respectively. The exons are drawn to scale. (b) Sequence features shown as web logos representing the conserved S/T-X5-S/T motif and the active site D(I/L/V)K motif of each group. The red and green stars indicate residues of functional or structural importance based on phylogenetic conservation. Logos were generated using the WebLogo 3 application (http://weblogo.threeplusone.com/). (c) Multiple sequence alignment of the conserved S/T-X5-S/T motif and active site D(I/L/V)K motif portions of plant MAPKKs. The green star shows the active site and the red star indicates the phosphorylation site of MAPKK proteins in different plant species. Species information can be found in Additional file 1

To understand the possible gene structural relationships among MKK orthologues and paralogues, we also analyzed the exon/intron organization of the different MAPKK genes using the GSDS software. Within each group, most members displayed a similar exon/intron organization in terms of intron number, exon length and intron phase. Notably, more similarities were observed in conserved regions such as the protein kinase domain and active site (Fig. 2a). The number of introns varied from 0 to 16 and diverged widely among groups (Fig. 2a). Group A MKKs had between 4 (DcMAPKK6–2 and PaMAPKK6–2) and 14 (AtrMAPKK2) introns, most often 7 (60%), while group B MKKs had between 2 (PaMAPKK3) and 16 (AtrMAPKK3) introns, most often 8 (68%; Additional files 1 and 3). Remarkably, most group C, D and E MKKs are intronless (Fig. 2a, Additional files 1 and 3).

Next, we examined the molecular weight (Mw) and isoelectric point (pI) of the different MAPKK proteins using the online Compute pI/Mw tool. The Mw of MAPKK proteins varied from 11.276 (FvMKK8) to 146.588 (TcMKK2) kDa and the pI varied from 5.06 (ZmMKK3–2) to 10.15 (VvMKK5) (Additional file 4). The pI values of group D and E MKKs were in the basic range, while those of group A and group B MKKs ranged from acidic to slightly acidic and those of group C MKKs were slightly alkaline. The average amino acid composition of MAPKK proteins ranged from 0.85 (tryptophan) to 10.38 (leucine) (Additional file 5). Remarkably, the average abundances of serine and threonine, the most important amino acids (S/T-X5-S/T) involved in the phosphorylation function of MKKs, were 8.70 and 4.16, respectively. In addition, the average abundance of hydrophobic amino acids in MAPKKs, such as alanine (6.51), isoleucine (5.85), leucine (10.38), proline (6.35) and valine (6.35), was relatively higher than that of other amino acids (Additional file 5).

Previous studies have identified several consensus sequences associated with structural or functional roles, including the S/T-X5-S/T motif in the activation loop, the docking site ((K/R)2–3-X(1–5)-L/I-X-L/I) in the N-terminal domain [35, 36], the GxGxxGxV motif in the nucleotide binding domain (NB domain) and the HK-X -ALK motif in the ATP binding site, and the D(I/L/V)K motif in the active site. Specifically, the serine or threonine in the S/T-X5-S/T motif, as the phosphorylation site, plays a pivotal role in the signal transduction pathway. To further identify consensus sequences that might be characteristic of each phylogenetic group, we performed sequence logo analyses for each group using the WebLogo 3 online tool, illustrating the sequences of the ATP binding site and phosphorylation site of the protein kinase domain (Fig. 2b; Additional file 6). For example, Fig. 2b shows the sequence information of the active site D(I/L/V)K motif and the phosphorylation site S/T-X5-S/T motif from each group. The stars indicate residues that are important to the function of MAPKKs. In all groups, the active site D(I/L/V)K motif was highly conserved despite occasional variations containing a novel active site, for instance DFK (MdMAPKK2 and FvMAPKK1) or DMK (BvMAPKK6), which suggests that most MAPKKs have kinase activity. Furthermore, we observed an atypical phosphorylation site S/T-X5-S/T motif in group E MAPKK proteins, suggesting that these MAPKKs may not be biologically functional, given the earlier view that a mutation changing the serine or threonine abolishes kinase activity (Fig. 2c; [10, 37]). Hence, most group E MAPKKs (MKK10) may constitute a new group of inactive MAPKKs, although this needs to be validated by experimental tests of kinase activity. It is noteworthy that group E MAPKKs existed only in angiosperms, especially in monocots, where the genes have clearly expanded.

In addition, plant MKKs possessed a highly conserved N-terminal extension that displayed a putative MAPK docking site (D-site), K/R-K/R-K/R-X(1–5)-L/I-X-L/I. During MAPK signaling, a short D-site, which serves as a common complementary region for recognition on the MAPK, promotes the ability of MKKs to recognize their cognate MAPKs. The MKKs contained group-specific conserved docking sites: K/R-K-K-X(1–5)-L-K-L/V (group A), L-K/R-K-K-L-X-P-L (group B), R-X-R-R/K-R-X -L-X-L-X-L (group C), R-X-R-R-X(1–3)-L-X-L (group D) and R-X-R-R-X(1–4)-L-X-L/I (group E) (Additional file 7). The presence of group-specific D-sites suggests that the targets of different MAPKKs are group specific, although the D-sites of group D and E MKKs are unusually similar, which might reflect their closer evolutionary relationship. We can therefore conclude that the different MAPKKs evolved as orthologs in a group-specific manner.

Gene number variation of the MKK gene family

The phylogenetic analyses showed that 125, 53, 70, 75 and 42 MAPKKs fall into groups A, B, C, D and E, respectively (Fig. 3). In non-seed plants, only 19 MKK genes in total were identified, including three in S. moellendorffii, seven in P. patens and the others in six different algae.
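The consensus motifs quoted above for the phosphorylation site, the active site and the general D-site can be written as regular expressions, which makes it easier to see what each shorthand allows. The following is a minimal sketch under stated assumptions and not the pipeline used in the study: the names `MOTIFS` and `find_motifs` and the toy peptide are hypothetical, and X is taken to mean any single residue. Because a pattern such as S/T-X5-S/T is short, it will match in many places in a real protein, so a hit by itself does not locate the activation loop.

```python
import re

# Regex renderings of consensus motifs quoted in the text.
# Dictionary keys are illustrative names, not identifiers from the paper.
MOTIFS = {
    "phosphorylation_site": r"[ST].{5}[ST]",    # S/T-X5-S/T
    "active_site": r"D[ILV]K",                  # D(I/L/V)K
    "docking_site": r"[KR]{3}.{1,5}[LI].[LI]",  # K/R-K/R-K/R-X(1-5)-L/I-X-L/I
}


def find_motifs(seq):
    """Return {motif name: [(0-based start, matched text), ...]} for one sequence."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # Wrapping the pattern in a lookahead reports overlapping matches too.
        scanner = re.compile(f"(?=({pattern}))")
        hits[name] = [(m.start(), m.group(1)) for m in scanner.finditer(seq.upper())]
    return hits


# Hypothetical toy peptide for illustration only; it is not a real MKK sequence.
toy = "MKRKPLTLAADIKGSVANTFTGTAA"
hits = find_motifs(toy)

# Replacing the first S/T of the motif mimics the kind of substitution
# described for group E proteins: the canonical pattern no longer matches.
atypical = find_motifs(toy.replace("SVANTFT", "AVANTFT"))
```

In this toy case `hits` holds one match per motif, while `atypical["phosphorylation_site"]` is empty. The group-specific D-sites could be added to `MOTIFS` in the same way where their spacer lengths are known.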
In flowering plants, the five MKK gene subfamilies were found in most species. The exceptions were B. oleracea, F. vesca and S. bicolor, in which group B MKK genes were not detected, and E. grandis and F. vesca, in which group C MKK genes were not detected. Incomplete annotation of the genome sequences may be the reason why group B or C MKK genes were not found in these species, because all other angiosperms have these genes. Except for MKK10 (group E), which was found only in angiosperms, the remaining MAPKKs were found in most species we surveyed. Our analyses showed that the canonical group D MKKs were absent in monocots, whereas the atypical MAPKK (group E) existed in both monocots and dicots. It is noteworthy that the number of group-specific members in dicots is restricted to 0–2 genes, while monocots had strikingly large numbers, which might be associated with the WGDs of plant species during evolution (Fig. 3).

Fig. 3 Phylogenetic relationships between the 51 plant species investigated in this study. The total number of MAPKK proteins and the number in each group identified in each plant genome are indicated on the right. The phylogenetic tree is modified from Phytozome (http://www.phytozome.net/). Red and blue stars indicate whole-genome duplication and triplication, respectively

Across the 51 complete plant genome sequences, MKK gene numbers were highly variable, ranging from one in the algae O. lucimarinus, P. tricornutum and V. carteri to 14 copies in B. rapa and G. max (Table 1; Fig. 3). In many plant lineages, the increase in MAPKK proteins was associated with polyploidy or ancient polyploidization events (Fig. 3). For instance, there were 14 MKK genes in the recently duplicated G. max genome, in which two rounds of WGD have occurred. Similarly, the increased number of MAPKK genes identified in grasses was probably attributable to three ancient polyploidization events in their evolutionary history. A remarkable exception was maize (Zea mays), which possessed only 8 MAPKK proteins even though WGD events have occurred. However, these events were probably earlier, resulting in the loss of more duplicated genes (Fig. 3). An abnormally low number of MKK genes was identified in eggplant (S. melongena, 4 MKK genes), for which we cannot find an explicit explanation. Further analysis showed that copy number variation among plant species was mainly attributable to differences in group A MKKs (Fig. 3), while copy numbers were almost constant in the other groups. For instance, only one copy was detected in most plants in groups B and E (Fig. 3).

Molecular evolutionary analyses

The genetic distance between group D and group C MKKs was 0.924, the lowest among all pairs of groups, indicating higher sequence similarity between group D and C MKKs (Table 2), which is consistent with their evolutionary history and the OrthoMCL analysis (Additional file 2). Group B MKKs showed a higher genetic distance to group E than to any other group, indicating that group B MKKs are less similar to group E than to other MKKs at the sequence level. On the other hand, the genetic distances between groups did not differ significantly, indicating that sequence divergence between MKK groups was not significant. In addition, the overall mean distance of MAPKKs was 1.109 (standard error 0.0602).

Table 2 Genetic distance between different groups of MAPKKs calculated based on the amino acid sequences with the Jones–Taylor–Thornton (JTT) model
Group | A | B | C | D
A | | | |
B | 1.225 | | |
C | 1.202 | 1.365 | |
D | 1.294 | 1.429 | 0.924 |
E | 1.452 | 1.628 | 1.111 | 0.996

To elucidate the evolutionary basis of the functional diversification of each MKK group, we comprehensively evaluated the nonsynonymous-to-synonymous rate ratio (ω = dN/dS) (Table 3) under different codon substitution-based evolutionary models. The mean ω values were low for both group B (ω = 0.173) and group A (ω = 0.228) MKKs, reflecting strong purifying selection during the evolution of these two groups. However, the mean ω values of group C, D and E MKKs were greater than 1, suggesting that these MKK genes were under positive selection during evolution. Group E and E☆ MKKs had roughly the same ω values even after the eudicot sequences were removed, indicating that positive selection during group E MKK evolution was contributed by monocots, while group A and B MKKs showed opposite results (Table 3). Moreover, to evaluate the difference between whole sequences and functional motifs, we also investigated the ω values using the motif sequences (Additional file 8) under different codon substitution-based evolutionary models. The mean ω values were equally low for all MKK groups when we used the sequences of the D(I/L/V)K motif including the S/T-X5-S/T motif, reflecting strong purifying selection and indicating that the D(I/L/V)K and S/T-X5-S/T motifs were highly conserved throughout the evolution of all MKK groups (Additional file 8). Interestingly, except for group C MKKs, the analyses of the D-site and of the NB domain and ATP binding site gave similar results (Additional file 8). Overall, molecular evolutionary analyses using the motif sequences revealed that the differences in ω values among MKK groups were contributed by the sequences outside the motifs, which indicates that plant MKKs, and especially their functional domains, are highly conserved.

Table 3 Molecular evolutionary analysis of the MAPKK genes using their whole cDNA sequences
Group | N | Ka | Ks | ω | G + C content
A | 125 | 0.17697 | 0.77494 | 0.228 | 0.442
A* | 109 | 0.17783 | 0.76908 | 0.231 | 0.418
B | 52 | 0.12819 | 0.73971 | 0.173 | 0.436
B* | 42 | 0.13438 | 0.63808 | 0.211 | 0.437
C | 69 | 0.33292 | 0.17082 | 1.949 | 0.563
C* | 56 | 0.31983 | 0.23440 | 1.364 | 0.531
D | 75 | 0.37734 | 0.10843 | 3.48 | 0.518
E | 42 | 0.39775 | 0.28353 | 1.403 | 0.591
E☆ | 21 | 0.30844 | 0.22131 | 1.394 | 0.711
N, number of sequences; Ka, the number of nonsynonymous substitutions per nonsynonymous site; Ks, the number of synonymous substitutions per synonymous site; ω, Ka/Ks. ☆Sequences from dicots were excluded. *Sequences from monocots were excluded

Multiple duplication events were identified in each group

Gene duplication, which gives rise to gene families, is the major force in the evolution of different species. To further understand the duplication and evolution of the MAPKK genes, we investigated the duplicated genes in each plant genome for each orthology group (Additional file 9). The number of duplicated genes, known as paralogs, represents the size of a gene family. Compared with species with non-duplicated genomes, plants with duplicated genomes are more likely to have duplicated MAPKK genes. Thus G. max, B. rapa and B. distachyon possess more duplicated genes. To further examine the evolution and duplication of the group A MAPKK genes, we reconstructed phylogenetic trees using MEGA 6.0 with additional sequences from other plants. As shown in Fig. 3, all six representative algae we surveyed contain one or two copies, but terrestrial plants such as the moss P. patens have three or more copies, suggesting that duplications likely took place after green algae moved onto land. As described above, group A MKK genes have more copies than the other groups (Fig. 3). Group A MKK genes in land plants can therefore be further divided into six subgroups based on their phylogenetic relationships, namely subgroups A1, A2, A3, A4, A5 and A6 (Additional file 10). Of these six subgroups, subgroup A6 contained genes from moss, subgroups A3 and A4 contained genes from monocots, and subgroups A1, A2 and A5 contained genes from eudicots. Within the subgroup A2–A5 clades, there were several independent duplications in grasses and dicots, respectively, indicating that the duplication events may have happened in the ancestor of the corresponding lineage (Additional file 10).

To explore the cause of these duplication events, we searched for possibly syntenic genomic regions containing the MAPKK genes. Most duplicate gene pairs occurred in syntenic genomic regions, suggesting that these multiple gene copies were caused by whole genome or segmental duplications (Fig. 4; Additional files 11 and 12). In particular, the duplication events of group A MKK1, MKK2 and MKK6 took place in the common ancestor of core eudicots, corresponding to the γ WGD (Fig. 4). The two duplications of group A MKKs in Brassicaceae were associated with the α/β WGDs within the Brassicaceae lineage (Additional file 11). As WGD events have previously been studied for many species, we collected these data and the paleopolyploidy histories (Fig. 3). We then assessed the influence of the duplication events on the size of group A MAPKKs. The comparison implied that WGDs might be conducive to the expansion of the orthology groups in some species. To further confirm the role of genome duplication in gene family expansion, we carried out a correlation analysis between the number of rounds of genome duplication and the number of MAPKK genes. The correlation coefficient was 0.549 (P < 0.01) for the group A MAPKKs. The results suggest that WGD contributed significantly to the expansion of group A MKKs.

Fig. 4 Examples of the detailed locations of representative pairs of genes duplicated in recent polyploidy events in the syntenic regions. At, A. thaliana; Bd, B. distachyon; Br, B. rapa; Gm, G. max; Os, O. sativa; Pt, P. trichocarpa; Pv, P. vulgaris; Sb, S. bicolor; Vv, V. vinifera; Zm, Z. mays; chr, chromosome

As mentioned above, group E MKK genes from angiosperms carry a mutation in the phosphorylation site S/T-X5-S/T motif. In angiosperms, group E MKKs can be further divided into six subgroups, namely subgroups E1–E6 (Additional file 13). Of these six subgroups, subgroups E1–E3 and E4–E6 contained genes from eudicots and monocots, respectively. Moreover, group D MKKs, which were found in land plants, did not contain genes from the monocot lineage. Group E MKKs displayed a conserved evolutionary pattern in eudicots, with one gene copy, which is consistent with the fact that these genes carry a mutation in a key active site. In monocots, by contrast, the number of these MKK genes increased markedly to 2–5 copies. Further syntenic analyses revealed that the multiple MKK10 gene copies in monocots were caused by tandem duplication in the ancestor of the corresponding lineage (Fig. 5; Additional file 14). The members of these duplications were often in the same orthology groups. These duplicated genes were still in tandem in three species we analyzed, and they were even multiplied during the segmental duplications of B. distachyon, S. italica and S. bicolor. Between the two MKK members, an Exportin-2 gene was often found (Fig. 5). These results suggest that the tandem MKK10 gene cluster in Poaceae originated in the common ancestral genomic context that gave rise to modern monocot MKK genes.

Besides group A and E MKKs, we also analyzed the phylogeny of the other MKK groups (Additional files 15, 16 and 17). Group C MKKs contained genes from seed plants, whereas group B MKKs contained genes from green plants. Group B MKKs showed a conserved evolutionary pattern with one or few gene copies in all species. As a result, we can speculate that group B MKKs might have conserved and ancient functions that descended from the same ancestor of green plants.

Brassicaceae maybe contributed by segmental duplica-
Even so, some recent duplication tions or other forms. events also occurred. Within the group C and D MKKs, there was also one duplication event in Brassicaceae Expression profile of plant MKK genes indicates potential lineage (Additional files 16 and 17). Deeper syntenic roles in flower tissues and UV-B signaling analyses indicated that the group C MKK gene copies in A previous study showed that MKK8 and MKK10 have Brassicaceae arise from WGD in the ancestor of this not expressed in shoot apices (including flower buds), family (Additional file 18), while group D MKKs in mesophyll cells, mature leaves  and flowers Fig. 5 Evolutionary reconstruction analyses of the fate of an ancestral locus having Group E MKK genes in tandem position. Red arrowheads indicate MKK genes; Green arrowheads indicate Exportin-2 genes; blue arrowheads indicate DMRL synthase genes Jiang and Chu BMC Genomics (2018) 19:407 Page 11 of 18 (including developing young siliques) . Based on the MKK genes are up-regulated upon UV-B treatment, in- expression data of Arabidopsis from PLEXdb (AT40), we cluding AtMKK4 and AtMKK5. These results indicated found that all Arabidopsis MKK genes are expressed that plant MKK genes may be involved in UV-B related sig- (Fig. 6a). Except AtMKK8, most MKK genes have tran- naling. To validate the expression profile of plant MKK scribed in the flowering stage. With the exception of genes in UV-B signaling, we performed the RT-qPCR ex- MKK10, the Arabidopsis MKKs have a consensus S/ periments on RNA collected from different time points of T-X -S/T in their activation domain. Even MKK10 have B. distachyon plants under UV-B treatments. The results an atypical S/T-X -S/T motif show detectable expres- revealed that many MKK genes are up-regulated upon sion, indicating that they possibly have a new function UV-B treatment, including BdMKK4 and BdMKK10–5,es- other than becoming silenced. 
It is interesting that the pecially BdMKK10–5 originally have low-level transcripts expression of AtMKK10 is much higher in pollen com- in B. distachyon tissue (Additional file 20). These results in- pared with other tissues. Arabidopsis MKKs genetically dicated that BdMKK genesmaybealsorespond theUV-B exhibit the crosstalk in a variety of signaling pathways, stress. In a word, we can speculate that plant MKK genes therefore, a single mkk mutant under normal growth play potential roles in UV-B related signaling. conditions is very difficult to obtain a specific phenotype . A previous study showed that the ap1–15, ap2–6, ap3–6, ag-12, clv3–7, lfy-12 and ufo-1 mutants involve Discussion with flower development, whereas AtMKK10 was down Evolutionary history and gene duplication of the MAPKK regulated in these mutants (Fig. 6b), consistent with in plant lineages their expression in the flower. CLV has a potential role Our phylogenies (Fig. 1) showed that the MAPKK in the regulation of meristem size and has little gene family undergone at least two ancient duplica- expressed in the center of the meristem [44, 45]. The tions that lead to the five subfamilies. There was at bam (CLV homologs) mutant exhibited rather weak phe- least one gene in group A and B before the split of notypes and regulated the male gametophyte develop- the green alga and terrestrial plants, while the group ment, leading to almost completely sterile plants . D MKKs took place before the divergence of the land Therefore, AtMKK10 which is regulated by CLV pro- plants. Numerous lineage-specific duplicate genes in teins may participate in the process of pollen develop- land plants were likely also caused by the WGDs or ment (Fig. 6c). segmental duplications. 
The first duplication event re- In rice, all except OsMKK10–1 and OsMKK10–3 are sulted in theappearanceofthe groupC MKK genes expressed in four reproductive stages from the micro- lineages gave arise to after the divergence of vascular array data (Additional file 19). Moreover, all maize and plants. Subsequently, the second duplication event poplar MKK genes are expressed in different tissues took place during the divergence of angiosperms (Additional file 19). However, the transcriptional levels which caused the appearance of group E MKK genes. in PtMKKs have a great difference, while maize is not After the separation of gymnosperms and angio- obvious (Additional file 19). To further validate the sperms, deep duplications occurred at least once in transcripts of B. distachyon MKK genes in different each group (Fig. 3). In angiosperms, it has been re- tissues, we tested five different tissues that were a ported that an ancient WGD event has produced part of Bd21 seedling, root, leaf, stem and young multigene copies . So we observed a lot of gene caryopsis, to analyze the tissue specific expression and/or genome duplications in group A MKK genes patterns of MKK genes (Additional file 20). The re- in angiosperms (Fig. 3 and Additional file 10), which sults revealed that all except BdMKK10–4 are is the same as to the patterns of rhomboid and SET expressed in five different tissues and BdMKK6 and gene families [46, 47]. In eudicots, most species pos- BdMKK10–2 have higher transcriptional levels in sess 4–8 MAPKK genes, apart from a few exceptions young caryopsis than other tissues (Additional file 20). It above 10, such as B. rapa, P. trichocarpa and G. is interesting that previous study show that MKK10 may max. Compared with their counterpart species, it has not be biologically functional on account of the lack part been shown that these species have undergone extra of the phosphorylation site, but OsMKK10–2 can mediate or more recent WGDs (Fig. 3). 
For example, Brassica the response to pathogen defense. Not as expected, MPK6 experienced an extra recent whole-genome triplication can be phosphorylated by OsMKK10–2 which have the (WGT) event in comparison to A. thaliana in Brassi- kinase activity . Moreover, we surveyed the gene ex- cales , compliance with the principle of roughly pression of MKK under abiotic stresses that public micro- twice as many MAPKK genes in B. rapa in compari- array data of Arabidopsis under wound, drought, son with the other species (Fig. 3). Similarly, WGDs genotoxic, cold, osmotic, high-salinity, heat, oxidative and maybe also conducive to the MAPKK gene diversifica- UV-B treatment were collected (Fig. 6d). Many plant tion in G. max (Fig. 3;). Jiang and Chu BMC Genomics (2018) 19:407 Page 12 of 18 Fig. 6 Expression and function analyses of Arabidopsis MKK genes. (a) Expression of Arabidopsis MAPKK genes. X-axis indicates representative tissues and developmental stages and Y-axis represents RPKM (reads per kilobase of mRNA length per million of mapped reads) value. Gene pairs resulted from recent duplications are shown in color. Duplicate genes in same color are paralogs from a recent duplication. (b) Expression of AtMKK10 in wild-type Columbia and ag-12, ap1–15, ap2–6, ap3–6, clv3–7, lfy-12 and ufo-1 mutants. The x-axis indicates different samples and y-axis indicates RPKM value. (c) A multiple sequence alignments of the key phosphorylation site of the MKK10 and other Arabidopsis Group A of MAPKK proteins. (d) Tiling array data of Arabidopsis MKK genes under drought, cold, high-salinity, heat, genotoxic, osmotic, oxidative and wounding and UV-B light treatment Nuclear transfer factor 2 integration into MKK3 occurred in nuclear trafficking . However, the overexpression alone in several plant lineages of NTF2 also hampers the nuclear import of Ran protein The NTF2 protein interacts specifically with the FxFG in Arabidopsis . 
Our data also showed that the repeat-containing nucleoporins and Ran-GDP to form NTF2 domain exists in 44 MKK3 genes from 38 plant complexes  resulting in the cytoplasmic-nuclear traf- species as early in algae and in mosses, pteridophyta as ficking of Ran . The yeast (Saccharomyces cerevisiae) well as seed plants, including previously reported A. and Caenorhabditis elegans NTF2 orthology plays cru- thaliana (Fig. 7a;). Then sequence alignment of cial roles and ntf2 mutants have significant imperfection NTF2 domain regions of all identified MKK3 exhibited a Jiang and Chu BMC Genomics (2018) 19:407 Page 13 of 18 Fig. 7 Evolutionary and structural analyses of the MKK3 fused NTF2 domain. (a) Visual representation of distribution of NTF2 fusions across plants. (b )Multiple sequence alignment logo of the NTF2 domains found in all MAPKK3 shows conserved core structural aspartic and glutamic residues that have been shown to be essential for binging the RanGDP. (c) Three-dimensional structural model of the NTF2 domain of an Arabidopsis MKK3 At5G40440 (aa 366–516) modeled after the best structural match, S. cerevisiae NTF2-like domain Mex67:Mtr2 (PDB: 4WWU). Conservation profile across all plant NTF2 found in MKK3 is overlapped on the structure, with critical aspartic and glutamic residues depicted in yellow. (d) Maximum likelihood phylogeny of all Arabidopsis NTF2 domain-containing proteins (black nodes) and the NTF2 detected as fusions in all plants (red nodes). Distinct Arabidopsis NTF2 clades that form fusions are highlighted. The names of species are given as in Additional file 1. Each duplicate gene pair was assigned a color and a line on the right critical functional Ran-binding site such as aspartic and Apparently, some fusion proteins were more common in glutamic residues (Fig. 7b and c). Our results also re- non-seed plants (i.e. 
NTF2a/b), while others were not ex- vealed that the MKK-NTF fusions took place alone in tend to all phyla which may be involved in gene expansion several plants, including both eudicots and monocotyle- in monocots and dicots (Fig. 7d). It has been suggested dons. In order to explore the evolutionary relationship that most NTF2 domain in MKK3 have all the signa- all MKK3 fused NTF2 domain together with all NTF2 tures of the functional NTF2, gradual loss of activity in proteins from Arabidopsis, we performed the phylogen- the nuclear transport cannot be performed function as etic analyses which showed that fusions occurred at least the MKK3 harboring NTF2 domain exhibit change of three times and referred to homologs of NTF2a/b, the conserved key residues, which may be involved with NTF2–1 and NTF2–2 (Fig. 7d). In addition, the group B the gene function more and more sophisticated in the MKKs with NTF2 domain had a recent clades of evolutionary process. Hence, the loss of NTF2 domain lineage-specific paralogs, including one members such leads to sharply weaken the interaction ability of MKK3 clades in soybean, B. distachyon and S. italica, respectively, . It is interesting that the Chlamydomonas genome revealing numerous gene duplication events (Fig. 7d). contains a MKK3 which possess the 3’-NTF domain, Jiang and Chu BMC Genomics (2018) 19:407 Page 14 of 18 suggesting that this delusional arrangement has got duplications. These data revealed novel viewpoints about through a long and successful evolutionary history in the function and evolution history of plant MKKs. the photosynthetic eukaryotes lineage . Methods Conservation of MAPKK protein kinase activities in spite Identification of MAPKK gene family members of mutation Mitogen-activated protein kinase kinase (MAPKK) gene Although mutations have happened in some genes, new information from the model plant A. 
thaliana and rice and inactive MKK genes were by no means generated as were downloaded from The Arabidopsis Information Re- expected previous reports indicated that a mutation of sources database (TAIR: http://www.arabidopsis.org/) MKKs in serine or theronine residues gave rise to the  and the TIGR rice Genome Annotation Resources abolishment of these kinase activity . However, de- database (http://rice.plantbiology.msu.edu/), re- pending on the public tiling array data and mutant phe- spectively. To identify MAPKK genes of unknown spe- notypes, apparently, the function of AtMKK10 that is cies, BLASTP searches was conducted using orthologous catalytically inactive is associated with pollen develop- protein sequences A. thaliana and O. sativa MAPKKs as ment despite the sequence change in phosphorylation the query search  in the publicly available PLAZA site (Fig. 6b, c). According to our RT-qPCR experi- database (http://bioinformatics.psb.ugent.be/plaza/) ments, BdMKK10–5 have same situation in its phos- and phytozome database (http://www.phytozome.net/) phorylation site is up-regulated after UV-B treatment . The coding and genomic sequences of MAPKK (Additional file 20). More interesting, previous studies genes were collected in 51 plant species (Table 1). The had shown that the average root growth rate for gene was only considered as probable MAPKK gene if it double mutant MKK10/MPK3 or MKK10/MPK6 was harbored the serine/threonine−/dual specificity protein higher than anyone single mutant of them , and kinase domain, including the active site motif D(L/I/V)K also the transcripts of MKK10 significantly changed and the phosphorylation site S/T-X -S/T within the acti- after PPV-infected protoplasts . More studied have vation loop, which was subsequently confirmed by scan- been reported showed that part mutant MAPKK gene ning in InterPro software for the presence of MKK’s was involved with disease resistance in rice and conserved domain . 
All data were checked for re- ethylene-dependent cell death in maize, respectively, dundancy and no any alternative splice variants were which can phosphorylate MAPK6 and MAPK3 in vivo considered. [57, 58]. Furthermore, AtMKK10 have experienced mutationsinphosphorylation keyactivesiteand ex- Gene structure, sequence alignment, structural modeling hibited different expression profiles from AtMKK7/8/ and protein motif analyses 9(Fig. 6a). So, the mutant paralogs might still main- The exon/intron structure of individual MKK gene was tain the kinase activity to downstream target proteins. confirmed by the Gene Structure Display Server (GSDS) software (http://gsds.cbi.pku.edu.cn/). Multiple sequence Conclusion alignment of the identified amino acid sequences of A total of 365 MKK genes were retrieved from 51 plant MAPKK genes were performed by the Clustal Omega species using bioinformatics approaches based on the (http://www.ebi.ac.uk/Tools/msa/clustalo/). The align- presence of a conserved S/T-X -S/T domain. Our phylo- ment logos of the protein conserved domain were gener- genetic analyses revealed that group A and B MKKs first ated with WebLogo (http://weblogo.threeplusone.com/). appeared in the common ancestors of all green plants, The conserved domains and motifs in the MAPKKs while group C and D MKKs were the rear to arise along were predicted using InterProScan against protein data- with appearance of land or seed plants, respectively, with base (http://www.ebi.ac.uk/interpro/). The schematic of subsequent divergence of Group E MKKs in flowering the structure of all members of MAPKKs were per- plants. It was noteworthy that Group B MKKs were very formed according to the InterProScan analysis results. 
different compared to those of other groups in many as- Subcellular localization predictions of each MAPKK pects of structure, including exon/intron organizations, were carried out using CELLO server (http://cello.li- phase pattern and NTF2 motif in the MKKs, as well as fe.nctu.edu.tw/). The theoretical pI (isoelectric the biological functions of NTF2 domain with MKK3 point) and Mw (molecular weight) of MAPKKs were will be gradually lost the activity during the evolution performed using Compute pI/Mw tool online (http:// despite the loss of this domain maybe affected inter- web.expasy.org/compute_pi/). Structural modelling of action capability of oneself. The group A MKKs ex- the NTF2 domain was carried out with Phyre2 using panded during the evolution, through WGDs followed amino acid sequence of the NTF2 domain from by diversifications, while group E MKKs expanded dur- At5G40440 (aa 366–516) and the best structure (highest ing monocots evolution through the ancient tandem percent identity, most sequence coverage) modelled after Jiang and Chu BMC Genomics (2018) 19:407 Page 15 of 18 S. cerevisiae NTF2-like domain Mex67:Mtr2 (PDB: species, that is paralogs, the second number is followed by 4WWU) was picked as a template . ahyphen to distinguish between paralogs. Molecular evolution analysis Expression analysis The aligned cDNA sequences were used to calculate the Expression data of Arabidopsis MAPKKs were obtained value of Ka and Ks as well as their ratios using DNASP from PLEXdb (AT40), including following Arabidopsis v5.10 . To investigatethegeneticdivergencebe- tissues: inflorescence, petals, stamen, pedicels, sepals, tween each group, we counted the genetic distances on carpels, pollen, seedling, root, leaf, shoot and pistil. As the basis of the amino acid sequences with the well as following Arabidopsis mutants: ap1–15, ap2–6, Jones-Taylor-Thornton (JTT) model in the Molecular ap3–6, ag-12, clv3–7, lfy-12 and ufo-1. 
Mapped reads Evolution Genetic Analysis 6.0 (MEGA 6.0) . The were uniquely applied to deeper analysis. Gene expres- overall mean distances of all MAPKKs were measured sion levels were quantified by RPKM (reads per kilobase using MEGA6 software. The MEGA file used to con- of mRNA length per million of mapped reads). Expres- struct phylogenetic tree were also devoted to calculate sion data of MAPKK genes in rice (GSE27726), poplar the overall mean distance of plant MAPKKs. (GSE30507) and maize (GSE27004) were also down- loaded from NCBI GEO database. Microarray data of Synteny and phylogenetic analyses Arabidopsis under abiotic treatment were also obtained Phylogeny of all plant species we surveyed in this study was from NCBI GEO data (GSE5620-GSE5628) in our ana- performed using PhyloT program (http://phylot.biobyte.de/ lysis. For RT-qPCR analyses, 2-week-old B. distachyon ) with the NCBI taxonomy IDs for each species and visual- Bd21 plants grown in each box were used for harvesting ized with iTOL program. The phylogenetic trees were con- root, stem, leaf, young caryopsis and seedling samples. structed based on the maximum-likelihood (ML) method For UV-B treatments analysis, two-week-old B. distach- with a JTT model and the bootstrap test was carried out yon Bd21 plants were treated as described previously with 2000 iterations by the MEGA 6.0 software . To . Plants were harvested for further analyses after insure the more divergent domains could be conducive to treatment 1 h, 3 h, 6 h, 12 h and 24 h, respectively. The the topology of the ML tree, all positions with 95% site primer-sets were listed in Additional file 21. coverage were eliminated. All the phylogenetic trees were deposited in Treebase (http://purl.org/phylo/treebase/phy- lows/study/TB2:S21358). The whole set of MAPKK protein Additional files sequences were assigned to different orthology groups Additional file 1: Table S1. 
Table showing nomenclature gene name, using OrthoMCL  after Blastp all vs all (e-value 1e − 10) locus ID, detailed genomic information and subcellular localization of (http://orthomcl.org/orthomcl/). Synteny analyses of dupli- plant MAPKKs. (PDF 1109 kb) cate gene pairs and the WGD data were achieved from the Additional file 2: Table S2. OrthoMCL automatic analysis about plant Plant Genome Duplication Database (PGDD; (http://chib- MAPKK. (TIF 40445 kb) ba.agtec.uga.edu/duplication/index/locus). MAPKKs Additional file 3: Fig. S1. The exon/intron structures of plant MAPKK genes. (TIF 2182 kb) we surveyed were produced by segmental duplication of Additional file 4: Table S3. Table representing molecular mass (in kDa) the plant genome were determined by the analysis of the and isoelectric point of different MAPKK genes from 51 plant species output file (Additional file 12)of the CoGe SynMap pro- identified during this study. (XLSX 53 kb) gram (https://genomevolution.org/coge/SynMap.pl) ob- Additional file 5: Table S4. Table shows average amino acid tained using a default parameter . composition of plant MAPKKs. (TIF 2947 kb) Additional file 6: Fig. S2. Weblogos represent the Nucleotide binding domain and the ATP binding site of each group. The stars indicate Nomenclature of MAPKKs residues of functional or structural importance. (TIF 3413 kb) In order to match the names with their function, all pre- Additional file 7: Fig. S3. Weblogos represent the docking site (D-site) dicted MAPKKs were named using the orthologous based of each group. The stars indicate residues of functional or structural importance. (TIF 17233 kb) on the evolutionary relationship of MAPKKs with A. Additional file 8: Table S5. Molecular evolutionary analysis of the thaliana or O. sativa MKKs as suggested by Hamel et al. MAPKK genes in different motif. (TIF 22709 kb) . In the nomenclature systems, the first letter in upper Additional file 9: Table S6. 
The duplicated gene pairs in the 51 plant case represented the corresponding genus name, the spe- genomes. (TIF 21703 kb) cies name was shown in the second lowercase letter (in a Additional file 10: Fig. S4. Maximum Likelihood phylogenetic trees of few cases the first 1–2 letters), after which the MAPKK plant group A MAPKKs. The red circle represents duplication events. (TIF 13036 kb) and number of corresponding orthologs of O. sativa or Additional file 11: Fig. S5. Syntenic proofs of Group A MAPKKs in Arabidopsis with human and two yeast lineages, respect- Brassicaceae . (TIF 15972 kb) ively. If more than one ortholog existed in individual Jiang and Chu BMC Genomics (2018) 19:407 Page 16 of 18 Competing interests Additional file 12: Table S7. List of duplicated genes assigned to The authors declare that they have no competing interests. segmental duplication in plant genome, as detected by CoGe SynMap analysis. (XLSX 26 kb) Additional file 13: Fig. S6. Maximum Likelihood phylogenetic trees Publisher’sNote of plant group E MAPKKs. The red circle represents duplication Springer Nature remains neutral with regard to jurisdictional claims in events. (TIF 3291 kb) published maps and institutional affiliations. Additional file 14: Fig. S7. Syntenic proofs of Group E MAPKKs in monocots. (DOC 51 kb) Received: 20 October 2017 Accepted: 14 May 2018 Additional file 15: Fig. S8. Maximum Likelihood phylogenetic trees of plant group B MAPKKs. (PDF 1039 kb) Additional file 16: Fig. S9. Maximum Likelihood phylogenetic trees of References 1. Xiong LM, Schumaker KS, Zhu JK. Cell signaling during cold, drought, and plant group C MAPKKs. The red circle represents duplication events. (PDF salt stress. Plant Cell. 2002;14:S165–83. 685 kb) 2. Besteiro MAG, Ulm R. Phosphorylation and stabilization of Arabidopsis MAP Additional file 17: Fig. S10. Maximum Likelihood phylogenetic trees kinase phosphatase 1 in response to UV-B stress. J Biol Chem. 2013;288(1): of plant group D MAPKKs. 
The red circle represents duplication 480–6. events. (PDF 148 kb) 3. Xu J, Zhang SQ. Mitogen-activated protein kinase cascades in signaling Additional file 18: Fig. S11. Syntenic proofs of plant Group C MAPKKs. plant growth and development. Trends Plant Sci. 2015;20(1):56–64. (TIF 2692 kb) 4. Liu YK. Roles of mitogen-activated protein kinase cascades in ABA signaling. Plant Cell Rep. 2012;31(1):1–12. Additional file 19: Fig. S12. Expression profiles of other plant MAPKK 5. Rodriguez MCS, Petersen M, Mundy J. Mitogen-activated protein kinase genes. Y-axis represents RPKM value. The expression data were down- signaling in plants. Annu Rev Plant Biol. 2010;61:621–49. loaded from rice (GSE27726), poplar (GSE30507) and maize (GSE27004) oligonucleotide array database, respectively. (TIF 7561 kb) 6. Zhang X, Xu X, Yu Y, Chen C, Wang J, Cai C, Guo W. Integration analysis of MKK and MAPK family members highlights potential MAPK signaling Additional file 20: Fig. S13. Expression profiles of B. distachyon MAPKK modules in cotton. Sci Rep-Uk. 2016;6:29781. genes in different tissues and UV-B signaling. (DOCX 15 kb) 7. Whitmarsh AJ, Davis RJ. Structural organization of MAP-kinase signaling Additional file 21: Table S8. A lists of the RT-qPCR primers for the modules by scaffold proteins in yeast and mammals. Trends Biochem Sci. MAPKK genes of B. distachyon. (XLSX 16 kb) 1998;23(12):481–5. 8. Bardwell AJ, Flatauer LJ, Matsukuma K, Thorner J, Bardwell L. A conserved docking site in MEKs mediates high-affinity binding to MAP kinases and Abbreviations cooperates with a scaffold protein to enhance signal transmission. J Biol D-site: Docking site; GSDS: Gene Structure Display Server; JTT: Jones-Taylor- Chem. 2001;276(13):10374–86. Thornton; MAPK: Mitogen-activated protein kinase; MAPKK (MKK): Mitogen- 9. Takekawa M, Tatebayashi K, Saito H. 
activated protein kinase kinase; MAPKKK (MEKK): MAPKK kinase; ML: Maximum likelihood; Mw: Molecular weights; NB domain: Nucleotide binding domain; NTF2: Nuclear transport factor 2; PGDD: Plant Genome Duplication Database; pI: Isoelectric point; RPKM: Reads per kilobase per million mapped reads; RT-qPCR: Real-time quantitative polymerase chain reaction; WGDs: Whole genome duplications; WGT: Whole-genome triplication.

Acknowledgements
We thank the contributors of the plant Genome Database, which was a convenient tool for searching for plant MAPKK genes.

Funding
This work was funded by a grant from the Chinese Academy of Sciences strategic resource service network planning of plant germplasm resources innovation project (ZSZC-013) and by grants from the Shanghai Landscaping Administrative Bureau (Grant Nos. G162406 and G162407). These funding bodies had no role in study design, analysis, decision to publish, or preparation of the manuscript.

Availability of data and materials
All data generated or analyzed during this study are included in this published article and its supplementary information files.

Authors' contributions
MJ and ZC conceived and designed the project. MJ executed the bioinformatics analysis, conducted the experiments and wrote the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate
The B. distachyon Bd21 seeds used in this work were supplied by the International Brachypodium Initiative (USDA-ARS Western Regional Research Center, USA). Plants were grown in the growth room for sample collection. The research conducted in this study neither required approval from an ethics committee nor involved any human or animal subjects.
– Springer Journals
Published: May 29, 2018