Abstract

Temperatures are expected to increase over the next century in all terrestrial biomes and particularly in boreal forests, where drought-induced mortality has been predicted to rise. Genomics research is helping to develop hypotheses regarding the molecular basis of drought tolerance, and recent work proposed that the osmo-protecting dehydrin proteins have undergone a clade-specific expansion in the Pinaceae, a major group of conifer trees. The objectives of this study were to identify all of the putative members of the gene family, trace their evolutionary origin, examine their structural diversity and test for drought-responsive expression. We identified 41 complete dehydrin coding sequences in Picea glauca, which is four times more than in most angiosperms studied to date, and more than in pines. Phylogenetic reconstructions indicated that the family has undergone an expansion in conifers, with parallel evolution implicating the sporadic resurgence of certain amino acid sequence motifs, and a major duplication giving rise to a clade specific to the Pinaceae. A variety of plant dehydrin structures were identified with variable numbers of the A-, E-, S- and K-segments and an N-terminal (N1) amino acid motif, including assemblages specific to conifers. The expression of several of the spruce dehydrins was tissue preferential under non-stressful conditions or responded to water stress after 7–18 days without watering, reflecting changes in osmotic potential. We found that dehydrins with N1 K2 and N1 AESK2 sequences were the most responsive to the lack of water. Together, the family expansion, drought-responsive expression and structural diversification involving loss and gain of amino acid motifs suggest that subfunctionalization has driven the diversification seen among dehydrin gene duplicates.
Our findings clearly indicate that dehydrins represent a large family of candidate genes for drought tolerance in spruces and in other Pinaceae that may underpin adaptability in spatially and temporally variable environments.

Introduction

A major factor driving the evolution and diversification of vascular plants is the adaptation to water availability (Micco and Aronne 2012). To cope with stresses such as drought, heat, freezing and salinity that affect water relations, plants have developed mechanisms to prevent the loss of intracellular water (Farooq et al. 2012), including the synthesis of dehydrin proteins, which are believed to have dehydration protective functions (Hanin et al. 2011). Dehydrin proteins have a modular structure composed of a variable number of conserved motifs. The K-segment is a lysine-rich sequence motif that is common to all of the dehydrins described to date, with the exception of a unique dehydrin in maritime pine (Perdiguero et al. 2014). The S-segment, defined by five to seven consecutive serine residues, is also conserved in many of the family members. Other motifs are lineage-specific and include the Y-segment that is characterized by tyrosine residues and only found in angiosperms (Campbell and Close 1997), and the A- and E-segments exclusive to conifers and characterized by the presence of alanine and glutamine residues, respectively (Perdiguero et al. 2012). Based on their motif composition, dehydrins have been classified as Kn, SKn, YnSKn, YnKn (Close 1996), EnSKn and AnEnSKn (Perdiguero et al. 2012). Dehydrin proteins may accumulate in maturing seeds or be induced in vegetative tissues in response to salinity, dehydration, cold stress and frost (Close 1996, Tunnacliffe and Wise 2007). The expression of dehydrins varies among tissues and according to the type and intensity of stress.
For example, in grapevine, Dhn1 was not expressed in vegetative tissues under normal conditions but was induced by drought, cold, heat and during embryogenesis. In contrast, Dhn3 was expressed at a low level during seed development and was not responsive to stress treatments (Yang et al. 2012). In Norway spruce (Picea abies), Dhn1 and Dhn6 were highly expressed in bark and leaves during drought stress while the other dehydrins were poorly induced or not responsive (Eldhuset et al. 2012). Dehydrins from different sub-classes are upregulated by abiotic stresses including cold and/or salt and/or desiccation, but no clear relationship has been observed between the structural classification and the stress responsiveness profile. For example, the sequences YnSKn DHR18 and Kn XERO2 from mouse-ear cress, and SKn EuglDhn2 from Eucalyptus, are upregulated in response to cold stress, but only DHR18 and XERO2 are induced by salt stress and only DHR18 and EuglDhn2 by drought (Hundertmark and Hincha 2008, Fernández et al. 2012). The size of the dehydrin gene family is variable among species. In angiosperms, it can range from two members as in the primitive Amborella (Pfam 30.0, Finn et al. 2014) to more than 12 in apple (Malus domestica) (Liang et al. 2012). In contrast, the Pinaceae appear to have significantly more dehydrin genes, although only a few species have been investigated in any detail (Joosen et al. 2006, Yakovlev et al. 2008, Perdiguero et al. 2012, 2014, Kjellsen et al. 2013). A total of 53 distinct dehydrin genes have been identified in the white spruce (Picea glauca (Moench) Voss) transcriptome database (Rigault et al. 2011), which is many more than in herbaceous angiosperms studied to date (Hundertmark and Hincha 2008, Liang et al. 2012, Liu et al. 2012).
As a basis to understand the role and evolution of dehydrin genes in conifers and evaluate their involvement in water stress responses, we aimed to: (i) assess the extent of the dehydrin gene family in white spruce and its expansion in the Pinaceae by analyzing full-length gene sequences; (ii) trace the evolutionary origin of dehydrins in both the Pinaceae and angiosperms by studying phylogenetic relationships; (iii) classify gene sequences based on conserved amino acid motifs such as the A-, E-, S- and K-segments; (iv) evaluate the overall role of dehydrins in water stress response by profiling changes in water potential and gene expression in trees exposed to water-limited conditions; and (v) assess tissue specificity of dehydrin expression.

Materials and methods

Dehydrin sequences

Expressed dehydrin gene sequences were identified in white spruce (Rigault et al. 2011) by using the HMMER software (v 3.0) (Johnson et al. 2010) and the Pfam database, release 27.0 (Finn et al. 2014), and by sequence similarity searches (BLASTp; Altschul et al. 1990) with published conifer dehydrins. We extended the incomplete white spruce dehydrin cDNA sequences by using RNA-seq data from Verta et al. (2016). The cDNA sequences were then translated into amino acids and the integrity of the open reading frames (ORF) was verified by BLASTp (e-value 1e–10) using complete dehydrin sequences from conifers and other plants. We named the dehydrins as suggested by Richard et al. (2000), in which the white spruce dehydrin (PgDhn1) was first described. In the present study, dehydrins were named from PgDhn2 to PgDhn41. Sequence accession numbers are presented in Table S1 available as Supplementary Data at Tree Physiology Online. We searched for conifer dehydrin proteins in the non-redundant protein database (nr) using BLASTp, with an e-value threshold of 1e−20 and white spruce ORFs as queries.
We retained conifer dehydrins with at least 70% amino acid sequence similarity and coverage over a minimum of 80 amino acids with the white spruce ORFs. Angiosperm dehydrins were identified based on phmmer (e-value <0.01) (https://www.ebi.ac.uk), which uses profile hidden Markov models and provides a more accurate and sensitive detection of remote homologs than BLAST. In this analysis, we used the white spruce ORF sequences to search the UniProt database (The UniProt Consortium 2015) for Arabidopsis thaliana, Malus domestica, Eucalyptus grandis, Eucalyptus globulus, Prunus persica, Prunus dulcis, Zea mays, Amborella trichopoda, Populus trichocarpa, Vitis vinifera, Oryza sativa and Physcomitrella patens dehydrin proteins. The proteins were verified as dehydrins by hmmscan (https://www.ebi.ac.uk) using the Pfam database.

Phylogenetic analysis

The sequences were selected by using the cd-hit program (Li and Godzik 2006) to separately cluster the conifer and angiosperm dehydrin sequences based on a 97% similarity threshold at the amino acid sequence level. Each cluster was represented by one sequence (see Table S2 available as Supplementary Data at Tree Physiology Online). We then aligned the representative sequences using MAFFT version 7.0 with the FFT-NS-i strategy (iterative refinement method; 1000 iterations) (Katoh and Standley 2013) found in Geneious R6 (http://www.geneious.com, Kearse et al. 2012). Phylogenetic trees were constructed following a Bayesian framework using MrBayes 3.2.1 (Ronquist et al. 2012). Half a million generations of the Markov chain Monte Carlo (MCMC) method using four chains sampling every 10 generations were completed using the WAG model (Whelan and Goldman 2001), with gamma-distributed rate variation across sites and a proportion of invariable sites. A dehydrin (A9RQA9) of the moss P. patens was used as outgroup. The standard deviation of split frequencies was <0.05 after 485,000 generations.
The first 25% of the recovered topologies were discarded. We calculated the consensus tree with Bayesian posterior probability equal to or greater than 0.75 and the resulting samples of best-fit trees. Trees were visualized with FigTree v1.4.2 (http://tree.bio.ed.ac.uk). We also constructed the phylogenetic trees utilizing the Maximum Likelihood approach implemented in MEGA6 (Tamura et al. 2013), also using the WAG model and gamma-distributed rates among sites. We obtained similar but less resolved topologies when compared with results from the MrBayes analysis. Thus, only results from the MrBayes analysis are shown.

Identification of conserved amino acid motifs and classification of dehydrins

The identification of amino acid motifs was performed by using MEME version 4.9.0 (Multiple Expectation Maximization for motif Elicitation) (Bailey et al. 2015) with the following parameters: distribution of motif occurrences was any number of repetitions, maximum number of motifs was 10 and motif width was between 6 and 20 amino acids. Motif scanning was performed by MAST (MEME suite, Bailey et al. 2015), and sequences were then classified among the possible groups: Kn, SKn, YnSKn, YnKn, EnSKn, AnEnSKn and new groups described in the current study. We verified the motifs in the multiple sequence alignments and also identified the degenerate motifs.

Plant material

Tissue-preferential analysis

We surveyed tissue differences in RNA transcript accumulation in young foliage, shoot secondary xylem and phelloderm from a previous experiment as described (Raherison et al. 2015). Three biological replicates were each made up of a pool of samples from five genetically distinct young white spruce trees grown under non-limiting conditions in a glasshouse with natural light (Raherison et al. 2015). The samples were collected at 6 a.m., immediately frozen in liquid nitrogen and stored at −80 °C until RNA isolation.
Drought experiment

We used young trees from three genetically unrelated Picea glauca genotypes (clones 8, 11 and 95) that were propagated in vitro by somatic embryogenesis and grown in containers for 2 years at the Vegetative Propagation Centre of the Saint-Modeste tree nursery of the Quebec Ministry of Forests, Wildlife and Parks (Saint-Modeste, Québec, Canada). The plants were 40 cm tall on average, were potted in 5 l pots containing a mix of peat, perlite and vermiculite (3:1:1, by weight), and were grown in a greenhouse with a day temperature of 23 °C, a night temperature of 20 °C and a 16/8 h (day/night) photoperiod, and watered three times per week, for 2 months prior to the experiment. For the experiment, one half of the plants were watered and, for the other half, water was withheld; the plants were arranged in a completely randomized design. Plants were destructively sampled and the newly formed foliage (needles) was collected at 0, 7, 14, 18 and 22 days from the beginning of the watering treatments. Five plants per genotype (replicates) in both watering treatments were sampled at each sampling point (a total of 150 plants). The watered plants were sampled 2 h after the last watering. The sampling time was at midday. On each sampling day, the midday water potential (branch) of four plants per genotype in both watering regimes was measured using a Scholander pressure chamber (Model 610, PMS Instruments, Albany, OR, USA). Foliage samples were frozen in liquid nitrogen immediately after removal from the trees and stored at −80 °C. The needles were ground to powder using a MixerMill 300 (Retsch, http://www.retsch.com/) and steel grinding balls cooled in nitrogen. Powdered foliage tissue was stored at −80 °C until RNA extraction.

RNA sequencing in six Pinaceae species

RNA sequencing was carried out in six Pinaceae species including white spruce (P. glauca), black spruce (Picea mariana), eastern white pine (Pinus strobus), jack pine (Pinus banksiana), balsam fir (Abies balsamea) and tamarack (Larix laricina). We isolated several different tissue samples including four to six vegetative tissues (for details, see Table S3 available as Supplementary Data at Tree Physiology Online) from 4-year-old plants grown under greenhouse conditions with natural light; we sampled the tissues from two well-watered plants and two plants that had not been watered for 14 days. The tissues also included several megagametophytes and whole embryos isolated from germinating seeds.

RNA extraction and cDNA synthesis

Total RNA was extracted by grinding tissues in liquid nitrogen to a fine powder and by utilizing either (i) the cetyltrimethyl ammonium bromide (CTAB) extraction method as described by Chang et al. (1993) with modifications (Pavy et al. 2008) for the tissue-preferential analysis and the drought experiment or (ii) the MasterPure™ Plant RNA Purification kit (Epicenter, Madison, WI, USA) for the RNA sequencing. The total RNA concentration was determined using a NanoDrop 1000 (Thermo Scientific, http://www.thermoscientific.com/) and RNA quality was assessed with an Agilent 2100 Bioanalyzer and Agilent RNA 6000 Nano Kit LabChips (Agilent Technologies Inc., http://www.agilent.com/); the RNA was stored at −80 °C. Complementary DNAs were prepared from 500 ng of total RNA using the Quantitect Reverse Transcription Kit (Qiagen, Germantown, MD, USA) and then diluted 1:4 in RNase-free water.

RNA sequencing

RNA-Seq library synthesis

Total RNAs from all of the tissues were combined to form a single RNA pool for each as shown (see Table S3 available as Supplementary Data at Tree Physiology Online). We used 500 ng of total RNA to synthesize mRNA libraries with the TruSeq® Stranded mRNA kit (Illumina Canada Inc., Victoria, BC, Canada), following the manufacturer’s protocol with a few modifications.
First, Illumina technology-suitable custom adapters were synthesized (IDT, Coralville, IA, USA) and their concentration was reduced by half to 25 nM per reaction to avoid adapter dimer molecules, and Tris-NaCl (10 mM, 50 mM) at 25 nM was used to complete the volume. Second strand synthesis and marking clean-up and library amplification clean-up were performed using the Axygen® AxyPrep™ Mag PCR Clean-Up Kit (Axygen Biosciences, Union City, CA, USA), whereas post-ligation clean-up was done using a ratio of 0.85 of PEG/NaCl SPRI® Solution over Beads with adapter-ligated DNA. Each library was quantified with a Nanodrop ND-1000 (Thermo Scientific, Wilmington, DE, USA), and library fragment size distribution and the absence of dimer molecules were verified with an Agilent Bioanalyzer 2100 using High Sensitivity DNA chips (Agilent Technologies Inc., Santa Clara, CA, USA). Four equimolar pools of 17 to 18 libraries each were prepared and purified following the Library Amplification Cleanup protocol as described in the library construction protocol. The concentration and the fragment size distribution of each purified equimolar pool were evaluated with a Nanodrop ND-1000 and an Agilent Bioanalyzer 2100, respectively. Each of the pools was sequenced by using the rapid run procedure (2 × 250 bp) at the Genome Quebec Innovation Centre at McGill University (Montreal, Quebec, Canada) with an Illumina HiSeq 2500 sequencing system.

Search for dehydrin in Pinaceae transcriptome sequences

Dehydrin transcriptome sequences were constructed for each of the Pinaceae species using the NUCLEAR (version 3.2.4) and VISION (version 2.6.12) software (Gydle). First, raw RNA-Seq sequence (FASTQ) files were filtered into High-Quality (HQ) sequences, trimming adapters and retaining only segments of 50+ consecutive Q20+ bases.
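As an illustration of the quality filter described above, the following Python sketch extracts, from a single read, every maximal run of at least 50 consecutive bases with a quality score of 20 or more. It is not the NUCLEAR implementation, whose internals are not described here; the function names, the Phred+33 quality encoding and the omission of adapter trimming are assumptions made for the example.

```python
from typing import List


def phred33_to_scores(qual_string: str) -> List[int]:
    """Decode a FASTQ quality string, ASSUMING Phred+33 encoding."""
    return [ord(char) - 33 for char in qual_string]


def high_quality_segments(
    seq: str,
    quals: List[int],
    min_q: int = 20,    # Q20+ threshold, from the text
    min_len: int = 50,  # 50+ consecutive bases, from the text
) -> List[str]:
    """Return maximal runs of bases with quality >= min_q and length >= min_len."""
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must have the same length")

    segments: List[str] = []
    start = None  # index where the current high-quality run began
    for i, q in enumerate(quals):
        if q >= min_q:
            if start is None:
                start = i
        else:
            # A low-quality base closes the current run, if any.
            if start is not None and i - start >= min_len:
                segments.append(seq[start:i])
            start = None
    # Close a run that extends to the end of the read.
    if start is not None and len(seq) - start >= min_len:
        segments.append(seq[start:])
    return segments
```

A read may yield zero, one or several HQ segments under this rule, so the number of HQ sequences need not match the number of raw reads.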
The HQ sequences were then used to create transcriptome sequences by combining reference-initiated read mapping, iterative sequence extension, de novo assembly of unmapped reads, interactive editing and visualization, and resolution of representative sequences aimed at containing all exons for a single gene. We identified ORFs in the Pinaceae transcriptome sequences using getorf (EMBOSS 6.6.0) and searched for dehydrin sequences presenting a structure of N1 Kn by using BLASTP (version 2.2.26+) with white spruce Kn ORFs as queries, retaining sequences with an e-value threshold of 1e–10 and covering at least 70% of white spruce Kn ORFs over at least 80 amino acids. We then utilized MAST (MEME suite, Bailey et al. 2015) and the conifer motifs to classify all retained ORFs.

Gene specific amplification of RNA transcripts by reverse-transcription PCR

A primer pair was designed for each of the dehydrin sequences using the Primer3Plus software (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi) (see Table S4 available as Supplementary Data at Tree Physiology Online). The self-complementarity of the designed primers was verified using Oligo Calc (Oligonucleotide Properties Calculator software, http://www.basic.northwestern.edu/biotools/oligocalc.html) and specificity was verified against the P. glauca gene catalogue (Rigault et al. 2011) and the extended dehydrin cDNA sequences. We used gene-specific dehydrin primers (see Table S4 available as Supplementary Data at Tree Physiology Online) to determine RNA transcript levels from the drought stress and tissue comparison experiments (Raherison et al. 2015) by using quantitative reverse transcription (RT)-PCR (qPCR). We used the QuantiFast® SYBR® Green PCR kit (Qiagen) as follows: 1× master mix, 300 nM of 5′ and 3′ primers and 5 μl of cDNA (5 ng) in a final volume of 15 μl. Amplifications were carried out in a LightCycler® 480 (Roche, http://www.roche.com/) as described in Boyle et al. (2009).
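To make the reaction composition above concrete, the following Python sketch computes the component volumes for one 15 μl reaction. The final volume, the final primer concentration and the cDNA volume are taken from the text; the 2× master mix stock and the 10 μM primer stock are assumptions made for illustration and are not stated in the text.

```python
from typing import Dict


def qpcr_reaction_volumes(
    final_volume_ul: float = 15.0,       # from the text
    primer_final_nm: float = 300.0,      # from the text, for each primer
    cdna_volume_ul: float = 5.0,         # from the text
    master_mix_stock_x: float = 2.0,     # ASSUMPTION: master mix supplied at 2x
    primer_stock_nm: float = 10_000.0,   # ASSUMPTION: hypothetical 10 uM primer stocks
) -> Dict[str, float]:
    """Return per-reaction volumes (in microlitres) using C1*V1 = C2*V2."""
    # Master mix is diluted from its stock concentration to 1x.
    master_mix_ul = final_volume_ul * 1.0 / master_mix_stock_x
    # Each primer is diluted from its stock to the final concentration.
    each_primer_ul = final_volume_ul * primer_final_nm / primer_stock_nm
    # Water completes the reaction to the final volume.
    water_ul = final_volume_ul - master_mix_ul - 2 * each_primer_ul - cdna_volume_ul
    if water_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return {
        "master_mix_ul": master_mix_ul,
        "each_primer_ul": each_primer_ul,
        "cdna_ul": cdna_volume_ul,
        "water_ul": water_ul,
    }
```

Under these assumed stock concentrations, one reaction would contain about 7.5 μl of master mix, 0.45 μl of each primer, 5 μl of cDNA and 1.6 μl of water; different stock concentrations change only the primer and water volumes.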
We used the LRE method (Rutledge and Stewart 2008) adapted for Excel (Boyle et al. 2009) to estimate the number of transcript molecules, which was normalized by the geometric mean of three reference genes: elongation factor 1a (EF1-α) (BT102965), cell division cycle 2 (CDC2) (BT106071) and ribosomal protein L3A (BT115036), as described in Beaulieu et al. (2013). PCR products were sequenced with the Sanger method to verify primer specificity.

Statistical analysis of RT-qPCR data

The transcript abundance data (as normalized number of molecules) determined by RT-qPCR for each dehydrin were transformed to the log2 scale for subsequent statistical analysis. We used the R software for statistical computing and the construction of graphs (package ggplot2) (http://www.r-project.org; Wickham 2009). For the drought experiment, the transcript abundance data of each gene were analyzed separately using analyses of variance (ANOVA) as a function of the type of treatment (different watering regimes simulating different water potential), genotypes, sampling dates and their interactions. If significant differences were detected, a multiple comparison test (Tukey’s honest significant difference, HSD) was performed. For the tissue-preferential expression, we evaluated gene expression differences by performing gene-specific ANOVA with expression as a function of tissue. If significant differences were detected, a multiple comparison test was performed (Tukey’s HSD).

Results

Identification of abundant dehydrin sequences in spruce

This study was initiated based on 53 cDNA dehydrin sequences that were identified in the white spruce gene catalog (Rigault et al. 2011, Raherison et al. 2012). In total, 41 of the sequences contained an ORF that we estimated as complete based on sequence alignments. We used sequence similarity and HMM searches to find sequences in other species.
We identified many sequences in the Pinaceae, i.e., 108 amino acid sequences in the genus Pinus and 36 additional amino acid sequences in Picea with at least 70% similarity and coverage with the previously discovered white spruce dehydrins. Our search for dehydrins also extended to both monocots and dicots in order to represent the diversity of dehydrins in angiosperms as well as major conifers. We identified a total of 76 dehydrins distributed in mouse-ear cress (A. thaliana), apple (M. domestica), Eucalyptus (E. grandis and E. globulus), peach (P. persica), almond (P. dulcis), maize (Z. mays), Amborella (A. trichopoda), poplar (P. trichocarpa), grape (V. vinifera) and rice (O. sativa). Many of the sequences within the angiosperms and within the Pinaceae were very similar; therefore, we clustered the sequences that were 97% identical or more. The Pinaceae sequences thus formed 78 distinct clusters (see Table S2 available as Supplementary Data at Tree Physiology Online) whereas the angiosperms formed 57 clusters (see Table S5 available as Supplementary Data at Tree Physiology Online), suggesting a larger number and greater diversity of sequences in the Pinaceae than in angiosperms, as previously suggested (Rigault et al. 2011). The representation of different taxa in the clusters strongly supported this hypothesis as the individual conifer genera were represented in more clusters, i.e., 56 clusters for Picea and 21 clusters for Pinus, whereas major angiosperms were represented in only 2–10 clusters. The data showed that dehydrin sequences were particularly abundant in Picea, but this could also be the effect of a more complete reporting of such sequences in Picea than in other conifers.

Dehydrins are highly divergent between angiosperms and Pinaceae

We used one representative sequence from each cluster to construct the phylogenetic trees and carried out an exhaustive de novo search for amino acid motifs in the dehydrin sequences.
Five of the motifs had previously been identified in plant dehydrins (Campbell and Close 1996, Zolotarov and Stromvik 2015). They include the K-segment and S-segment common to both angiosperms and gymnosperms, the A-segment and E-segment found only in Pinaceae (Perdiguero et al. 2012), and the Y-segment found only in angiosperms. Here, we also identified a conserved N1 motif located at the N-terminal region of sequences (see Figure S1 available as Supplementary Data at Tree Physiology Online). In the present study, the A-segment was found to be less conserved than described (Perdiguero et al. 2012), which we explain by the fact that we used a total of 78 Pinaceae dehydrins in the motif-identification process. This result impacts on the classification of dehydrins. In a previous study, the maritime pine dehydrin Ppter_dhn_ESK2 (accession number K7ZW95, cluster 40, see Table S2 available as Supplementary Data at Tree Physiology Online) was reported to lack an A-segment (Perdiguero et al. 2012) but we found that it contains a less conserved A-segment. We began by constructing separate phylogenetic trees with the dehydrin sequences from angiosperms (see Figure S2 available as Supplementary Data at Tree Physiology Online) and from the Pinaceae (Figure 1). The Pinaceae tree showed two distinct groups with different amino acid motif structures, i.e., N1 AnEnSKn and N1 Kn (Figure 1, Groups 1 and 2, respectively); the angiosperm tree was split into two main groups, with characteristic N1 SKn and YnSKn amino acid motif structures (see Figure S2 available as Supplementary Data at Tree Physiology Online, Groups 3 and 4, respectively). 
Next, we constructed a combined tree with angiosperm and Pinaceae sequences together that revealed the same two Pinaceae groups (1 and 2) and the two disjunct angiosperm groups (3 and 4); some of the branches indicating further phylogenetic structuring had weak statistical support and, for this reason, we estimated a consensus tree with an a posteriori probability threshold support equal to or greater than 0.75 (Figure 2).

Figure 1. Phylogeny of the conifer dehydrin gene family represented by a consensus tree from Bayesian analysis, with threshold support equal to or greater than 0.75. We used 41 white spruce dehydrins and 37 other conifer sequences; see details of sequence clusters in Table S2 available as Supplementary Data at Tree Physiology Online. A dehydrin from Physcomitrella was used as the root. The phylogeny was obtained with MrBayes after protein alignment with MAFFT, and visualized with FigTree. Picea taxa are indicated with red dots and Pinus taxa with green dots.

Figure 2. Phylogeny of the angiosperm and conifer dehydrin gene family represented by a consensus tree from Bayesian analysis, with threshold support equal to or greater than 0.75. A dehydrin from the moss Physcomitrella was used to root the tree. The phylogeny was obtained with MrBayes after protein alignment with MAFFT. Angiosperms are indicated with blue dots, Picea taxa with red dots and Pinus taxa with green dots.

Within the conifers, two sub-groups were observed for both the N1 Kn and Kn structures (shades of violet in Figure 2, Group 2), suggesting parallel loss of amino acid motifs. We further observed the presence of a few isolated Kn dehydrins in Groups 1 and 4 (Figure 2, N1 AnEnSKn and YnSKn, respectively), also suggesting parallel evolution.

A major duplication is detected within the Pinaceae

The Pinaceae phylogenetic tree revealed diversification patterns supported by high a posteriori probabilities, and this led to the hypothesis of a major duplication event leading to the accumulation of (N1) Kn (Group 2) sequences. The data indicated that the duplication may have occurred within the lineage giving rise to the genus Picea, after the split between Picea and Pinus (Figure 1). The most parsimonious interpretation of the tree is that this event then gave rise to the Kn group, which is specific to Picea, and that this group underwent several further duplications generating the several groups of dehydrins in white spruce. To test the above hypothesis and sample other transcriptomes from the Pinaceae, we carried out RNA sequencing in white spruce (P. glauca), black spruce (P. mariana), eastern white pine (P. strobus), jack pine (P. banksiana), balsam fir (A. balsamea) and tamarack larch (L. laricina) (Table 1).
We searched for (N1) Kn sequences and found Group 2 sequences in all of the species tested, and identified many more of the sequences in Picea spp. (12–18) than in Pinus spp. (3–6); we also identified several sequences in two other species belonging to other genera within the Pinaceae, A. balsamea (17 sequences) and L. laricina (16 sequences). The data indicate that the duplication is not specific to Picea spp. as was suggested by our phylogenetic tree.

Table 1. Identification of Kn dehydrin sequences (Group 1 and 2) in other members of the Pinaceae. Sequences were identified by using HMMER searches in de novo transcriptome assemblies obtained from RNA sequencing of multiple tissues from well-watered, water stressed trees and germinating seedlings (Blastp e-value <1e−10).

Species             Number of Kn dehydrins
                    Group 1    Group 2
Abies balsamea      0          17
Larix laricina      0          16
Picea glauca        0          18
Picea mariana       3          12
Pinus banksiana     7          3
Pinus strobus       5          6

Degenerate K-segments and structural variations in conifer dehydrins

Based on the amino acid motifs and in congruence with results from phylogenetic analyses, all 135 representative dehydrins from the Pinaceae and angiosperms were classified into four major amino acid structures: (1) N1 AnEnSKn with variations in the presence of A- and E-segments; (2) N1 Kn with some variations in the presence of the N1 segment; (3) N1 SKn; and (4) YnSKn. Almost all of the dehydrins contained one or multiple K-segments, with the exception of one sequence in maritime pine (P. pinaster) (model 10, Figure 3, Table S6 available as Supplementary Data at Tree Physiology Online) as previously reported (Perdiguero et al. 2014).

Figure 3. Conifer and angiosperm dehydrins classification based on their amino-acid motifs. (A) Sequences were grouped by similarity and classified by motif composition. (B) Each dehydrin type was represented showing the variation in number of motifs.

The K-segment was degenerate (P-value < 1.1e–5) in many conifer dehydrins such as PgDhn8 and PgDhn9, among others (see Table S6 available as Supplementary Data at Tree Physiology Online), and was more highly conserved in angiosperms (Figure 3 and Figure S1 available as Supplementary Data at Tree Physiology Online). In some cases, the K-segment was degenerate to the point of not being identified by motif sequence similarity; in these cases, we used the sequence alignment to identify these K-segments. In the group N1 AnEnSKn, we identified 12 structural variations, some of which have not been described before. For example, PgDhn8 and PgDhn9 (models 9 and 11, Figure 3) lack an E-segment. In the Kn group in spruces, we observed a high number (>6) of K-segments not previously reported in other conifers. We observed that sequence similarity and amino acid motif structure were not always congruent, i.e., some sequences grouped together in the phylogenetic tree although they differed in their amino acid motif classification (Figure 3). This could be explained by the considerable variation in amino acid sequence that exists within the motifs (see Figure S1 available as Supplementary Data at Tree Physiology Online). Conversely, dehydrins that were not grouped tightly on the phylogenetic tree may share the same motif structure. For instance, model 1 from maritime pine (N1 K2) was not grouped with other N1 K2 sequences such as PgDhn33 and PgDhn34 (model 25).
Similarly, angiosperm Kn dehydrins were not grouped with conifer Kn dehydrins (models 18 and 19), and maritime pine sequences from model 12, classified as SK, were not grouped with the angiosperm SKn sequences.

Dehydrin expression varies between different tissues and conditions

We designed gene-specific primer pairs for the white spruce dehydrins with complete ORFs to evaluate their RNA accumulation profile by RT-qPCR (see Table S4 available as Supplementary Data at Tree Physiology Online). Given the large number of sequences and high levels of similarity, the assay specificity was verified by preliminary tests in which amplicons were sequenced for validation. Next, we surveyed RNA transcript levels in three different tissues (phelloderm, xylem and foliage) from plants growing under non-limiting conditions. Reliable transcript detection was recorded for 13 of the white spruce dehydrins in at least one of the tissues (Figure 4). Amplifications that lacked specificity were eliminated from the analyses, and dehydrin sequences producing no detectable product were considered to be specific to other tissues or to other biological conditions.

Figure 4. Transcript accumulation profiles from F (foliage), X (xylem) and P (phelloderm) measured by qPCR. Significant differences between tissue expression levels are indicated on the right side, ANOVA, Tukey’s HSD (P < 0.05; ns indicates no significant difference between the expression level among the three tissues).
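The tissue comparisons annotated in Figure 4 rest on an analysis of variance run separately for each gene. As a minimal sketch of that idea only, the following Python code computes the one-way ANOVA F statistic for one gene across three tissues. The function name and the expression values are hypothetical and are not taken from the study; obtaining the P-value and running Tukey’s HSD would require a statistics package and is not shown.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a mapping {label: values}."""
    samples = [list(values) for values in groups.values()]
    all_values = [x for sample in samples for x in sample]
    grand_mean = mean(all_values)

    # Variation of the group means around the grand mean
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Variation of the observations around their own group mean
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)

    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical relative expression values for a single gene (illustration only)
expression = {
    "foliage": [1.0, 2.0, 3.0],
    "xylem": [2.0, 3.0, 4.0],
    "phelloderm": [5.0, 6.0, 7.0],
}
f_stat = one_way_anova_f(expression)
print(f"F = {f_stat:.2f}")
```

A large F value indicates that the differences between tissue means are large relative to the variation among replicates within a tissue, which is the situation marked as significant in the figure.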
The gene-by-gene analysis of variance with expression data as a function of the type of tissue identified nine genes with differential expression (see Table S7 available as Supplementary Data at Tree Physiology Online). The multiple comparison tests showed that five of the sequences were preferentially expressed in the phelloderm, four in the foliage and two in the secondary xylem, whereas the four remaining sequences did not vary between tissues (Figure 4).

Members of the dehydrin family respond differently to water stress

We conducted a greenhouse experiment with three white spruce genotypes comparing dehydrin transcript accumulation profiles in well-watered and non-watered plants at several time points. Starting from 14 days of treatment, statistically significant differences in water potential were detected between the well-watered and non-watered plants (Figure 5) and the response was similar among the three genotypes (see Table S8 available as Supplementary Data at Tree Physiology Online).

Figure 5. Midday water potential in needles of well-watered plants (dashed line) and unwatered plants (solid line) in three different genotypes (clones 8, 11 and 95). The water potential of water-stressed plants was compared with that of control plants for each sampling date in all three genotypes (ANOVA, Tukey’s HSD, ***P < 0.001).

We were able to reliably detect transcript abundance in the foliage for 10 of the dehydrins in well-watered and non-watered plants.
Gene-by-gene analysis of variance followed by multiple comparison tests showed that, under the same conditions, there was no statistical difference among the expression patterns of the three clones, except for the gene PgDhn10, for which a significant difference in expression level between clones 11 and 95 was detected after 7 days under water stress (see Table S9 available as Supplementary Data at Tree Physiology Online). Eight genes had statistically different expression levels between watered and non-watered plants for at least one sampling date. The genes PgDhn10, PgDhn16, PgDhn33 and PgDhn35 showed a remarkable increase in expression in water-stressed plants compared with well-watered plants. Their increased transcript levels were statistically significant starting at Day 14, which coincides with the changes in water potential. The genes PgDhn7, PgDhn9 and PgDhn12 showed a slight increase in expression in stressed plants compared with watered plants after 18–22 days of treatment. Only the gene PgDhn36 showed a decrease in expression, which was slight and was observed only after 22 days without watering (Figure 6).

Figure 6. Expression profile of dehydrin genes during 22 days of treatment. The gene expression of water-stressed plants (solid lines) was compared with that of control plants (dashed lines) for each sampling date in all three genotypes (clones 8, 11 and 95). ANOVA, Tukey test, *P < 0.05; **P < 0.01; ***P < 0.001.

We examined the transcript profiles by comparing the two major classes of dehydrins found in spruces, N1 AnEnSKn and N1 Kn.
Among the six genes that had increased transcript levels in response to water stress, four were of the N1 AnEnSKn type (N1 AESK2, N1 A2E2SK4, N1 ASK3, N1 ASK2) and two were of the N1 Kn type (N1 K2, K4). The genes with slightly decreasing transcript levels and those with no response to water stress were classified as K1 (two sequences) and K6. Taken together, these classifications indicate that diverse dehydrin sequences are water-stress responsive and that neither of the two major classes appears to have a clearly characteristic profile; however, the genes assayed represent only 27% of the white spruce dehydrin sequences identified.

Discussion

We identified and classified a large number of dehydrins and the results indicate that they form a large gene family in the Pinaceae and are particularly abundant in spruces. The Pinaceae dehydrins appeared structurally diverse and notable differences were observed when compared with angiosperm dehydrins, including poorly conserved K-segments and conifer-specific segments. We identified nine dehydrins with differential tissue expression under normal conditions and eight dehydrins that respond to drought stress. The most strongly induced dehydrins were classified as Kn type. Below, we discuss these results against a backdrop of speciation and adaptation to abiotic factors.

Structural diversity in dehydrin protein sequences

We identified a large number of dehydrins in Picea but did not find any new conserved amino acid motifs. All of the white spruce dehydrins presented at least one K-segment and many of them contained A- and E-segments together, as previously found in maritime pine (Perdiguero et al. 2012). Here, we showed that the A-segment may be less conserved than previously reported (Perdiguero et al. 2012). In contrast to angiosperms, no Y-segments were identified in the Pinaceae (Campbell and Close 1997, Zolotarov and Stromvik 2015).
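The structure labels used in this section, such as N1 AESK2 or K4, summarize which motifs a protein carries and how many copies of each. As an illustration only, the following Python sketch builds such a label from a list of motif hits. The function name, the input format and the numbering rule (the K-segment always carries its count, other segments only when repeated) are assumptions inferred from the labels quoted in the text, not the procedure used in the study; motif detection itself is not shown.

```python
from collections import Counter

# Order in which segments appear in the labels used in the text (assumed)
MOTIF_ORDER = ["Y", "A", "E", "S", "K"]


def structure_label(motifs):
    """Build a label such as 'N1 AESK2' from a list of motif names.

    `motifs` is a hypothetical input: one entry per motif occurrence,
    for example ["N1", "A", "E", "S", "K", "K"].
    """
    counts = Counter(motifs)
    unknown = set(counts) - set(MOTIF_ORDER) - {"N1"}
    if unknown:
        raise ValueError(f"unknown motif(s): {sorted(unknown)}")

    parts = []
    for motif in MOTIF_ORDER:
        n = counts.get(motif, 0)
        if n == 0:
            continue
        # Assumed rule: K always shows its count (K1, K6); others only if > 1
        if motif == "K" or n > 1:
            parts.append(f"{motif}{n}")
        else:
            parts.append(motif)

    prefix = "N1" if counts.get("N1", 0) else ""
    core = "".join(parts)
    return " ".join(p for p in (prefix, core) if p)


print(structure_label(["N1", "A", "E", "S", "K", "K"]))
print(structure_label(["K", "K", "K", "K"]))
```

Because the label records only motif counts, two proteins can share a label while differing considerably in sequence.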
The previously reported conserved N-terminal sequence (N1) had not been considered as a bona fide protein motif (Perdiguero et al. 2012) but we have included it in our structural classification of protein sequence. We classified all of the dehydrins represented in the phylogenetic trees based on the amino acid motifs (see Figure S1 available as Supplementary Data at Tree Physiology Online) (Figure 3). The angiosperm and Pinaceae sequences were classified among 37 different models showing a wide diversity of dehydrin protein structures varying in the composition and number of motifs (Figure 3). White spruce dehydrins were classified mainly as AnEnSKn and (N1) Kn types. We observed a variation in the number of A-, E- and K-segments, as was also found in other Pinaceae (Perdiguero et al. 2012). We report that Pinaceae dehydrin genes may contain more than six K-segments, including PgDhn21, PgDhn22 and PgDhn23 (see Table S6 available as Supplementary Data at Tree Physiology Online), and that dehydrins may be classified as AnSK, which harbor the A-segment but lack the E-segment (PgDhn8 and PgDhn9). The structural diversity is present not only between the different structure types but also within each type. Some of the Pinaceae and angiosperm dehydrins were classified as Kn type but their amino acid sequences were highly divergent. We also observed that the amino-acid sequence of the K-segment was variable among the Pinaceae dehydrins, in contrast to angiosperm dehydrins, which contain more conserved K-segment sequences (see Figure S1 available as Supplementary Data at Tree Physiology Online). In vitro assays suggested that the K-segments play a key role in the protective function of dehydrins in preventing deleterious changes in protein secondary and tertiary structure (Reyes et al. 2008). 
It remains to be elucidated whether the reduced conservation of K-segments that is observed in the Pinaceae has functional implications and whether the protective function is maintained. The S-segment also appears to be important in dehydrin function. It may provide putative phosphorylation sites (Jensen et al. 1998) and be involved in post-translational protein modifications impacting tolerance to drought and salt (Brini et al. 2007). It is also possible that the A- and E-segments found in the Pinaceae and the N1-terminal motifs play important roles in protein conformation and function, but this remains to be tested. The modular organization of dehydrins, the large variation seen in the number and position of the different motifs, and their patterns of expression not connected tightly to structural differences and phylogenetic grouping (see below) are consistent with subfunctionalization following duplication events, as also reported for conifer transcription factors (Guillet-Claude et al. 2004), although further analyses comparing white spruce dehydrins with their paralogs in other species will be needed to test this hypothesis. Dehydrin genes were also found to diverge very rapidly in P. glauca at the nucleotide sequence level, the family showing among the highest ratios of nonsynonymous to synonymous substitutions (A/S) among more than 2000 gene families analyzed (Pavy et al. 2013). These highly positive ratios could be linked to the structural diversity observed in white spruce dehydrins. Gene families with a high A/S ratio were also those with the highest heterogeneity of gene expression across white spruce tissues (Pavy et al. 2013), supporting the notion of rapid subfunctionalization with obvious implications for adaptive potential.

Dehydrin gene family evolution and expansion in the Pinaceae

We first identified 53 dehydrins in the white spruce gene catalog, of which 41 had a complete ORF. Further sequence discovery in RNA-Seq datasets (Verta et al.
2016) and clustering based on sequence similarity suggested that spruces contained up to 56 distinct dehydrin genes (see Table S2 available as Supplementary Data at Tree Physiology Online). This relatively large number of dehydrins exceeds that observed in other plant species. For example, 10 genes were identified in each of mouse-ear cress and poplar (Hundertmark and Hincha 2008, Liu et al. 2012), 12 were found in apple (Liang et al. 2012), eight in rice (Wang et al. 2007), six in each of peach and maize (Finn et al. 2014, Bassett et al. 2015, Pfam 30.0), four in grapevine (Yang et al. 2012) and only two in the primitive A. trichopoda (Pfam 30.0, Finn et al. 2014). Previous studies in other conifers have focused on the Pinaceae, such as P. pinaster, Picea obovata and P. abies, and described up to 10 dehydrins per species (Joosen et al. 2006, Yakovlev et al. 2008, Perdiguero et al. 2012, Kjellsen et al. 2013), but did not perform exhaustive searches of expressed dehydrins. We carried out an exhaustive search for dehydrin homologs in conifers focusing on the Pinaceae and found more dehydrins in both P. glauca and Picea sitchensis (see Table S2 available as Supplementary Data at Tree Physiology Online) than in pine species, suggesting that the Picea genus may have more dehydrins than angiosperms and Pinus. On the other hand, this may be the consequence of sampling effects since full-length cDNA has been more extensively explored in P. glauca and P. sitchensis (Ralph et al. 2008, Rigault et al. 2011) than in Pinus spp. We performed separate and combined phylogenetic analyses of angiosperm and conifer dehydrins (Figures 1 and 2). The dehydrins were distributed into four main groups paralleling structural differences: two conifer groups, with N1 Kn and AnEnSKn amino acid structures (Groups 1 and 2), and two angiosperm groups, with YnSKn and N1 SKn amino acid structures (Groups 3 and 4).
The combined angiosperm and conifer phylogenetic tree suggests an interesting evolutionary history for this gene family. The simplest hypothesis is that the most ancestral gene had a structure most similar to the N1 SKn and N1 AnEnSKn sister group types, which is reflected by their highest taxonomical representation including conifer, dicot and monocot sequences, and that sequences diverged through gene duplications as well as loss and acquisition of amino acid motifs in a parallel fashion, which occurred largely after the split of angiosperms and gymnosperms, around 300 Mya (Savard et al. 1994). This most parsimonious interpretation assumes there were no major gene losses in the taxa analyzed. In line with this interpretation, a major duplication would have occurred very early in the angiosperm lineage, before the split between monocots and dicots (140–150 Mya) (Chaw et al. 2004), giving rise to the YnSKn structure type, which is present along with the N1 SKn type in all angiosperms tested, including monocots and dicots. This new gene duplicate would have lost the N1 motif and acquired a Yn motif. Within the Pinaceae, a duplication that occurred before the split of Picea and Pinus around 120–140 Mya (Savard et al. 1994) gave rise to Group 2, characterized by the Kn and N1 Kn amino acid structures (Figure 2). The topology was not well resolved at the root of the phylogenetic tree for this group, but it suggests that more than one duplication may have occurred where the N1 motif was likely lost. Our phylogenetic analysis and the RNA-sequencing indicate that the Group 2 sequences are much more numerous in Picea spp., Larix and Abies than in Pinus spp., suggesting that differential gene duplications have occurred after the formation of the different genera in the Pinaceae.
The presence of the Kn motif in two Pinus pinaster sequences located within the Pinaceae Group 1 suggests parallel evolution, which may indicate that under certain environmental pressures, specific amino acid sequence motifs could re-emerge sporadically. In addition to this major gene duplication and the sporadic resurgence of amino acid sequence motifs, several other duplications have been observed at various stages, most frequently in the Pinaceae. These duplications have clearly impacted the size of the dehydrin gene family, especially in Picea. Taken together, these results suggest that the higher diversification rate of dehydrin genes seen in the Pinaceae, compared with angiosperms, might be related to long-term genetic adaptation to a spatially and temporally more heterogeneous environment throughout the evolution and diversification of the lineage. Members of the Pinaceae are long-lived and several of them may colonize extreme habitats, as seen for boreal conifers such as white spruce; we may speculate that larger families of key genes related to adaptation could confer more plasticity and survival ability through subfunctionalization. Duplicated genes may result either from whole-genome duplication (WGD) or from more localized segmental or single-gene duplications (Blanc and Wolfe 2004, Cannon et al. 2004). Many WGD events have been detected in angiosperms. For example, Arabidopsis has experienced at least three WGD events, including one that was shared by all eudicots (Bowers et al. 2003). Angiosperms have also experienced lineage-specific WGD events, some of which were reported in forest trees, for instance in Eucalyptus (Myburg et al. 2014) and in the Salicaceae (Tuskan et al. 2006). The expansion of gene families such as dehydrins in angiosperms is likely to be facilitated by both tandem duplication and WGD events (Wang et al. 2007, Hundertmark and Hincha 2008, Liu et al. 2012). The Pinaceae were recently reported to have experienced two WGD events (Li et al.
2015). One ancient event would have been shared with the Cupressaceae and the Taxaceae (which could not be sampled in the present study) and all of the seed plants including angiosperms, and the other WGD would have occurred in the common ancestor of the Pinaceae only. Although the topology of the dehydrin phylogenetic tree lacks resolution near the origin (Figure 2), no clear evidence was seen that could support either of these ancient WGD events. The angiosperm sequences were split into two large groups, which is likely the consequence of a major duplication event at least preceding the monocot–dicot divergence and likely involving WGD. However, the lack of intervening Pinaceae sequences in any of these groups would indicate that the duplication event occurred after the angiosperm–gymnosperm split, or alternatively, that the ancient duplicated copy had been lost in the lineage leading to conifers and the Pinaceae if this event had occurred before the angiosperm–gymnosperm split, as previously reported. Similarly, the Pinaceae sequences were split into two major groups (Figure 2) without any intervening angiosperm sequences. This pattern may support a WGD common to all of the Pinaceae because both Group 1 and Group 2 sequences are represented in the different genera that we analyzed. The tree topology and RNA sequencing results further suggest that more recent duplications were more frequent in the spruce common ancestor after the pine–spruce lineage split.

Expression of dehydrin genes in developmental and stress responses

Dehydrins are multifunctional proteins that accumulate during seed formation and are present in vegetative tissues under normal conditions (Campbell and Close 1997, Bies-Ethève et al. 2008). They have been linked to protective functions (Reyes et al. 2008, Brini et al. 2010), chaperone activity (Kovacs et al. 2008), water-binding capacity (Rinne et al. 1999) and to an antioxidant role (Hara et al. 2004).
The expression of many dehydrin genes changes in response to abiotic stress conditions such as drought, salt and cold (Close 1996), as well as to biotic stress such as wounding and infection (Richard et al. 2000, Hundertmark and Hincha 2008, Yang et al. 2012). Expression has also linked some dehydrins to growth processes such as spring bud burst in conifers (Yakovlev et al. 2008). We identified nine white spruce dehydrins with differential expression when comparing three different tissues (foliage, secondary xylem and phelloderm) under normal growth conditions. However, these differences were not tightly linked to structural types or phylogenetic groups. Dehydrins in angiosperms also showed differential expression between tissue types. Arabidopsis dehydrins AtLEA2-5, AtLEA2-6 and AtLEA2-7 were expressed in seeds while AtLEA2-1, AtLEA2-2 and AtLEA2-4 were strongly expressed in vegetative tissues (Bies-Ethève et al. 2008). In apple, five dehydrins were expressed in flowers, seeds, leaves, fruit and roots, and another four in a subset of these tissues (Liang et al. 2012). These observations indicate that different dehydrins may have acquired a degree of tissue specificity and suggest that some dehydrins are important to plant development. Many studies showed that angiosperm dehydrins were induced by drought, cold stress or both; many of them were classified as YnSKn and grouped together in our phylogenetic analysis (Figure 2). They include between one and five dehydrin genes in rice (Wang et al. 2007), grape (Yang et al. 2012), Arabidopsis (one Kn sequence) (Hundertmark and Hincha 2008), peach (Bassett et al. 2015), apple (Liang et al. 2012) and Eucalyptus (Fernández et al. 2012). However, some of the dehydrins in this group, like PpDhn5 in peach and AtLEA2-7 in Arabidopsis, were induced neither by cold nor by drought stress (Hundertmark and Hincha 2008, Bassett et al. 2015).
The other group of angiosperm dehydrins in the phylogenetic tree has the typical N1 SKn structure and includes sequences that are cold responsive (AtLEA2-1 and -2; EuglDhn2) and sequences that had either no detectable transcripts or very low expression under drought and cold stress (Hundertmark and Hincha 2008, Liang et al. 2012, Yang et al. 2012, Bassett et al. 2015). Our analysis identified eight dehydrins that responded to water stress in white spruce. Four of them increased their transcript levels several fold after several days without watering. They were classified as N1 K2, K4 and N1 AESK2 (PgDhn10, 16, 33 and 35), indicating that the two main conifer groups in the phylogenetic tree comprise dehydrin sequences that are drought-stress responsive. In Norway spruce and maritime pine, N1 K2 dehydrins had among the highest transcript levels after a period of water stress (Eldhuset et al. 2012, Perdiguero et al. 2012). In maritime pine, an N1 AESK2 dehydrin presented a very similar transcript accumulation pattern (Perdiguero et al. 2012). An interesting observation is that both N1 K2 dehydrins from maritime pine were grouped with the N1 AnEnSKn cluster (light blue) in the phylogenetic tree, while N1 K2 dehydrins from white spruce fell in the (N1) Kn group (purple). The maritime pine genes harbored sequences that are more similar to N1 AnEnSKn dehydrins but their structure is closer to that of the N1 K2 cluster, suggesting parallel evolution in which selective forces may have shaped these proteins to carry out the same function. Similar observations of parallel evolution have been made from the identification of different adaptive genes to climatic factors among different Pinaceae taxa, but pertaining to the same large gene families (Prunier et al. 2011). The spruce dehydrins PgDhn 7, 9, 12, 17, 36 and 37, including both N1 AnEnSKn and (N1) Kn types, were less responsive to drought conditions, as reported for N1 AESK in maritime pine (Perdiguero et al.
2012) (Figure 6). Considering the diverse roles attributed to dehydrins, they may be more responsive to other types of stress, as observed in Siberian spruce where Dhn 2 (N1 AESK3) and Dhn Cap1.1 (K6) were induced by cold conditions (Kjellsen et al. 2013). These observations, and findings that some dehydrins are more strongly expressed in other organs including roots or stem in response to water stress (Lorenz et al. 2011, Eldhuset et al. 2012, Perdiguero et al. 2012), indicate that our analysis likely reveals only a partial picture of their whole range of expression in conifers.

Conclusions

Our results indicate that the dehydrin gene family is larger in the Pinaceae than in angiosperms, and suggest that a major duplication contributed to a lineage-specific expansion. The present results also suggest that subfunctionalization could be a main driver for the increased diversity of dehydrins in conifers, with diversification implicating loss and gain of structural motifs. The dehydrin gene family has been well studied in angiosperms and has been linked to a variety of cellular processes. The diversity of dehydrin sequences, together with their tissue-preferential and drought-responsive expression, suggests that they are involved in a variety of physiological processes in spruce. Further experiments including additional assessments of stress responsiveness will likely be needed to shed more light on the potential processes in which they are involved. The N1 K2 and N1 AESK2 dehydrins were very responsive to water stress in the Pinaceae. Studies involving diverse genotypes and genetic experiments could reveal the potential of these genes as molecular markers for tolerance to drought. In the next decades, the boreal biome is expected to experience the largest increase in temperatures of all forest biomes and drought-induced mortality is predicted to increase (Gauthier et al. 2015).
An improved understanding of the molecular response of conifers to drought will be highly useful to design diagnostic tools to help map and conserve the natural genetic diversity that is relevant for adaptation to drought stress in order to maintain a healthy boreal forest.

Supplementary Data

Supplementary Data for this article are available at Tree Physiology Online.

Acknowledgments

The authors thank François Larochelle and Marie-Andrée Paré (both of Université Laval) for assistance with plant materials and the drought stress experiment, Marie R. Coyea (Université Laval) for advice and assistance for the water potential measurements, Stéphane Daigle from Centre for Forest Research for advice on statistical testing, and Elie Raherison, Mebarek Lamara, Benjamin Dufils and Sébastien Caron for help with plant sampling.

Conflict of interest

The authors declare that they have no competing interests.

Funding

Funding was received from Génome Québec and Genome Canada for the SmarTForests project (J.B. and J.M.), and from NSERC of Canada for a discovery grant (J.M.).

Authors’ contributions

J.S.S. planned and executed the drought experiment, lab manipulations, the data analysis and interpretation of results, and drafted the manuscript. I.G. provided technical assistance for RNA analysis, RT-qPCR and DNA cloning. P.R. assembled the Pinaceae gene sequences using RNA-Seq data and previously developed gene catalogues. J.B. and J.M. contributed to the supervision, discussion of the research and revised the manuscript. All of the authors approved the manuscript.

References

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ (1990) Basic local alignment search tool. J Mol Biol 215: 403–410.
Bailey TL, Johnson J, Grant CE, Noble WS (2015) The MEME Suite. Nucleic Acids Res 43: W39–W49.
Bassett CL, Fisher KM, Farrel RE Jr (2015) The complete peach dehydrin family: characterization of three recently recognized genes.
Tree Genet Genomes 11: 1–14.
Beaulieu J, Giguère I, Deslauriers M, Boyle B, MacKay J (2013) Differential gene expression patterns in white spruce newly formed tissue on board the International Space Station. Adv Space Res 52: 760–772.
Bies-Ethève N, Gaubier-Comella P, Debures A, Lasserre E, Jobet E, Raynal M, Cooke R, Delseny M (2008) Inventory, evolution and expression profiling diversity of the LEA (late embryogenesis abundant) protein gene family in Arabidopsis thaliana. Plant Mol Biol 67: 107–124.
Blanc G, Wolfe KH (2004) Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell 16: 1667–1678.
Bowers JE, Chapman BA, Rong J, Paterson AH (2003) Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422: 433–438.
Boyle B, Dallaire N, MacKay J (2009) Evaluation of the impact of single nucleotide polymorphisms and primer mismatches on quantitative PCR. BMC Biotechnol 9: 75.
Brini F, Hanin M, Lumbreras V, Irar S, Pagès M, Masmoudi K (2007) Functional characterization of DHN-5, a dehydrin showing a differential phosphorylation pattern in two Tunisian durum wheat (Triticum durum Desf.) varieties with marked differences in salt and drought tolerance. Plant Sci 172: 20–28.
Brini F, Saibi W, Amara I, Gargouri A, Masmoudi K, Hanin M (2010) Wheat dehydrin DHN-5 exerts a heat-protective effect on β-glucosidase and glucose oxidase activities. Biosci Biotechnol Biochem 74: 1050–1054.
Campbell SA, Close TJ (1997) Dehydrins: genes, proteins, and associations with phenotypic traits. New Phytol 137: 61–74.
Cannon SB, Mitra A, Baumgarten A, Young ND, May G (2004) The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol 4: 10.
Chang S, Puryear J, Cairney J (1993) A simple and efficient method for isolating RNA from pine trees. Plant Mol Biol Rep 11: 113–116.
Chaw S-M, Chang C-C, Chen H-L, Li W-H (2004) Dating the monocot–dicot divergence and the origin of core eudicots using whole chloroplast genomes. J Mol Evol 58: 424–441.
Close TJ (1996) Dehydrins: emergence of a biochemical role of a family of plant dehydration proteins. Physiol Plant 97: 795–803.
Eldhuset TD, Nagy NE, Volařík D, Børja I, Gebauer R, Yakovlev IA, Krokene P (2012) Drought affects tracheid structure, dehydrin expression, and above- and belowground growth in 5-year-old Norway spruce. Plant Soil 366: 305–320.
Farooq M, Hussain M, Wahid A, Siddique KHM (2012) Drought stress in plants: an overview. In: Aroca R (ed) Plant responses to drought stress. Springer, Berlin Heidelberg, pp 1–33.
Fernández M, Valenzuela S, Barraza H, Latorre J, Neira V (2012) Photoperiod, temperature and water deficit differentially regulate the expression of four dehydrin genes from Eucalyptus globulus. Trees 26: 1483–1493.
Finn RD, Bateman A, Clements J et al. (2014) Pfam: the protein families database. Nucleic Acids Res 42: D222–D230.
Gauthier S, Bernier P, Kuuluvainen T, Shvidenko AZ, Schepaschenko DG (2015) Boreal forest health and global change. Science 349: 819–822.
Guillet-Claude C, Isabel N, Pelgas B, Bousquet J (2004) The evolutionary implications of knox-I gene duplications in conifers: correlated evidence from phylogeny, gene mapping, and analysis of functional divergence. Mol Biol Evol 21: 2232–2245.
Hanin M, Brini F, Ebel C, Toda Y, Takeda S, Masmoudi K (2011) Plant dehydrins and stress tolerance. Plant Signal Behav 6: 1503–1509.
Hara M, Fujinaga M, Kuboi T (2004) Radical scavenging activity and oxidative modification of citrus dehydrin. Plant Physiol Biochem 42: 657–662.
Hundertmark M, Hincha DK (2008) LEA (late embryogenesis abundant) proteins and their encoding genes in Arabidopsis thaliana. BMC Genomics 9: 118.
Jensen AB, Goday A, Figueras M, Jessop AC, Pagès M (1998) Phosphorylation mediates the nuclear targeting of the maize Rab17 protein. Plant J 13: 691–697.
Johnson LS, Eddy SR, Portugaly E (2010) Hidden Markov model speed heuristic and iterative HMM search procedure. BMC Bioinformatics 11: 431.
Joosen RVL, Lammers M, Balk PA, Brønnum P, Konings MCJM, Perks M, Stattin E, van Wordragen MF, van der Geest AHM (2006) Correlating gene expression to physiological parameters and environmental conditions during cold acclimation of Pinus sylvestris, identification of molecular markers using cDNA microarrays. Tree Physiol 26: 1297–1313.
Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol 30: 772–780.
Kearse M, Moir R, Wilson A et al.
( 2012) Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28: 1647– 1649. Google Scholar CrossRef Search ADS PubMed Kjellsen TD, Yakovlev IA, Fossdal CG, Strimbeck GR ( 2013) Dehydrin accumulation and extreme low-temperature tolerance in Siberian spruce (Picea obovata). Tree Physiol 33: 1354– 1366. Google Scholar CrossRef Search ADS PubMed Kovacs D, Kalmar E, Torok Z, Tompa P ( 2008) Chaperone activity of ERD10 and ERD14, two disordered stress-related plant proteins. Plant Physiol 147: 381– 390. Google Scholar CrossRef Search ADS PubMed Li W, Godzik A ( 2006) Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22: 1658– 1659. Google Scholar CrossRef Search ADS PubMed Li Z, Baniaga AE, Sessa EB, Scascitelli M, Graham SW, Rieseberg LH, Barker MS ( 2015) Early genome duplications in conifers and other seed plants. Sci Adv 1: e1501084. Google Scholar CrossRef Search ADS PubMed Liang D, Xia H, Wu S, Ma F ( 2012) Genome-wide identification and expression profiling of dehydrin gene family in Malus domestica. Mol Biol Rep 39: 10759– 10768. Google Scholar CrossRef Search ADS PubMed Liu C-C, Li C-M, Liu B-G, Ge S-J, Dong X-M, Li W, Zhu H-Y, Wang B-C, Yang C-P ( 2012) Genome-wide identification and characterization of a dehydrin gene family in poplar (Populus trichocarpa). Plant Mol Biol Rep 30: 848– 859. Google Scholar CrossRef Search ADS Lorenz WW, Alba R, Yu Y-S, Bordeaux JM, Simões M, Dean JF ( 2011) Microarray analysis and scale-free gene networks identify candidate regulators in drought-stressed roots of loblolly pine (P. taeda L.). BMC Genomics 12: 264. Google Scholar CrossRef Search ADS PubMed Micco VD, Aronne G ( 2012) Morpho-anatomical traits for plant adaptation to drought. In: Aroca R (ed) Plant responses to drought stress . Springer, Berlin Heidelberg, pp 37– 61. 
Google Scholar CrossRef Search ADS Myburg AA, Grattapaglia D, Tuskan GA et al. . ( 2014) The genome of Eucalyptus grandis. Nature 510: 356– 362. Google Scholar CrossRef Search ADS PubMed Pavy N, Boyle B, Nelson C et al. . ( 2008) Identification of conserved core xylem gene sets: conifer cDNA microarray development, transcript profiling and computational analyses. New Phytol 180: 766– 786. Google Scholar CrossRef Search ADS PubMed Pavy N, Deschênes A, Blais S, Lavigne P, Beaulieu J, Isabel N, Mackay J, Bousquet J ( 2013) The landscape of nucleotide polymorphism among 13,500 genes of the conifer Picea glauca, relationships with functions, and comparison with Medicago truncatula. Genome Biol Evol 5: 1910– 1925. Google Scholar CrossRef Search ADS PubMed Perdiguero P, Barbero MC, Cervera MT, Soto Á, Collada C ( 2012) Novel conserved segments are associated with differential expression patterns for Pinaceae dehydrins. Planta 236: 1863– 1874. Google Scholar CrossRef Search ADS PubMed Perdiguero P, Collada C, Soto Á ( 2014) Novel dehydrins lacking complete K-segments in Pinaceae. The exception rather than the rule. Front Plant Sci 5: 682. Google Scholar CrossRef Search ADS PubMed Prunier J, Laroche J, Beaulieu J, Bousquet J ( 2011) Scanning the genome for gene SNPs related to climate adaptation and estimating selection at the molecular level in boreal black spruce. Mol Ecol 20: 1702– 1716. Google Scholar CrossRef Search ADS PubMed Raherison E, Rigault P, Caron S et al. . ( 2012) Transcriptome profiling in conifers and the PiceaGenExpress database show patterns of diversification within gene families and interspecific conservation in vascular gene expression. BMC Genomics 13: 434. Google Scholar CrossRef Search ADS PubMed Raherison ESM, Giguère I, Caron S, Lamara M, MacKay JJ ( 2015) Modular organization of the white spruce (Picea glauca) transcriptome reveals functional organization and evolutionary signatures. Phytol 207: 172– 187. 
Google Scholar CrossRef Search ADS Ralph SG, Chun HJE, Kolosova N et al. . ( 2008) A conifer genomics resource of 200,000 spruce (Picea spp.) ESTs and 6,464 high-quality, sequence-finished full-length cDNAs for Sitka spruce (Picea sitchensis). BMC Genomics 9: 484. Google Scholar CrossRef Search ADS PubMed Reyes JL, Campos F, Wei H, Arora R, Yang Y, Karlson DT, Covarrubias AA ( 2008) Functional dissection of hydrophilins during in vitro freeze protection. Plant Cell Environ 31: 1781– 1790. Google Scholar CrossRef Search ADS PubMed Richard S, Morency M-J, Drevet C, Jouanin L, Séguin A ( 2000) Isolation and characterization of a dehydrin gene from white spruce induced upon wounding, drought and cold stresses. Plant Mol Biol 43: 1– 10. Google Scholar CrossRef Search ADS PubMed Rigault P, Boyle B, Lepage P, Cooke JEK, Bousquet J, MacKay JJ ( 2011) A white spruce gene catalog for conifer genome analyses. Plant Physiol 157: 14– 28. Google Scholar CrossRef Search ADS PubMed Rinne PLH, Kaikuranta PLM, van der Plas LHW, van der Schoot C ( 1999) Dehydrins in cold-acclimated apices of birch (Betula pubescens Ehrh.): production, localization and potential role in rescuing enzyme function during dehydration. Planta 209: 377– 388. Google Scholar CrossRef Search ADS PubMed Ronquist F, Teslenko M, van der Mark P et al. . ( 2012) MrBayes 3.2: efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol 61: 539– 542. Google Scholar CrossRef Search ADS PubMed Rutledge RG, Stewart D ( 2008) A kinetic-based sigmoidal model for the polymerase chain reaction and its application to high-capacity absolute quantitative real-time PCR. BMC Biotechnol 8: 47. Google Scholar CrossRef Search ADS PubMed Savard L, Li P, Strauss SH, Chase MW, Michaud M, Bousquet J ( 1994) Chloroplast and nuclear gene sequences indicate late Pennsylvanian time for the last common ancestor of extant seed plants. Proc Natl Acad Sci USA 91: 5163– 5167. 
Google Scholar CrossRef Search ADS PubMed Tamura K, Stecher G, Peterson D, Filipski A, Kumar S ( 2013) MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol 30: 2725– 2729. Google Scholar CrossRef Search ADS PubMed The UniProt Consortium ( 2015) UniProt: a hub for protein information. Nucleic Acids Res 43: D204– D212. CrossRef Search ADS PubMed Tunnacliffe A, Wise MJ ( 2007) The continuing conundrum of the LEA proteins. Naturwissenschaften 94: 791– 812. Google Scholar CrossRef Search ADS PubMed Tuskan GA, DiFazio S, Jansson S et al. . ( 2006) The genome of Black Cottonwood, Populus trichocarpa (Torr. & Gray). Science 313: 1596– 1604. Google Scholar CrossRef Search ADS PubMed Verta J-P, Landry CR, MacKay J ( 2016) Dissection of expression-quantitative trait locus and allele specificity using a haploid/diploid plant system – insights into compensatory evolution of transcriptional regulation within populations. Phytol 211: 159– 171. Google Scholar CrossRef Search ADS Wang X-S, Zhu H-B, Jin G-L, Liu H-L, Wu W-R, Zhu J ( 2007) Genome-scale identification and analysis of LEA genes in rice (Oryza sativa L.). Plant Science 172: 414– 420. Google Scholar CrossRef Search ADS Whelan S, Goldman N ( 2001) A general empirical model of protein evolution derived from multiple protein families using a maximum-likelihood approach. Mol Biol Evol 18: 691– 699. Google Scholar CrossRef Search ADS PubMed Wickham H ( 2009) ggplot2 . Springer New York, New York, NY. Google Scholar CrossRef Search ADS Yakovlev IA, Asante DKA, Fossdal CG, Partanen J, Junttila O, Johnsen O ( 2008) Dehydrins expression related to timing of bud burst in Norway spruce. Planta 228: 459– 472. Google Scholar CrossRef Search ADS PubMed Yang Y, He M, Zhu Z, Li S, Xu Y, Zhang C, Singer SD, Wang Y ( 2012) Identification of the dehydrin gene family from grapevine species and analysis of their responsiveness to various forms of abiotic and biotic stress. BMC Plant Biol 12: 140. 
Google Scholar CrossRef Search ADS PubMed Zolotarov Y, Strömvik M ( 2015) De novo regulatory motif discovery identifies significant motifs in promoters of five classes of plant dehydrin genes. PLos ONE 10: e0129016. Google Scholar CrossRef Search ADS PubMed Author notes handling Editor Tsai Chung-Jui © The Author 2017. Published by Oxford University Press. All rights reserved. For Permissions, please email: firstname.lastname@example.org
Tree Physiology – Oxford University Press
Published: Mar 1, 2018