TY - JOUR AU1 - Robinson, Andrew R. AU2 - Gheneim, Rana AU3 - Kozak, Robert A. AU4 - Ellis, Dave D. AU5 - Mansfield, Shawn D. AB - Abstract Differences between wild-type Populus tremula×alba and two transgenic lines with modified lignin monomer composition were interrogated using metabolic profiling. Analysis of metabolite abundance data by GC-MS, coupled with principal components analysis (PCA), successfully differentiated between lines that had distinct phenotypes, whether samples were taken from the cambial zone or non-lignifying suspension tissue cultures. Interestingly, the GC-MS analysis detected relatively few phenolic metabolites in cambial extracts, although a single metabolite associated with the differentiation between lines was directly related to the phenylpropanoid pathway or other down-stream aspects of lignin biosynthesis. In fact, carbohydrates, which have only an indirect relationship with the modified lignin monomer composition, featured strongly in the line-differentiating aspects of the statistical analysis. Traditional HPLC analysis was employed to verify the GC-MS data. These findings demonstrate that metabolic traits can be dissected reliably and accurately by metabolomic analyses, enabling the discrimination of individual genotypes of the same tree species that exhibit marked differences in industrially relevant wood traits. Furthermore, this validates the potential of using metabolite profiling techniques for marker generation in the context of plant/tree breeding for industrial applications. Ferulate 5-hydroxylase, lignin, metabolite profiling, metabolomics, Populus, suspension cultures Introduction Improvements in plant breeding for the forest industry are reliant on the development of new tools that allow the early selection of trees based on inherent wood quality traits, in addition to more classical attributes such as growth rate and overall biomass yield (volume) (Campbell et al., 2003).
The demand for this approach to breeding has arisen because, in many cases, the suitability of wood for specific end-uses is heavily influenced by the inherent physical and chemical attributes that it exhibits. This affects the value of wood in the market place, as well as the efficiency and economic viability of secondary processes that use wood as a feedstock. The aromatic biopolymer, lignin, is a principal structural component in woody tissue, and contributes significantly to vascular integrity and wood strength (Donaldson, 2001). Lignin is formed as one of the major products of the phenylpropanoid pathway, and the mechanisms of its biosynthesis have been the focus of intense research (Dixon et al., 2001; Humphreys and Chapple, 2002; Li et al., 2003). Particular attention has been directed towards the identification of relevant biosynthetic enzymes and corresponding genetic material, as well as understanding the regulation of gene expression (transcription, translation, and enzyme–substrate interactions), and its role in developmental and tissue-specific biosynthesis (Anterola and Lewis, 2002; Rogers and Campbell, 2004). In terms of industry, the abundance and variable nature of lignin influences wood durability, the suitability of wood for manufacturing, and has implications for the use of wood as a feedstock for the production of secondary products such as high-grade paper (Huntley et al., 2003). In the course of secondary xylem biosynthesis, resources are passed through biochemical pathways in order to generate monomeric units, which are subsequently assembled into the constituent polymers (e.g. lignin, cellulose, and hemicellulose). This process involves spatially and temporally controlled enzymatic activity that causes flux through multi-reaction pathways; a component of this may be the pooling of some of the chemical intermediates produced. 
The nature and inherent variability of the constituents of wood manifests phenotypes that, in some way, must be related to the biosynthetic material from which these polymers are constructed and their assembly process. The specificities of both flux and pooling of biosynthetic materials are presumably representative of the biosynthetic pathway to which they contribute. Given this, patterns in the relative abundance of small molecules (metabolites) participating in cellular metabolism could be effective indicators of phenotypes related to wood quality traits. ‘Metabolomics’ or ‘metabolic profiling’, the measurement and comparison of metabolic traits, is increasingly being employed as a powerful approach to characterize living organisms (Fiehn, 2001), and may also prove useful in the selection of trees in the context of tree improvement programmes. With the advent of routine high-throughput bench-top chromatography-mass spectrometry, the ability to resolve and identify the metabolites in crude tissue extracts has improved dramatically. The utility of these techniques has been effectively demonstrated in the context of metabolite profiling for plant biology (Fiehn et al., 2000b; Roessner et al., 2001b; Frenzel et al., 2002; Tolstikov and Fiehn, 2002; Fiehn, 2003). Metabolic profiling, however, has yet to be developed and applied widely in plant breeding, although such use is inevitable as it is a powerful tool to characterize plant phenotypes. Here, the ability of metabolite profiling to distinguish between the metabolomes of genotypically differentiated lines of the same hybrid tree, expressing different phenotypes that relate to industrially relevant wood chemistry attributes, is evaluated. 
Due to its unique position in active tree-related functional genomics programmes (Brunner et al., 2004), hybrid poplar was chosen as the tree species, and lignin biosynthesis and its associated impact on cell wall formation as the system to demonstrate the use of metabolic profiling to differentiate desirable phenotypes in trees. Transformation of Populus tremula×alba with a C4H-F5H genetic construct (comprising the xylem-specific cinnamate 4-hydroxylase (C4H) promoter coupled to the ferulate 5-hydroxylase (F5H) gene, both from Arabidopsis) has been shown to increase significantly the ratio of syringyl (S) to guaiacyl (G) monomers in the lignin of this hybrid (Franke et al., 2000). Increases in the S:G ratio are associated with improved chemical (kraft) pulping efficiency, and as such, have environmental and economic implications for pulp and paper manufacture (Huntley et al., 2003). The results in this study clearly demonstrate the ability of metabolite profiling to differentiate between trees differing in industrially relevant wood quality traits due to this single gene modification. Materials and methods Plant materials and sampling Hybrid poplar-717 (Populus tremula×alba) was selected as the control. In addition, two genetically modified lines that exhibit marked changes in wood chemistry and quality attributes were adopted as treatments. These represent separate transformation events involving the same construct, which consists of the xylem-specific cinnamate 4-hydroxylase (C4H) promoter coupled to the ferulate 5-hydroxylase (F5H) gene (both from Arabidopsis). The C4H-F5H construct has been shown to significantly increase the ratio of syringyl (S) to guaiacyl (G) monomers in poplar lignin, although the severity of the observed phenotype is transformation-event specific (Huntley et al., 2003). The unmodified wild-type has 65.6% mol syringyl content, whereas F5H-82 and F5H-64 have 82.5% mol and 93.4% mol syringyl content, respectively (Huntley et al., 2003).
It should be noted that the modified lines, referred to as F5H-82 and F5H-64 in this work and that of Huntley et al. (2003) correspond to those referred to as ‘B’ and ‘I’, respectively, by Franke et al. (2000). At their origin, the control and modified lines were regenerated concurrently and in an equivalent manner, from leaf blade-derived callus after the tissue had undergone Agrobacterium-based transformation. Lines were subsequently maintained as sterile shoot cultures. To generate plant material for this study, shoot cultures were clonally propagated on semi-solid Woody Plant Medium (WPM) (McCown and Lloyd, 1981), supplemented with 0.01 mmol l−1 α-naphthalene acetic acid (NAA), under a 16/8 h light/dark regime. Fluorescent light was supplied at a photon flux density of 50 μmol m−2 s−1. For the generation of test trees, wild-type and F5H-64 plantlets were transferred to soil-based medium upon rooting, and then grown in randomized plots in a greenhouse under a natural light regime. Cambium was sampled in August 2003 mid-way through the third growing season, during daylight hours and in full sunlight. Tissue from the cambial zone was obtained from each tree by first peeling a rectangular section of bark/phloem/outer cambium from approximately 15 cm above the ground on the stem, and then scraping the inner cambium with a fresh razor blade. Care was taken to avoid sampling from nodes. The collected material was quickly isolated and transferred to a cryovial, snap-frozen in liquid nitrogen and stored at −80 °C. Suspension cultures All three lines were propagated as cell suspensions in sterile liquid culture using WPM supplemented with 10 μM 2,4-dichlorophenoxyacetic acid (2,4-D). Cultures were initiated using 1–2 mm internode sections (30–50×) and 10 ml of medium in sterile 50 ml Erlenmeyer flasks. Nodal tissue, which contains meristematic cells, was actively avoided in culture initiation. 
Each flask was sealed with a foam bung and foil cap, and placed on an orbital shaker at 135 rpm. The light/dark regime was as described for plantlet culture, above. Half of the spent medium was replaced every 7 d until the tissue began to proliferate (2–5 weeks). Following proliferation, 10 ml of fresh medium was added to the culture to give a total culture volume of 20 ml. When subculturing at subsequent weekly intervals, suspensions were first diluted or concentrated so that after a settling period of 30 min, tissue occupied half of the culture volume. 5 ml of this (approximately 2.5 ml packed cell volume) was then transferred to a new flask containing 16 ml fresh medium. Stability, based on uniform growth and morphology, was achieved for all cultures within 2–3 months. For metabolite profiling of stable lines, tissue samples were isolated from the growing medium, quickly transferred to cryovials, snap-frozen in liquid nitrogen and stored at −80 °C. To obtain daily measurements for the growth rate experiment, cultures were allowed to settle in sterile graduated cylinders for 30 min, after which time cell volume data were recorded and cultures were returned to their flasks. Nucleic acid preparation and semi-quantitative RT-PCR Total RNA was extracted from suspension culture tissue using the method of Kolosova et al. (2004). Invitrogen SuperScript II reverse transcriptase was used to synthesize first-strand cDNA, which was then used as the template in a semi-quantitative PCR with the following primers, yielding a 71 bp fragment. Forward primer 5′-CGTTGTCTCTCTTTTCATCTTC-3′, reverse primer 5′-CGTGGACCGGGAGGATATG-3′. PCR products were visualized on an agarose gel using ethidium bromide staining. Metabolite sample preparation Frozen tissue was ground to a fine powder using a dental amalgam mixer, employing a liquid N2-chilled copper/plastic capsule containing three steel ball bearings. The sample was shaken violently for 15 s. 
Samples were kept frozen at all times and, once ground, were returned to −80 °C. Metabolites were extracted from tissue samples and prepared for GC-MS using a scaled-down and re-optimized version of a two-phase methanol/chloroform method developed for metabolite extraction from the leaves of Arabidopsis (Fiehn et al., 2000b). Approximately 20 mg of frozen, ground cambium was weighed into a prechilled 2 ml lock-cap centrifuge tube (for suspension cultures 50 mg of tissue was used). 600 μl of CH3OH was added immediately and the sample was vortexed for 10 s to halt biological activity and minimize degradation. 40 μl of H2O, 10 μl of a polar internal standard (10 mg ml−1 ribitol in H2O) and 10 μl of a lipophilic internal standard (10 mg ml−1 nonadecanoic acid methyl ester in CHCl3) were added. Metabolites were extracted from the sample by incubation for 15 min at 70 °C with constant agitation, and following a 5 min centrifugation of the sample at 13 000 rpm the supernatant was transferred to a new 2 ml tube. 800 μl of CHCl3 was added to the pellet and vortexed for 10 s to resuspend. The sample was then incubated for 5 min at 35 °C with constant agitation, and the supernatant recovered following a second 5 min centrifugation at 13 000 rpm, and combined with the supernatant from the CH3OH extraction. Following the addition of 600 μl of H2O to the combined supernatant and 10 s vortexing, the mixture was centrifuged at 4000 rpm for 15 min to separate the methanol/water (upper) and methanol/chloroform (lower) phases. In theory, metabolites partition themselves between the two phases depending on which they have greater affinity for – the upper phase being more polar and the lower more lipophilic. A 1 ml aliquot was taken from the upper phase with care, to avoid contamination from the interphase, and stored at −20 °C overnight if not processed immediately. Metabolites contained in the lower phase were not analysed in this study. 
Samples were then derivatized for GC-MS analysis. 900 μl of the methanol/water phase was dried using a speedvac (3–4 h, low temperature). For the protection of carbonyl moieties by methoximation, the pellet was resuspended in 50 μl methoxyamine hydrochloride solution (20 mg ml−1 in pyridine) and incubated with constant agitation for 2 h at 60 °C. Acidic protons were then trimethylsilylated with 200 μl N-methyl-N-trimethylsilyltrifluoroacetamide (MSTFA) and incubated at 60 °C with constant agitation for 30 min. Samples were left to stand at room temperature overnight to ensure the reaction was complete, and then filtered through tissue paper to remove any particulate matter. Metabolites were extracted from tissue samples (cambial scrapings and tissue cultures) and prepared for HPLC analysis by extracting 200 mg of liquid-nitrogen frozen, ground tissue in 1.5 ml of methanol:water:acetic acid (48.5:48.5:1.5 by vol.) at 60 °C for 4 h. Following incubation, the samples were centrifuged for 10 min at 13 000 rpm, and the supernatant recovered. Equal volumes of ethyl ether were added and the sample mixed and allowed to phase separate. The upper fraction was removed and retained. The sample was then extracted a second time with ethyl ether, collected, pooled, and dried under vacuum. Samples were resuspended in 200 μl of methanol and analysed using reverse phase HPLC. GC-MS analysis GC-MS analysis was conducted on a ThermoFinnigan Trace GC-PolarisQ ion trap system fitted with an AS2000 auto-sampler and a split/splitless injector. The GC was equipped with a low-bleed Restek Rtx-5MS column (fused silica, 30 m, 0.25 mm ID, stationary phase diphenyl 5% dimethyl 95% polysiloxane). The GC conditions were set as follows: inlet temperature 250 °C, helium carrier gas flow at constant 1 ml min−1, injector split ratio 10:1, resting oven temperature 70 °C, and GC-MS transfer line temperature 300 °C.
Following injection of a 1 μl sample, the oven was held at 70 °C for 2 min and then ramped to 325 °C at a rate of 8 °C min−1. The temperature was held at 325 °C for 6 min before being cooled rapidly to 70 °C in preparation for the next run. For MS analysis in positive electron ionization (EI) mode, the fore-line was evacuated to approximately 40 mTorr, with helium gas flow into the chamber set at 0.3 ml min−1. The source temperature was held at 250 °C, with an electron ionization potential of 70 eV. The detector signal was recorded from 3.35 min after injection until 35.5 min, and ions were scanned across the range of 50–650 mu (mass units) with a total scan time of 0.58 s. HPLC analysis Phenolic metabolite composition was determined by reverse phase high performance liquid chromatography (HPLC) on a Summit chromatograph (Dionex, Sunnyvale, CA). Separation was achieved on a Symmetry C18 250×2.0 mm reverse phase column (Waters), and detected by a photodiode array detector. Samples were filtered prior to injection (50 μl). The column was eluted with a linear gradient of 5% 95:5 water:acetic acid (v/v) to 100% 25% acetonitrile (v/v) in 95:5 water:acetic acid (v/v) over 70 min at a flow rate of 1.0 ml min−1. Data processing and statistical analysis ThermoFinnigan ‘Xcalibur’ software was used for both GC-MS data collection and peak identification and measurement. The grouping of peaks that represented the same compound in multiple chromatograms was automated using the in-house, purpose-built ‘PeakMatch’ software. Data reduction by principal components analysis (PCA) was carried out using the Statistical Package for the Social Sciences (SPSS v12.0). All other intermediate data manipulation was carried out using Microsoft Excel 2000.
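The timing implied by the GC oven programme and the MS acquisition window can be checked with a few lines of arithmetic. The sketch below uses only the settings stated in the GC-MS analysis paragraph; the function names are illustrative, and the scan count assumes back-to-back scans with no dead time between them, which the text does not confirm.

```python
# Settings taken from the GC-MS analysis description.
START_C, END_C = 70.0, 325.0              # oven temperatures (deg C)
RAMP_C_PER_MIN = 8.0                      # oven ramp rate
HOLD_START_MIN, HOLD_END_MIN = 2.0, 6.0   # isothermal holds (min)
ACQ_START_MIN, ACQ_END_MIN = 3.35, 35.5   # detector recording window (min)
SCAN_TIME_S = 0.58                        # total time per scan (s)


def oven_programme_minutes():
    """Initial hold + linear ramp + final hold, in minutes."""
    ramp_min = (END_C - START_C) / RAMP_C_PER_MIN
    return HOLD_START_MIN + ramp_min + HOLD_END_MIN


def approx_scan_count():
    """Whole scans that fit in the recording window (assumes no dead time)."""
    window_s = (ACQ_END_MIN - ACQ_START_MIN) * 60.0
    return int(window_s // SCAN_TIME_S)


if __name__ == "__main__":
    print(f"oven programme: {oven_programme_minutes():.3f} min")
    print(f"scans per chromatogram: about {approx_scan_count()}")
```

Under these assumptions the ramp itself takes 31.875 min, so the recording window closes during the final 325 °C hold rather than at the end of the oven programme.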
Results and discussion Suspension cultures Established suspension cultures generated from wild-type and C4H-F5H modified lines (F5H-82 and F5H-64) grew at similar rates, and showed characteristic lag, linear, and static phases of growth over a 9-d period (Fig. 1). As such, samples taken at day 7 for metabolite profiling were from cultures in the transition from linear growth to the static phase. Expression of the Arabidopsis F5H transgene in suspension cultures was confirmed by semi-quantitative RT-PCR (image not shown). There was no detectable expression of the Arabidopsis F5H transgene in the non-transformed wild-type control, as expected. However, even under the highly controlled conditions of suspension culture, which did not promote organ-specific differentiation, the modified genotypes continued to express the transgene and maintain phenotypes that differed from one another as well as from the wild-type control. The cultures also exhibited distinct morphologies, with wild-type cells being white in colour, F5H-82 greenish, and F5H-64 displaying a distinct brown colour (Fig. 2). Furthermore, the wild-type cultures were visually finer cultures with smaller cell aggregates, whereas the transgenic cultures tended to be composed of larger cellular aggregates. Colour changes have been observed in the wood of trees from modified poplar lines in which the lignin content or the S:G ratio has been increased (Pilate et al., 2002), and it is possible that the colour changes observed in both wood and suspension-cultured tissue reflect similar biochemical phenomena. In the case of C4H-F5H, it is likely that the colour is due to the product(s) of a pathway fed by an abundance of an over-supplied syringyl lignin biosynthetic pathway. Despite the continued expression of the transgene in suspension cultures, ultraviolet microscopy revealed no evidence of secondary wall development (images not shown). 
A possible explanation for the continued activity of the secondary development-specific C4H promoter, in the absence of both secondary development and lignin polymer biosynthesis, is that phenylpropanoid biosynthesis is frequently induced during times of environmental stress; this is probably the case in these liquid cultures. Fig. 1. Growth characteristics of wild-type and two C4H-F5H transformed P. tremula×alba suspension cultures based on settled cell volume. Plots represent the mean of 12 replicates, and error bars represent a 95% confidence interval of the mean. Arrow indicates sampling time for metabolite profiling. Fig. 2. Suspension-cultured tissue of wild-type and two C4H-F5H transformed P. tremula×alba lines. Picture was taken 14 d after subculture. Watch glass diameter is approximately 6.5 cm. Metabolite data acquisition and compiling To elucidate the metabolites present in both actively dividing cambial and suspension-cultured tissue, total ion chromatograms (TIC) of each sample, wild-type and transgenic, were obtained by GC-MS analysis of TMS-derivatives from crude tissue extracts. Analysis of the cambial zone included samples from 15 wild-type and 10 F5H-64 individual tree clones.
The analysis of suspension cultures included samples from 20 distinct cultures of each of the wild-type, F5H-82, and F5H-64 lines (60 cultures in total), which were sampled during the transition from linear to static culture growth, 7 d after subculture. For all recorded peaks, total ion counts remained within the linear detection range of the instrument (approximately 1.0 e4–3.0 e8 counts s−1). In preliminary calculations, each peak in a chromatogram was expressed relative to the area of the ribitol internal standard peak. In addition, peak areas were normalized across all chromatograms (of cambium or suspension culture datasets) by adjusting for the exact amount of tissue (mg fresh weight) used in each sample extraction. In order to circumvent the wobble in retention time for any given compound, a single-pass algorithm (‘PeakMatch’) was designed to group peaks from multiple chromatograms that have similar retention times based on a user-assigned threshold. It has been well recognized that one of the limitations of metabolomics has been the difficulty in automating the process of grouping peaks that represent the same compound in multiple chromatograms (Fiehn, 2001, 2002; Fiehn et al., 2001; Fiehn and Weckwerth, 2003). However, automation is a necessity when analysing large numbers of replicates displaying hundreds of peaks typical in GC-MS total ion chromatograms from plant metabolite extracts. To avoid a total-chromatogram-alignment-by-data-point approach such as that used in correlation optimized warping (COW) (Nielsen et al., 1998), and to identify peaks and use peak area to measure compound quantity without warping, alternate software that can match peaks while accommodating the variability in retention time must be employed. In this study, PeakMatch served as a highly effective tool for rapidly compiling large datasets and accomplishing the necessary comparisons of the same compound in different samples. 
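The normalization and retention-time grouping steps described above can be sketched as follows. This is a minimal illustration, not the PeakMatch implementation: the text states only that the algorithm is single-pass and uses a user-assigned retention-time threshold, so the `Peak` class, the function names, and the grouping rule (compare each peak with the previous peak in sorted order) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Peak:
    sample: str     # chromatogram identifier
    rt_min: float   # retention time (min)
    area: float     # integrated peak area


def normalise_area(area, ribitol_area, fresh_weight_mg):
    """Peak area relative to the ribitol standard, per mg fresh weight."""
    return area / ribitol_area / fresh_weight_mg


def group_peaks(peaks, threshold_min):
    """Group peaks from many chromatograms in one pass over sorted peaks.

    Assumed rule: a peak joins the current group when its retention time
    is within threshold_min of the previous peak in that group; otherwise
    it starts a new group.
    """
    groups = []
    for peak in sorted(peaks, key=lambda p: p.rt_min):
        if groups and peak.rt_min - groups[-1][-1].rt_min <= threshold_min:
            groups[-1].append(peak)
        else:
            groups.append([peak])
    return groups


if __name__ == "__main__":
    # Hypothetical peaks from three chromatograms.
    demo = [
        Peak("s1", 10.00, 5.0), Peak("s2", 10.02, 4.0), Peak("s3", 10.03, 6.0),
        Peak("s1", 12.50, 1.0), Peak("s2", 12.51, 2.0),
    ]
    for group in group_peaks(demo, threshold_min=0.05):
        print([f"{p.sample}@{p.rt_min}" for p in group])
```

One limitation of this simple rule is chaining: a run of closely spaced peaks can merge into one group even when its first and last members differ by more than the threshold, so the threshold needs to be small relative to the spacing between distinct compounds.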
After being compiled in PeakMatch, but prior to statistical analysis, datasets were cleaned of all superfluous peaks not directly related to the sample. These included the internal ribitol standard, solvent impurities, and any peaks from the reagents used in the derivatization process (linear siloxane chains and other silyl compounds). The retention times of such peaks were identified from the TIC chromatograms of pyridine solvent blanks, and sample blanks in which the extraction and derivatization were carried out in the absence of any sample tissue. In addition, all but the most prominent peaks eluting after 30 min were excluded from the analysis, as beyond this time the signal-to-noise ratio declined drastically due to the heavy convolution in the high-mass tail end typical of GC-MS analyses. To maintain uniformity across the dataset, the sensitivity of peak finding must remain fixed across all chromatograms, although a particular setting will be more or less appropriate for different chromatograms. As a consequence, minor peaks are often detected inconsistently, despite being visible in the chromatogram. To reduce the noise introduced by this erroneous non-detection of minor peaks, peak sets were thinned in two ways. All peaks detected in ≤10% of samples from each plant line, and all peaks whose average normalized area for each plant line was less than a specific value (≤1.0 E−4 for cambium and ≤5.0 E−5 for suspension culture) were not considered. With completion of all adjustments, the final cambium and suspension culture datasets contained 143 and 182 peaks, respectively. Principal components analysis Principal components analysis was conducted separately on cambium and suspension culture peak sets. For the cambium dataset, 22 principal components were required to account for >99% of the variance between the 143 peaks across all 25 samples (total 3575 data points) (Fig. 3). This represents roughly an 85% reduction in variables.
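The two peak-thinning rules described in the data-cleaning step above can be expressed as a small filter. The sketch below is an interpretation rather than the authors' procedure: it assumes that "each plant line" means a peak is dropped only when the condition holds in every line, and that undetected peaks count as zero area when averaging. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def thin_peaks(areas, line_of, min_mean_area, max_detect_frac=0.10):
    """Return the peak ids that survive both thinning rules.

    areas:   {peak_id: {sample_id: normalised area}}; undetected peaks
             are absent or 0 (assumption).
    line_of: {sample_id: plant line name}.
    A peak is dropped if, in every line, it is detected in at most
    max_detect_frac of samples, or if, in every line, its mean area
    is at most min_mean_area.
    """
    samples_by_line = defaultdict(list)
    for sample, line in line_of.items():
        samples_by_line[line].append(sample)

    kept = []
    for peak_id, by_sample in areas.items():
        rarely_detected = True
        always_small = True
        for samples in samples_by_line.values():
            values = [by_sample.get(s, 0.0) for s in samples]
            detected_frac = sum(1 for v in values if v > 0) / len(values)
            if detected_frac > max_detect_frac:
                rarely_detected = False
            if sum(values) / len(values) > min_mean_area:
                always_small = False
        if not (rarely_detected or always_small):
            kept.append(peak_id)
    return kept
```

With the thresholds given in the text, a call for the cambium dataset would use `min_mean_area=1.0e-4`, and one for the suspension dataset `min_mean_area=5.0e-5`.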
Similarly, for the suspension dataset, 48 principal components were required to account for >99% of the variance between 60 samples across all 182 peaks (total 10 920 data points) (Fig. 3). This represents approximately a 74% reduction in variables. The considerable reduction in variables achieved by PCA suggests the existence of strong relationships between the variables within datasets. Fig. 3. Cumulative percentage of dataset variation explained by principal components, for both cambium and suspension cultures. Plotting the factor scores of individual samples from selected principal components, as coordinates on the axes of two- or three-dimensional scatter plots, can generate a graphical representation of the relationship between samples in a PCA. The separation of clusters of samples in such a plot illustrates the existence of differences between distinct metabolic systems (Fiehn et al., 2000a; Roessner et al., 2001a, b; Chen et al., 2003; Fiehn, 2003; Morris et al., 2004). Standard plots are limited to three dimensions, and the components plotted should be those that best represent the dataset. This implies that the components plotted are those that account for the most variance (i.e. the first, second, and third components). However, specific later components have also been shown to be effective in revealing differences between sample groups in some situations (Fiehn et al., 2000a). In such cases it is often more useful to plot factor scores from these discriminating components. In this study, three two-dimensional scatter plots were generated for each dataset using component pair combinations from the first three principal components (Figs 4, 5).
Together, these three principal components accounted for approximately 46% and 52% of the variance in the cambium and suspension culture datasets, respectively (Table 1A). The cambium plots (Fig. 4) clearly illustrate that both principal components 2 and 3 (PC-2 and PC-3) distinguished between the wild-type and F5H-64 samples, with PC-2 being more effective. By contrast, PC-1 made no such distinction. It follows that the best visualization of separation between the two lines is achieved when PC-2 and PC-3 are combined (Fig. 4C). In this case, loose clustering and complete separation of the wild-type and F5H-64 samples are observed, with these phenomena derived primarily from PC-2, but accentuated by PC-3. Furthermore, clustering of wild-type samples in this plot is visibly a tighter grouping than that of F5H-64 samples. By comparison, the suspension culture plots (Fig. 5) show that, in this PCA, PC-1 distinguished between wild-type, F5H-82, and F5H-64, while PC-2 distinguished F5H-82 from the others. Here, it was PC-3 that failed to effectively distinguish the lines. Therefore, in this case the best visualization of separation between the three lines is achieved when PC-1 and PC-2 are combined (Fig. 5A). This plot illustrates a tight clustering of the three lines, with visible improvement from F5H-82 to F5H-64 and then to WT (barring the outlier). Furthermore, all three lines separate cleanly and equally from one another, with the F5H-82 cluster separating from the others in PC-2 such that there is a very clear overall separation. Fig. 4. Scatter plots of PCA factor scores for wild-type and F5H-64 modified samples from the cambium dataset. Axes of two-dimensional plots are derived from (A) PC-1 and PC-2, (B) PC-1 and PC-3, and (C) PC-2 and PC-3. Plotted points represent individual samples, while arbitrary ellipses have been included to assist interpretation and simply border all samples of individual lines.
This PCA analysis represents the differentiation of 25 individual trees (15× wild-type and 10× F5H-64). Fig. 5. Scatter plots of PCA factor scores for wild-type and C4H-F5H transformed P. tremula×alba samples from the suspension culture dataset. Axes of two-dimensional plots are derived from (A) PC-1 and PC-2, (B) PC-1 and PC-3, and (C) PC-2 and PC-3. Plotted points represent individual samples, while arbitrary ellipses have been included to assist interpretation and simply border all samples of individual lines. This PCA analysis represents the differentiation of 60 individual suspension cultures (20 individual samples per line). Table 1A.
Percentage of total variance accounted for by combinations of the first three principal components of cambium and suspension-culture datasets

Component(s)   Cambium   Suspension
1              24.34%    26.07%
2              11.22%    13.46%
3              10.83%    12.33%
1, 2           35.56%    39.53%
2, 3           22.05%    25.79%
1, 2, 3        46.39%    51.86%

Combinations revealing the greatest distinction between samples of different lines are in bold type.

Table 1B. Molecule classification of the metabolites loading highly in factors of PCA component matrices for the first three principal components

Molecule type   Cambium                Suspension
                PC-1   PC-2   PC-3     PC-1   PC-2   PC-3
Other           8      6      3        12     11     8
Amino acid      16     3      1        3      1      7
Benzene         1      2      0        4      1      0
Carbohydrate    13     6      19       35     7      3
Total           38     17     23       54     20     18

Numbers represent the number of molecules from the stated class that load high in specific principal components.

It is evident from the scatter plots in Figs 4 and 5 that the PCA detected differences between the metabolisms of the three phenotypically distinct lines, resulting from single gene insertion events. Visual evidence of this can be seen in selected two-dimensional plots (Figs 4C, 5A), where samples from each line cluster together, and separately from the samples of other lines. This observation supports the theory that differences in wood chemistry can indeed be associated with differences in observable metabolic traits.
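The variance bookkeeping behind Fig. 3 and Table 1A can be illustrated with a short sketch. It is not the SPSS procedure used in the study: it only shows how a cumulative variance curve yields the number of components needed to pass a target share, and how the reported reductions in variables (22 of 143 and 48 of 182) follow. The function names are illustrative, and the variances in the usage lines are made-up inputs, not data from the study.

```python
def cumulative_share(variances):
    """Cumulative fraction of total variance, component by component."""
    total = sum(variances)
    running, shares = 0.0, []
    for value in variances:
        running += value
        shares.append(running / total)
    return shares


def components_needed(variances, target=0.99):
    """Smallest number of leading components whose share exceeds target."""
    for count, share in enumerate(cumulative_share(variances), start=1):
        if share > target:
            return count
    return len(variances)


def variable_reduction(n_components, n_variables):
    """Fractional reduction when n_variables are replaced by n_components."""
    return 1.0 - n_components / n_variables


if __name__ == "__main__":
    # Reductions reported in the text.
    print(f"cambium:    {variable_reduction(22, 143):.1%}")
    print(f"suspension: {variable_reduction(48, 182):.1%}")
    # Made-up component variances, for illustration only.
    print(components_needed([6.0, 3.0, 0.95, 0.05]))
```

The same arithmetic reproduces the combined rows of Table 1A; for example, 24.34% + 11.22% + 10.83% gives the 46.39% listed for the first three cambium components.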
However, what PCA achieves, and what the correct interpretation of clustering and separation in PCA scatter plots should be, is not entirely simple and warrants discussion. Clustering does not necessarily indicate that those samples in a cluster contain, in this case, a similar abundance of the various metabolites detected. Likewise, the separation of clusters does not necessarily indicate absolute differences. Rather, clustering of samples in PCA indicates a similarity in the behaviour of variables in relation to one another. Samples that clustered together in this study did so because they each contained a similar set of metabolites whose abundances were correlated in the same way. An accurate interpretation therefore affords the results from PCA greater relevance in the context of comparing biochemical systems. The power of this approach lies in the fact that it is based not on an isolated comparison of the abundance of individual metabolites in different systems, but instead accounts for the dynamic nature of metabolism, and provides insight into metabolic relationships. A comparison of cambium (Fig. 4C) and suspension culture (Fig. 5A) plots reveals differences between the PCA clustering patterns of samples taken from the two sources. It is apparent that clustering and separation are more defined for suspension culture samples than they are for cambial samples. This may be due to differences in the degree of environmental variability experienced by tissue derived from the two sources. Actively growing trees will have experienced long-term and recurring differences in temperature, relative humidity, light, water availability, space, and insect herbivory, despite greenhouse climate control. Environmental factors such as these can cause variation in the growth, morphology and presumably metabolism of trees of the same genotype.
By contrast, sterile tissue culture grown under strictly controlled laboratory conditions most probably experiences less long-term culture-to-culture environmental variability and, consequently, exhibits reduced morphological and biochemical variation. As such, replicate samples of the same genotype show less variability in suspension culture than they do as greenhouse-grown trees, as illustrated by a comparison of the ‘tightness’ of clustering in PCA. A trend observed across both scatter plots is that the wild-type samples tend to cluster more tightly than the modified samples. This suggests increased metabolic and phenotypic variability in the modified genotypes, compared with the non-transformed, wild-type control. Elucidating individual metabolites Having established that metabolite profiling coupled with principal components analysis could be employed to distinguish the general metabolism of different lines, the natural progression was to characterize the metabolic traits underlying the clustering and separation phenomena. For this, the component matrix of the PCA was screened for variables (metabolites) with high loadings in the factors of the specific principal components that produced clustering and separation in the scatter plots. The greater the loading, the more the variable is a pure measure of the factor (Tabachnick and Fidell, 2001), and the more influence it has on the generation of the principal component. Therefore, high-loading variables are responsible for generating clusters and separation in principal components where these phenomena occur. It has been suggested that loadings in excess of 0.71 are ‘excellent’, 0.63 ‘very good’, 0.55 ‘good’, 0.45 ‘fair’, and 0.32 ‘poor’ (Comrey and Lee, 1992). In this study, metabolites with at least ‘fair’ loadings were extracted from the component matrix for the first three principal components of cambium (Table 2) and suspension-culture (Table 3) datasets.
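As a rough illustration of the screening step described above, the following Python sketch rates loadings against the cut-offs quoted from Comrey and Lee (1992) and keeps those that are at least ‘fair’. The function names, the use of absolute values, the treatment of each cut-off as inclusive, and the peak numbered 999 are assumptions made for the example; they are not taken from the software used in the study. The three real loadings are those listed for cambium PC-2 in Table 2B.

```python
# Cut-offs quoted in the text from Comrey and Lee (1992), highest first.
THRESHOLDS = [
    (0.71, "excellent"),
    (0.63, "very good"),
    (0.55, "good"),
    (0.45, "fair"),
    (0.32, "poor"),
]


def rate_loading(loading):
    """Return the descriptive rating for a loading, or None below 0.32."""
    magnitude = abs(loading)  # assumption: the sign is ignored when rating
    for cutoff, label in THRESHOLDS:
        if magnitude >= cutoff:  # assumption: cut-offs are inclusive
            return label
    return None


def screen_component(loadings, minimum=0.45):
    """Keep peaks whose loading is at least `minimum` ('fair' or better)."""
    return {
        peak: (value, rate_loading(value))
        for peak, value in loadings.items()
        if abs(value) >= minimum
    }


# Three loadings from Table 2B (cambium PC-2) plus one hypothetical
# low-loading peak (999) to show the filter at work.
pc2 = {63: 0.89, 102: 0.64, 134: 0.46, 999: 0.30}
selected = screen_component(pc2)
for peak, (value, label) in sorted(selected.items()):
    print(peak, value, label)
```

With these inputs the sketch keeps peaks 63, 102, and 134 and drops the hypothetical peak 999, which falls below the ‘fair’ cut-off.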
National Institute of Standards and Technology (NIST) MS-Search software, equipped with the NIST mass spectra as well as the Max Planck Institute Trimethylsilane (TMS) (http://www.mpimp-golm.mpg.de/mms-library/index-e.html) and our own (Mansfield laboratory) TMS mass spectral libraries, was used to assist with the identification of these metabolites. Compounds with high-scoring matches (based on mass spectrum and retention time) were assigned identities and classified as ‘amino acid’, ‘phenolic’, ‘carbohydrate’, or ‘other’ (including sterols, phosphates, components of the citric acid cycle and adjunct pathways) molecules.

Table 2. Metabolites in the cambium dataset that load highly in factors of the PCA component matrix (A) PC-1, (B) PC-2, and (C) PC-3

Class          Peak no.   Loading   Identity
(A) Cambium PC-1
Other          11         0.60      Acetimidic acid 2TMS
               17         0.52      2-Amino ethanol 3TMS
               18         0.75      Phosphoric acid 3TMS
               30         0.54      Unknown (other, 2c)
               38         0.65      4-Aminobutyric acid 3TMS
               69         0.70      Ornithine 4TMS
               70         0.65      Citric acid 4TMS
               141        0.55      Unknown (other, 7c)
Amino acid     15         0.61      Valine 2TMS
               21         0.62      Glycine 3TMS
               27         0.54      Serine 3TMS
               28         0.88      Threonine 3TMS
               32         0.74      Unknown (amino acid, 1c)
               35         0.49      Asparagine 2TMS
               37         0.82      Aspartic acid 3TMS
               42         0.89      Unknown (amino acid, 2c)
               44         0.91      Unknown (amino acid, 3c)
               45         0.65      Asparagine TMS
               46         0.75      Unknown (amino acid, 4c)
               48         0.72      Unknown (amino acid, 5c)
               51         0.80      Asparagine 3TMS
               83         0.67      Lysine monohydrochloride 4TMS
               87         0.76      Tyrosine 3TMS
               112        0.74      Tryptophan 3TMS
Benzene        133        0.79      p-Nitrophenyl-glucoside
Carbohydrate   101        0.74      Glucaric or galactaric acid
               108        0.60      Unknown (carbohydrate, 9c)
               120        0.65      Unknown (carbohydrate, 12c)
               121        0.68      Melibiose 8TMS
               122        0.64      Unknown (carbohydrate, 13c)
               123        0.59      Myoinositol phosphate 7TMS
               127        0.61      Sucrose TMS
               128        0.52      Unknown (carbohydrate, 14c)
               132        0.52      Unknown (carbohydrate, 15c)
               135        0.63      Unknown (carbohydrate, 16c)
               138        0.66      Unknown (carbohydrate, 17c)
               140        0.73      Unknown (carbohydrate, 18c)
               143        0.45      Raffinose TMS
(B) Cambium PC-2
Other          7          0.47      Unknown (other, 1c)
               17         0.52      2-Amino ethanol 3TMS
               38         0.48      4-Aminobutyric acid 3TMS
               85         0.58      Unknown (other, 5c)
               94         0.49      Unknown (other, 6c)
               114        0.54      Unknown (other, 7c)
Amino acid     35         0.49      Asparagine 2TMS
               46         0.46      Unknown (amino acid, 4c)
               59         0.47      Unknown (amino acid, 6c)
Benzene        102        0.64      Sinapyl alcohol 2TMS
               12         0.52      Unknown (benzene, 1c)
Carbohydrate   63         0.89      2-Deoxy D-glucose 4TMS
               64         0.67      Unknown (carbohydrate, 2c)
               67         0.85      Unknown (carbohydrate, 4c)
               91         0.63      Galactitol 6TMS (sorbitol)
               95         0.70      Unknown (carbohydrate, 7c)
               134        0.46      Cellobiose TMS
(C) Cambium PC-3
Other          50         0.51      Unknown (other, 3c)
               74         0.62      Unknown (other, 4c)
               142        0.46      Sitosterol TMS
Amino acid     47         0.58      Glutamic acid 3TMS
Carbohydrate   60         0.60      Unknown (carbohydrate, 1c)
               61         0.63      Galacturonic acid TMS variant
               65         0.53      Unknown (carbohydrate, 3c)
               72         0.58      Unknown (carbohydrate, 5c)
               75         0.68      Unknown (carbohydrate, 6c)
               78         0.46      Sucrose TMS
               80         0.51      Sucrose TMS
               82         0.50      Sucrose TMS
               86         0.59      Sucrose TMS
               88         0.60      Galactitol 6TMS (sorbitol)
               104        0.61      Unknown (carbohydrate, 8c)
               107        0.47      Inositol 6TMS
               108        0.49      Unknown (carbohydrate, 9c)
               113        0.62      Unknown (carbohydrate, 10c)
               115        0.51      Sucrose TMS
               119        0.58      Unknown (carbohydrate, 11c)
               122        0.54      Unknown (carbohydrate, 13c)
               130        0.76      Sucrose TMS
               140        0.48      Unknown (carbohydrate, 18c)

Only metabolites loading >0.45 in the component matrix are shown. Metabolites are sorted first by molecule class, and then by sequence of elution in gas chromatography (all peaks extracted from chromatography for PCA were assigned a number based on elution sequence). The loading of each peak is shown, and, where possible, metabolites are identified. Those that could not be identified are labelled as ‘Unknown’, with details in parentheses (molecule type, a number based on the elution sequence, and a letter ‘c’ indicating cambium).

Table 3. Metabolites in the suspension culture dataset that load highly in factors of the PCA component matrix (A) PC-1, (B) PC-2, and (C) PC-3

Class          Peak no.   Loading   Identity
(A) Suspension PC-1
Other          25         0.75      Unknown (other, 2s)
               27         0.79      Propanedioic acid 2TMS
               36         0.70      Phosphoric acid 3TMS
               50         0.52      Unknown (other, 5s)
               56         0.88      2-Methylmalic acid 3TMS
               57         0.71      Malic acid 3TMS
               71         0.61      3-OH,3-CH3 pentanedioic acid 3TMS
               83         0.48      Unknown (other, 7s)
               90         0.81      Unknown (other, 8s)
               107        0.67      Quinic acid TMS
               123        0.47      Unknown (other, 9s)
               135        0.48      Unknown (other, 10s)
Amino acid     19         0.58      Alanine 2TMS
               62         0.57      Glutamic acid 2TMS
               66         0.46      Unknown (amino acid, 1s)
Benzene        43         0.47      1-Methyl-2-phenyl-ethylamine 2TMS
               64         0.56      Unknown (benzene, 1s)
               173        0.48      Epicatechin
               175        0.47      Unknown (benzene, 2s)
Carbohydrate   74         0.80      Xylonic acid lactone 3TMS
               75         0.52      Ribonic acid lactone TMS
               80         0.52      Fucose TMS
               81         0.69      Ribose MEOX 4TMS
               86         0.83      Xylitol 5TMS
               87         0.51      N-Acetyl glucosamine MEOX 4TMS
               92         0.56      Glucose-1-phosphate oxim TMS
               96         0.52      Unknown (carbohydrate, 1s)
               97         0.90      Unknown (carbohydrate, 2s)
               98         0.83      Unknown (carbohydrate, 3s)
               116        0.73      Sorbitol TMS
               120        0.72      Glucuronic acid 5TMS
               124        0.63      Gluconic acid 6TMS
               126        0.75      Gluconic acid lactone 4TMS
               127        0.70      Inositol 6TMS
               129        0.74      Unknown (carbohydrate, 4s)
               130        0.73      Unknown (carbohydrate, 5s)
               132        0.64      Sucrose TMS
               133        0.70      Sucrose TMS
               137        0.53      Unknown (carbohydrate, 6s)
               140        0.62      Unknown (carbohydrate, 7s)
               144        0.58      Unknown (carbohydrate, 8s)
               146        0.65      Fructose phosphate MEOX 6TMS
               147        0.71      Glucose-6-phosphate TMS
               148        0.78      Glucose-6-phosphate MEOX TMS
               149        0.46      Unknown (carbohydrate, 9s)
               150        0.58      Unknown (carbohydrate, 10s)
               155        0.92      Unknown (carbohydrate, 11s)
               156        0.73      Unknown (carbohydrate, 12s)
               157        0.58      Unknown (carbohydrate, 13s)
               160        0.59      Sucrose TMS
               161        0.59      Mannopyranose phosphate 6TMS
               168        0.77      Turanose 7TMS
               170        0.79      Unknown (carbohydrate, 17s)
               172        0.57      Melibiose MEOX TMS
(B) Suspension PC-2
Other          14         0.49      Pyruvic acid MEOX TMS
               39         0.52      Succinic acid 2TMS
               41         0.66      Glyceric acid 3TMS
               42         0.47      Fumaric acid 2TMS
               125        0.77      Palmitic acid TMS (contamination)
               131        0.45      Stearyl alcohol TMS (contamination)
               141        0.64      Stearic acid TMS (contamination)
               143        0.72      Unknown (other, 11s)
               176        0.70      Unknown (other, 13s)
               177        0.57      Unknown (other, 14s)
               181        0.46      Sitosterol TMS
Amino acid     79         0.65      Indolepropionate TMS
Benzene        99         0.51      Shikimic acid 4TMS
Carbohydrate   77         0.62      Ribose MEOX 4TMS
               78         0.60      Ribose MEOX 4TMS
               111        0.55      Sucrose TMS
               144        0.54      Unknown (carbohydrate, 8s)
               151        0.53      Myo-inositol phosphate 7TMS
               161        0.58      Mannopyranose phosphate 6TMS
               167        0.68      Unknown (carbohydrate, 16s)
(C) Suspension PC-3
Other          17         0.87      Unknown (other, 1s)
               28         0.86      Unknown (other, 3s)
               42         0.59      Fumaric acid 2TMS
               45         0.73      Unknown (other, 4s)
               52         0.57      Unknown (other, 6s)
               63         0.77      4-Aminobutyric acid 3TMS
               83         0.80      Unknown (other, 7s)
               165        0.49      Unknown (other, 2s)
Amino acid     19         0.64      Alanine 2TMS
               30         0.82      Valine 2TMS
               35         0.64      Leucine 3TMS
               38         0.80      Glycine 3TMS
               44         0.92      Serine 3TMS
               47         0.46      Threonine 3TMS (or allothreonine)
               62         0.68      Glutamic acid 2TMS
Carbohydrate   163        0.76      Unknown (carbohydrate, 14s)
               166        0.61      Unknown (carbohydrate, 15s)
               172        0.71      Meli- or cellobiose MEOX TMS

Only metabolites loading >0.45 in the component matrix are shown. Metabolites are sorted first by molecule class, and then by sequence of elution in gas chromatography (all peaks extracted from chromatography for PCA were assigned a number based on elution sequence). The loading of each peak is shown, and, where possible, metabolites are identified. Those that could not be identified are labelled as ‘Unknown’, with details in parentheses (molecule type, a number based on the elution sequence, and a letter ‘s’ indicating suspension culture).

In the dataset from suspension cultures, the principal components derived from factor-1 and factor-2 (PC-1 and PC-2) clustered and separated all three lines. In factor-1 (Table 3A), 65% of high-loading metabolites were carbohydrates (including monomers, dimers, and their phosphorylated or acidic derivatives), which, for the most part, had loading values better than ‘good’ (as defined by Comrey and Lee, 1992). In addition, there was evidence of the inorganic phosphate pool, with a few examples of amino acids, glutamate (primary donor of the α-amino group to most amino acids), a participant in the citric acid cycle (malic acid), and a by-product of shikimic acid biosynthesis (quinic acid). Some phenolic compounds were observed, but, for the most part, barely loaded above the cut-off. With these results it is appropriate to suggest that in PC-1, the clustering and separation of all three lines with minimal overlap was heavily related to differences in carbohydrate metabolism. A similar analysis of high-loading metabolites in factor-2 (Table 3B) revealed components of the citric acid cycle (succinic acid, fumaric acid), components of the triose-phosphate pathway (glyceric and pyruvic acids), shikimic acid (precursor of many phenolic amino acids and secondary metabolites), myo-inositol phosphate (amongst other things, inositol participates in signalling pathways, hormone storage and transport, and the biosynthesis of cell walls and stress-related compounds), and a selection of early- and late-eluting carbohydrates (monomers, dimers).
Although the loadings of carbohydrates are typically higher than those of other molecule types in this factor, the appearance of a series of closely related core metabolites suggests that this aspect of metabolism had a significant influence on the clustering and separation observed in PC-2. The principal components (PC-2 and PC-3) derived from factor-2 and factor-3 of the cambium dataset were used effectively to cluster and separate samples of the wild-type and F5H-64 lines, although PC-2 alone separated the lines with minimal overlap. Examples from all molecule categories were observed, although, as with factor-1 of the suspension-culture dataset, carbohydrates predominate in the list of high-loading metabolites in cambium factor-2 (Table 2B). The list of high-loading metabolites in factor-3 (Table 2C) is an even more pronounced case of carbohydrate dominance, with 83% of metabolites identified as carbohydrates. The GC breakdown peaks of sucrose (which all represent the same compound) feature strongly, and it is understandable that they load highly together. Interestingly, inositol and glutamate load highly in this factor, much as they did in suspension culture factor-1 and factor-2. However, no representatives from the core citric acid and triose-phosphate pathways were observed. Therefore, it again seems appropriate to attribute the small amount of separation observed in cambium PC-3 to differences in carbohydrate metabolism. Figure 6 reveals the variety in abundance, as well as the broad range of retention time of identifiable, high-loading compounds present in the differentiating components of the cambium dataset, PC-2 and PC-3.

Fig. 6. Example of a total ion chromatogram (TIC) from a cambial sample. The chromatogram has been annotated to indicate identified compounds that loaded highly in factors of PC-2 and PC-3 of the PCA. These components played a significant role in distinguishing between the metabolism of wild-type and F5H-64 lines. Label numbers refer to Tables 2B and 2C for compound identity. The detector response (y-axis) is given in counts s−1 (cps).

Cambium PC-1 and suspension culture PC-3 are the first in the respective datasets that do not distinguish between lines (Figs 4, 5). These components do, however, carry considerable interest with regard to high-loading metabolites in the respective factors of the component matrices. In both of these factors, high-loading amino acid-related metabolites were prominent (Tables 1B, 2A, 3C). In suspension culture factor-3, 39% of high-loading metabolites were amino acids, all of which were identified. Likewise, 42% of high-loaders in cambium factor-1 were amino acid-related (of these, 69% were identified). This clustering of amino acids into factors of the first principal components that fail to distinguish between lines suggests that amino acid biosynthesis and metabolism maintained a high level of stability, despite genetic transformation with C4H-F5H. Notably, the aromatic amino acids tyrosine and tryptophan were observed as very high loaders in cambium factor-1.
In some plant species, tyrosine can be used as a precursor in hydroxycinnamic acid biosynthesis (Deluca et al., 1988; Whetten and Sederoff, 1995; Alemanno et al., 2003), and as a precursor to pigments and defence compounds such as alkaloids (Facchini, 2001), flavonoids (Koch et al., 1995), and anthocyanins (Sakuta et al., 1991; Dube et al., 1992). Tryptophan is used in some plant species as a precursor to bioactive alkaloids (Facchini, 2001) and defence phytoalexins (Zhao and Last, 1996; Pedras et al., 2003), as well as the phytohormone auxin (indole 3-acetic acid) (Bartel, 1997). As major products of the shikimic acid pathway, and molecules that are synthesized in close proximity to the usual precursor of monolignol biosynthesis, phenylalanine, the observed behaviour of tyrosine and tryptophan is intriguing. The tight association of tyrosine with a principal component that did not distinguish between wild-type and modified lines suggests that, in this case, any flux of resources through this branch of metabolism and into monolignol biosynthesis was not affected by the transformation event. This would agree with the wood chemistry of the modified phenotype, in which the total lignin content (as determined by Klason analysis) was comparable to the control (Huntley et al., 2003). Notably, none of the aromatic amino acids were observed as high-loaders in suspension culture factor-3, and their absence may be related to an absence of predation in suspension culture. Interestingly, phenylalanine was not present in either the cambium or suspension-culture datasets. A series of amino acids not directly related to phenolic secondary metabolism were identified as high-loaders in factors of the non-differentiating principal components. Three of the four major nitrogen assimilation amino acids (Suarez et al., 2002) were observed: glutamate in suspension culture factor-3, and aspartate and asparagine in cambium factor-1. 
Also, the aspartate-derived amino acid, threonine, was identified in both cambium factor-1 and suspension-culture factor-3. This amino acid is the precursor to isoleucine, a branched chain amino acid (Giovanelli et al., 1988). Valine and leucine, two other branched chain amino acids, were identified in cambium factor-1 and suspension-culture factor-3, respectively. Branched chain amino acids are precursors to secondary metabolism, and are involved in the biosynthesis of cyanogenic glycosides, glucosinolates, and acyl sugars (Conn, 1980). Metabolite channelling Surprisingly, very few phenolic compounds are found in the lists of high-loading metabolites from the PCA. The GC-MS analysis detected rather few phenolic metabolites, and only one compound, sinapyl alcohol, was identified as a direct intermediate of the phenylpropanoid pathway for lignin monomer biosynthesis (reviewed by Dixon et al., 2001). Clearly, however, there is an abundance of phenolic compounds synthesized in living plant tissue as either intermediates in, or endpoints of, metabolic pathways. Hypothetically, the concept of ‘metabolite channelling’ may provide an explanation for these observations. A metabolic channel exists when metabolic intermediates are covalently bound to, and passed between, sequential active sites of a multi-functional enzyme or a multi-enzyme complex (Srere, 1987, 2000; Hrazdina and Jensen, 1992). It is postulated that this arrangement typically occurs where chemical intermediates have no other cellular function except in that particular biosynthetic pathway. When a metabolic channel exists, free pools of chemical intermediates are extremely small, if they exist at all. In this way, cellular solvent capacity is spared for the regulation and efficiency of the metabolic sequence, and also for containment of molecules having cytotoxic properties. 
Metabolic channels are thought to exist in many branches of plant secondary biosynthesis, and there is good evidence to suggest their participation in the complex regulation of resource partitioning from the end of the shikimate pathway into and through numerous divergent pathways, notably those of flavonoid and lignin biosynthesis (Anterola et al., 1999; Rasmussen and Dixon, 1999; Winkel-Shirley, 1999; Achnine et al., 2004). The results presented here, and those of Achnine et al. (2004), clearly indicate that analogous channelling mechanisms exist in the biosynthesis of phenolic compounds, and specifically in this case in poplar tree species. Traditional reverse phase HPLC was employed in order to validate the isolation and identification of monolignol precursors (Fig. 7). HPLC clearly demonstrated, in agreement with GC-MS, that the only lignin precursor that accumulated (pooled) differentially in the C4H-F5H transgenic when compared with wild-type plants was sinapyl alcohol. Given the location of the F5H in the lignin biosynthetic pathway, 5-hydroxyconiferaldehyde should accumulate in the differentiating cambial zone, should channelling not be occurring. This compound was not identified by either HPLC or GC-MS (verified by retention time and mass spectra from synthesized compound).

Fig. 7. Reverse phase high performance liquid chromatogram of cambial samples of wild-type and C4H-F5H transgenic plants following acid methanol extraction and detection at 280 nm.

Limited detection of phenolic molecules may be related to the choice of analytical tools. Even with sample derivatization, the molecular weight cut-off of gas chromatography ranges between 800 and 1000 Da.
Once derivatized, many phenolic and other compounds produced in plant tissues are larger than this and may not be resolved by GC-MS. Notably, this includes the glycosylated phenylpropanoid molecules thought to be storage and/or transportation forms of the monomers for lignin polymer assembly (Samuels et al., 2002). Given the functional role of F5H in lignin biosynthesis, located in the latter part of the phenylpropanoid pathway prior to the biosynthesis of glycosylated phenylpropanoids, there is a possibility that the direct metabolic impact of F5H up-regulation could be visible in the relative abundances of glycosylated monolignols. In order to resolve such large metabolites from crude tissue extracts, further analysis using complementary analytical techniques that have higher mass cut-offs is currently underway. To this end, extension of the research presented will focus on applying LC-MS-based profiling tools to the study of metabolism in this same poplar model system. Metabolite profiling of crude extracts derived from the cellular ‘bulk’ phase is confounded by another important limitation. It is not possible to detect, measure, or identify ‘product’ metabolites that establish physical associations with cellular structural components in the course of metabolism, and maintain them during extraction procedures. This point may be of great significance in the study of cell wall and wood biosynthesis by metabolite profiling. Pyrolysis-MS, with its ability to liberate entire tissue samples and analyse the resulting compounds may provide a solution to this, and is another analytical technique that warrants investigation. Summary Metabolite profiling analysis of compounds, which exhibit cellular pooling in the cambium and suspension-cultured tissue of hybrid poplar, revealed multiple series of metabolites that correlated with one another, in terms of relative abundance. 
The metabolic interaction networks represented by these series were either affected by a lignin-related C4H-F5H genetic modification, or remained consistent despite it. Thus, it was possible to distinguish between wild-type and modified lines exhibiting a range of phenotypic severity, on the basis of observable metabolic traits. Of particular interest were the apparent consistency of the amino acid-related pools between wild-type and transgenic lines, and the heavy role of carbohydrates in distinguishing between lines, despite a modification that related specifically to lignin biosynthesis. Using GC-MS and traditional reverse phase HPLC it was not possible to detect any intermediate metabolites (i.e. 5-hydroxyconiferaldehyde) that related directly to the C4H-F5H genetic modification. This suggests that bulk phase pools of phenolic secondary metabolites do not exist, and metabolite channelling during cell wall lignification occurs in the cambial and suspension cultures. This research has established an effective approach to the investigation of global metabolism in a model tree system, poplar. By analysing the relationships that exist between abundances of the small molecules that pool in plant tissue, it has been possible to define certain aspects of the metabolic space that links gene expression and phenotypic character. Funding for this project from the NSERC strategic program as well as CellFor Inc. is gratefully acknowledged. The authors would like to recognize the Top Achiever Doctoral Scholarship (Bright Future Scholarships, Tertiary Education Commission, Wellington, NZ) held by A Robinson. Finally, the authors would like to thank Robert St-Aubin for help in coding ‘PeakMatch’ and Dr C Chapple for his generous gift of transgenic plants. References Achnine L, Blancaflor EB, Rasmussen S, Dixon RA. 2004. Colocalization of L-phenylalanine ammonia-lyase and cinnamate 4-hydroxylase for metabolic channeling in phenylpropanoid biosynthesis. 
The Plant Cell 16, 3098–3109. Alemanno L, Ramos T, Gargadenec A, Andary C, Ferriere N. 2003. Localization and identification of phenolic compounds in Theobroma cacao L. somatic embryogenesis. Annals of Botany 92, 613–623. Anterola AM, Lewis NG. 2002. Trends in lignin modification: a comprehensive analysis of the effects of genetic manipulations/mutations on lignification and vascular integrity. Phytochemistry 61, 221–294. Anterola AM, van Rensburg H, van Heerden PS, Davin LB, Lewis NG. 1999. Multi-site modulation of flux during monolignol formation in loblolly pine (Pinus taeda). Biochemical and Biophysical Research Communications 261, 652–657. Bartel B. 1997. Auxin biosynthesis. Annual Review of Plant Physiology and Plant Molecular Biology 48, 49–64. Brunner AM, Busov VB, Strauss SH. 2004. Poplar genome sequence: functional genomics in an ecologically dominant plant species. Trends in Plant Science 9, 49–56. Campbell MM, Brunner AM, Jones HM, Strauss SH. 2003. Forestry's fertile crescent: the application of biotechnology to forest trees. Plant Biotechnology Journal 1, 141–154. Chen F, Duran AL, Blount JW, Sumner LW, Dixon RA. 2003. Profiling phenolic metabolites in transgenic alfalfa modified in lignin biosynthesis. Phytochemistry 64, 1013–1021. Comrey AL, Lee HB. 1992. A first course in factor analysis, 2nd edn. Hillsdale: Lawrence Erlbaum Associates. Conn EE. 1980. Cyanogenic compounds. Annual Review of Plant Physiology and Plant Molecular Biology 31, 433–451. Deluca V, Fernandez JA, Campbell D, Kurz WGW. 1988. Developmental regulation of enzymes of indole alkaloid biosynthesis in Catharanthus roseus. Plant Physiology 86, 447–450. Dixon RA, Chen F, Guo D, Parvathi K. 2001. The biosynthesis of monolignols: a ‘metabolic grid’, or independent pathways to guaiacyl and syringyl units?
Phytochemistry 57, 1069–1084. Donaldson LA. 2001. Lignification and lignin topochemistry: an ultrastructural view. Phytochemistry 57, 859–873. Dube A, Bharti S, Laloraya MM. 1992. Inhibition of anthocyanin synthesis by cobaltous ions in the 1st internode of Sorghum bicolor L. Moench. Journal of Experimental Botany 43, 1379–1382. Facchini PJ. 2001. Alkaloid biosynthesis in plants: biochemistry, cell biology, molecular regulation, and metabolic engineering applications. Annual Review of Plant Physiology and Plant Molecular Biology 52, 29–66. Fiehn O. 2001. Combining genomics, metabolome analysis, and biochemical modelling to understand metabolic networks. Comparative and Functional Genomics 2, 155–168. Fiehn O. 2002. Metabolomics: the link between genotypes and phenotypes. Plant Molecular Biology 48, 155–171. Fiehn O. 2003. Metabolic networks of Cucurbita maxima phloem. Phytochemistry 62, 875–886. Fiehn O, Kloska S, Altmann T. 2001. Integrated studies on plant biology using multiparallel techniques. Current Opinion in Biotechnology 12, 82–86. Fiehn O, Kopka J, Doermann P, Altmann T, Trethewey RN, Willmitzer L. 2000a. Metabolite profiling for plant functional genomics. Nature Biotechnology 18, 1157–1161. Fiehn O, Kopka J, Trethewey RN, Willmitzer L. 2000b. Identification of uncommon plant metabolites based on calculation of elemental compositions using gas chromatography and quadrupole mass spectrometry. Analytical Chemistry 72, 3573–3580. Fiehn O, Weckwerth W. 2003. Deciphering metabolic networks. European Journal of Biochemistry 270, 579–588. Franke R, McMichael CM, Meyer K, Shirley AM, Cusumano JC, Chapple C. 2000. Modified lignin in tobacco and poplar plants over-expressing the Arabidopsis gene encoding ferulate 5-hydroxylase. The Plant Journal 22, 223–234.
Frenzel T, Miller A, Engel K-H. 2002. Metabolite profiling: a fractionation method for analysis of major and minor compounds in rice grains. Cereal Chemistry 79, 215–221. Giovanelli J, Mudd SH, Datko AH. 1988. In vivo regulation of threonine and isoleucine biosynthesis in Lemna paucicostata Hegelm-6746. Plant Physiology 86, 369–377. Hrazdina G, Jensen RA. 1992. Spatial organization of enzymes in plant metabolic pathways. Annual Review of Plant Physiology and Plant Molecular Biology 43, 241–267. Humphreys JM, Chapple C. 2002. Rewriting the lignin roadmap. Current Opinion in Plant Biology 5, 224–229. Huntley SK, Ellis D, Gilbert M, Chapple C, Mansfield SD. 2003. Significant increases in pulping efficiency in C4H-F5H-transformed poplars: improved chemical savings and reduced environmental toxins. Journal of Agricultural and Food Chemistry 51, 6178–6183. Koch BM, Sibbesen O, Halkier BA, Svendsen I, Moller BL. 1995. The primary sequence of cytochrome P450tyr, the multifunctional N-hydroxylase catalyzing the conversion of L-tyrosine to p-hydroxyphenylacetaldehyde oxime in the biosynthesis of the cyanogenic glucoside dhurrin in Sorghum bicolor (L) Moench. Archives of Biochemistry and Biophysics 323, 177–186. Kolosova N, Miller B, Ralph S, Ellis BE, Douglas C, Ritland K, Bohlmann J. 2004. Isolation of high-quality RNA from gymnosperm and angiosperm trees. BioTechniques 36, 821–824. Li L, Zhou Y, Cheng X, Sun J, Marita JM, Ralph J, Chiang VL. 2003. Combinatorial modification of multiple lignin traits in trees through multigene cotransformation. Proceedings of the National Academy of Sciences, USA 100, 4939–4944. McCown BH, Lloyd G. 1981. Woody Plant Medium: a mineral nutrient formulation for micro culture of woody plant species. HortScience 16, 453. Morris CR, Scott JT, Chang H-M, Sederoff RR, O'Malley D, Kadla JF.
2004. Metabolic profiling: a new tool in the study of wood formation. Journal of Agricultural and Food Chemistry 52, 1427–1434. Nielsen NPV, Carstensen JM, Smedsgaard J. 1998. Aligning of single and multiple wavelength chromatographic profiles for chemometric data analysis using correlation optimised warping. Journal of Chromatography 805, 17–35. Pedras MSC, Jha M, Ahiahonu PWK. 2003. The synthesis and biosynthesis of phytoalexins produced by cruciferous plants. Current Organic Chemistry 7, 1635–1647. Pilate G, Guiney E, Holt K, et al. 2002. Field and pulping performances of transgenic trees with altered lignification. Nature Biotechnology 20, 607–612. Rasmussen S, Dixon RA. 1999. Transgene-mediated and elicitor-induced perturbation of metabolic channeling at the entry point into the phenylpropanoid pathway. The Plant Cell 11, 1537–1551. Roessner U, Luedemann A, Brust D, Fiehn O, Linke T, Willmitzer L, Fernie AR. 2001a. Metabolic profiling allows comprehensive phenotyping of genetically or environmentally modified plant systems. The Plant Cell 13, 11–29. Roessner U, Willmitzer L, Fernie AR. 2001b. High-resolution metabolic phenotyping of genetically and environmentally diverse potato tuber systems. Identification of phenocopies. Plant Physiology 127, 749–764. Rogers LA, Campbell MM. 2004. The genetic control of lignin deposition during plant growth and development. New Phytologist 164, 17–30. Sakuta M, Hirano H, Komamine A. 1991. Stimulation by 2,4-dichlorophenoxyacetic acid of betacyanin accumulation in suspension-cultures of Phytolacca americana. Physiologia Plantarum 83, 154–158. Samuels AL, Rensing KH, Douglas CJ, Mansfield SD, Dharmawardhana DP, Ellis BE. 2002. Cellular machinery of wood production: differentiation of secondary xylem in Pinus contorta var. latifolia. Planta 216, 72–82. Srere PA. 1987.
Complexes of sequential metabolic enzymes. Annual Review of Biochemistry 56, 89–124. Srere PA. 2000. Macromolecular interactions: tracing the roots. Trends in Biochemical Sciences 25, 150–153. Suarez MF, Avila C, Gallardo F, Canton FR, Garcia-Gutierrez A, Claros MG, Canovas FM. 2002. Molecular and enzymatic analysis of ammonium assimilation in woody plants. Journal of Experimental Botany 53, 891–904. Tabachnick BG, Fidell LS. 2001. Using multivariate statistics, 4th edn. Boston: Allyn & Bacon. Tolstikov VV, Fiehn O. 2002. Analysis of highly polar compounds of plant origin: combination of hydrophilic interaction chromatography and electrospray ion trap mass spectrometry. Analytical Biochemistry 301, 298–307. Whetten R, Sederoff R. 1995. Lignin biosynthesis. The Plant Cell 7, 1001–1013. Winkel-Shirley B. 1999. Evidence for enzyme complexes in the phenylpropanoid and flavonoid pathways. Physiologia Plantarum 107, 142–149. Zhao JM, Last RL. 1996. Coordinate regulation of the tryptophan biosynthetic pathway and indolic phytoalexin accumulation in Arabidopsis. The Plant Cell 8, 2235–2244. © The Author [2005]. Published by Oxford University Press [on behalf of the Society for Experimental Biology]. All rights reserved. For Permissions, please e-mail: journals.permissions@oxfordjournals.org TI - The potential of metabolite profiling as a selection tool for genotype discrimination in Populus JF - Journal of Experimental Botany DO - 10.1093/jxb/eri273 DA - 2005-09-05 UR - https://www.deepdyve.com/lp/oxford-university-press/the-potential-of-metabolite-profiling-as-a-selection-tool-for-genotype-Nw6VB0g0m2 SP - 2807 EP - 2819 VL - 56 IS - 421 DP - DeepDyve ER -