Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis *

Technological Innovation and Resources

© 2012 by The American Society for Biochemistry and Molecular Biology, Inc. This paper is available on line at http://www.mcponline.org

Ludovic C. Gillet‡, Pedro Navarro‡, Stephen Tate§, Hannes Röst‡, Nathalie Selevsek‡, Lukas Reiter‡¶, Ron Bonner§, and Ruedi Aebersold‡**

From the ‡Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich (ETH Zürich), 8093 Zürich, Switzerland, §ABSciex, Concord, L4K 4V8 Ontario, Canada, ¶Biognosys AG, 8952 Schlieren, Switzerland, and the Faculty of Science, University of Zürich, 8057 Zürich, Switzerland

Received December 19, 2011, and in revised form, January 18, 2012. Published, MCP Papers in Press, January 18, 2012, DOI 10.1074/mcp.O111.016717

This is an open access article under the CC BY license.

Molecular & Cellular Proteomics 11.6, 10.1074/mcp.O111.016717–1

The abbreviations used are: LC-MS/MS, liquid chromatography coupled to tandem mass spectrometry; DDA, data-dependent acquisition; DIA, data-independent acquisition; SRM, selected reaction monitoring; RT, retention time; LOD, limit of detection.

Most proteomic studies use liquid chromatography coupled to tandem mass spectrometry to identify and quantify the peptides generated by the proteolysis of a biological sample. However, with the current methods it remains challenging to rapidly, consistently, reproducibly, accurately, and sensitively detect and quantify large fractions of proteomes across multiple samples. Here we present a new strategy that systematically queries sample sets for the presence and quantity of essentially any protein of interest. It consists of using the information available in fragment ion spectral libraries to mine the complete fragment ion maps generated using a data-independent acquisition method. For this study, the data were acquired on a fast, high resolution quadrupole-quadrupole time-of-flight (TOF) instrument by repeatedly cycling through 32 consecutive 25-Da precursor isolation windows (swaths). This SWATH MS acquisition setup generates, in a single sample injection, time-resolved fragment ion spectra for all the analytes detectable within the 400–1200 m/z precursor range and the user-defined retention time window. We show that suitable combinations of fragment ions extracted from these data sets are sufficiently specific to confidently identify query peptides over a dynamic range of 4 orders of magnitude, even if the precursors of the queried peptides are not detectable in the survey scans. We also show that queried peptides are quantified with a consistency and accuracy comparable with that of selected reaction monitoring, the gold standard proteomic quantification method. Moreover, targeted data extraction enables ad libitum quantification refinement and dynamic extension of protein probing by iterative re-mining of the once-and-forever acquired data sets. This combination of unbiased, broad range precursor ion fragmentation and targeted data extraction alleviates most constraints of present proteomic methods and should be equally applicable to the comprehensive analysis of other classes of analytes, beyond proteomics. Molecular & Cellular Proteomics 11: 10.1074/mcp.O111.016717, 1–17, 2012.

Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is considered the method of choice for the identification and quantification of proteins and proteomes (1–4) and for the analysis of metabolites, lipids, glycans, and many other types of (bio)molecules. For proteomics, two main LC-MS/MS strategies have been used thus far. They have in common that the sample proteins are converted by proteolysis into peptides, which are then separated by (capillary) liquid chromatography. They differ in the mass spectrometric method used. The first and most widely used strategy is known as shotgun or discovery proteomics. For this method, the MS instrument is operated in data-dependent acquisition (DDA) mode, where fragment ion (MS2) spectra for selected precursor ions detectable in a survey (MS1) scan are generated (5). The resulting fragment ion spectra are then assigned to their corresponding peptide sequences by sequence database searching (6, 7). The second main strategy is referred to as targeted proteomics. There, the MS instrument is operated in selected reaction monitoring (SRM) (also called multiple reaction monitoring) mode. With this method, a sample is queried for the presence and quantity of a limited set of peptides that have to be specified prior to data acquisition. SRM does not require the explicit detection of the targeted precursors but proceeds by the acquisition, sequentially across the LC retention time domain, of predefined pairs of precursor and product ion masses, called transitions, several of which constitute a definitive assay for the detection of a peptide in a complex sample (8). Data analysis in targeted proteomics essentially consists of computing the likelihood that a group of transition signal traces are derived from the targeted peptide (9).

Both methods have different and largely complementary preferred uses and performance profiles that have been extensively discussed elsewhere (10). Specifically, shotgun proteomics is the method of choice for discovering the maximal number of proteins from one or a few samples. It does, however, have limited quantification capabilities on large sample sets because of stochastic and irreproducible precursor ion selection (11) and under-sampling (12). In contrast, targeted proteomics is well suited for the reproducible detection and accurate quantification of sets of specific proteins in many samples, as is the case in biomarker or systems biology studies (13–15). At present, however, the method is limited to the measurement of a few thousand transitions per LC-MS/MS run (16). It therefore lacks the throughput to routinely quantify large fractions of a proteome.

To alleviate the limitations of either method, strategies have been developed that rely on neither detection nor knowledge of the precursor ions to trigger acquisition of fragment ion spectra. Those methods operate via unbiased “data-independent acquisition” (DIA), that is, the cyclic recording, throughout the LC time range, of consecutive survey scans and fragment ion spectra for all the precursors contained in predetermined isolation windows. Various implementations of DIA methods have already been described using isolation windows of various widths, ranging from the complete m/z range to a few daltons (17–24) (Table I). Using such scans, the link between the fragment ions and the precursors from which they originate is lost, complicating the analysis of the acquired data sets. Also, large selection window widths increase the number of concurrently fragmented precursors and therefore the complexity of the acquired composite fragment ion spectra. To date, the composite spectra generated by DIA methods have been principally analyzed with the standard database searching tools developed for DDA, either by searching the composite MS2 spectra directly (18, 20) or by searching pseudo MS2 spectra reconstituted postacquisition based on the co-elution profiles of precursor ions (from the survey scans) and of their potentially corresponding fragment ions (22, 25–28).

TABLE I. LC time-resolved data-independent acquisition setups: description and current performance profiles. The numbers reported here correspond to the percentages of peptides observable with at least five interference-free transitions in a yeast background, as reported in Fig. 2B.

Here, we report an alternative approach to proteome quantification that combines a high specificity DIA method with a novel targeted data extraction strategy to mine the resulting fragment ion data sets. For the data acquisition, we implement the sequential isolation window acquisition principle introduced by former DIA studies (18, 20) on a high resolution MS instrument. This time- and mass-segmented acquisition method generates, in a single injection, fragment ion spectra of all precursor ions within a user-defined precursor RT and m/z space and records the ensemble of these fragment ion spectra as complex fragment ion maps. Using computer simulations we show that the resulting maps achieve the highest fragment ion specificity of any DIA method described to date. We term this acquisition strategy “SWATH MS,” in reference to the swaths, i.e., the series of isolation windows acquired for a given precursor mass range across the LC.

To analyze the high specificity, multiplexed data sets generated by SWATH MS, we developed a novel data analysis strategy that fundamentally differs from the database search approaches used so far to identify peptides from DIA data sets. It consists of using a targeted data extraction strategy to query the acquired fragment ion maps for the presence and quantity of specific peptides of interest, using a priori information contained in spectral libraries. Practically, the fragment ion signals, their relative intensities, chromatographic concurrence, and other information accessible from a spectral library for each targeted peptide are used to mine the DIA fragment ion maps for constellations of signals that precisely correlate with the known coordinates of a targeted peptide, thus uniquely identifying the peptide in the map. The extraction of fragment ion traces from data-independently acquired sample sets has been reported for the quantification of formerly identified peptides (18); however, this strategy has never been purposely used to systematically search and identify peptides from the fragment ion maps of DIA data sets. Indeed, it is only with the increasing availability of proteome-wide spectral libraries that this targeted data extraction strategy becomes broadly applicable to mine the acquired data sets for peptides never identified thus far with regular shotgun proteomics approaches.

We show that the combination of high specificity fragment ion maps and targeted data analysis using information from spectral libraries of complete organisms offers unprecedented possibilities for the qualitative and quantitative probing of proteomes. This approach should be applicable beyond proteomics to other “omics” measurements, including metabolomics and lipidomics, or to forensics or biomedical analytics fields, which require accurate quantitative analysis of as many analytes as possible from a single LC-MS/MS sample injection.

MATERIALS AND METHODS

LC-MS Sample Acquisition—A commercial 5600 TripleTOF™ (ABSciex, Concord, Canada) was used for all the experiments. The instrument was coupled with an Eksigent 1D Nano LC system (Eksigent, Dublin, CA) for the stable isotope dilution experiments or with an Eksigent NanoLC-2DPlus with nanoFlex cHiPLC system for the diauxic shift sample acquisition. The same solvents were used on both LC systems, with solvent A being composed of 0.1% (v/v) formic acid in water and solvent B comprising 95% (v/v) acetonitrile with 0.1% (v/v) formic acid. The serial dilution experiments were performed with a custom-packed emitter, which was created using a laser puller to an orifice of 4 μm and packed with 3-μm Zorbax C18 column using a pressure bomb. The samples were loaded directly onto this column from the nano LC system at a flow rate of 500 nl/min. The loaded material was eluted from this column in a linear gradient of 5% solvent B to 30% solvent B over 90 min. The column was regenerated by washing at 90% solvent B for 10 min and re-equilibrated at 5% solvent B for 10 min. The diauxic shift sample acquisitions were performed using a “trap and elute” configuration on the nanoFlex system. The trap column (200 μm × 0.5 mm) and the analytical column (75 μm × 15 cm) were packed with 3 μm ChromXP C18 medium. The samples were loaded at a flow rate of 2 μl/min for 10 min and eluted from the analytical column at a flow rate of 300 nl/min in a linear gradient of 5% solvent B to 35% solvent B in 155 min. The column was regenerated by washing at 80% solvent B for 10 min and re-equilibrated at 5% solvent B for 10 min.

For standard data-dependent analysis experiments, the mass spectrometer was operated in a manner where a 250-ms survey scan (TOF-MS) was collected, from which the top 20 ions were selected for automated MS/MS in subsequent experiments where each MS/MS event consisted of a 50-ms scan. The selection criteria for parent ions included intensity, where ions had to be greater than 150 counts/s with a charge state greater than 1 and were not present on the dynamic exclusion list. Once an ion had been fragmented by MS/MS, its mass and isotopes were excluded for a period of 15 s. Ions were isolated using a quadrupole resolution of 0.7 Da and fragmented in the collision cell using collision energy ramped from 15 to 45 eV within the 50-ms accumulation time. In the instances where there were fewer than 20 parent ions that met the selection criteria, those ions that did were subjected to longer accumulation times to maintain a constant total cycle time of 1.25 s.

For SWATH MS-based experiments, the mass spectrometer was operated in a looped product ion mode. In this mode, the instrument was specifically tuned to allow a quadrupole resolution of 25 Da/mass selection. The stability of the mass selection was maintained by the operation of the Radio Frequency (RF) and Direct Current (DC) voltages on the isolation quadrupole in an independent manner. Using an isolation width of 26 Da (25 Da of optimal ion transmission efficiency + 1 Da for the window overlap), a set of 32 overlapping windows was constructed covering the mass range 400–1200 Da. Consecutive swaths need to be acquired with some precursor isolation window overlap to ensure the transfer of the complete isotopic pattern of any given precursor ion in at least one isolation window and thereby to maintain optimal correlation between parent and fragment isotope peaks at any LC time point (supplemental Fig. S1, a–f). This overlap was reduced to a minimum of 1 Da, which experimentally matched the almost square shape of the fragment ion transmission profile achieved through the specific quadrupole tuning developed for SWATH MS (supplemental Fig. S1, g and h). The window setups used for these runs were as follows: Experiment 1: MS1 scan (see below); Experiment 2: 400–426; Experiment 3: 425–451… Experiment 33: 1175–1201. Those isolation windows of 26-Da width (25 Da + 1 Da) are the “nominal” windows used to compute the RF/DC voltages used to drive the isolation quadrupole during the acquisition. However, because the isolation windows are only “almost square shapes” (supplemental Fig. S1, g and h), 0.3–0.5 Da of ion transmission can be estimated as being “lost” on either side of the windows. The “100% efficient” transmission of precursor ions is therefore happening only for 25 Da effectively. In other words, the “effective” isolation windows can be considered as being 400.5–425.5, 425.5–450.5, etc. (plus the potential overlap left from the nominal window transmission). The collision energy for each window was determined based on the appropriate collision energy for a 2+ ion centered upon the window with a spread of 15 eV. This ensured optimal fragmentation for the broad range of precursors co-selected within the isolation windows. An accumulation time of 100 ms was used for each fragment ion scan and for the (optional) survey scans acquired at the beginning of each cycle. This results in a total duty cycle of 3.3 s (3.2 s total for stepping through the 32 isolation windows + 0.1 s for the optional survey scan). The mass resolution was between 15,000 and 30,000 for the MS/MS scans, depending on the mode used to record the SWATH MS data sets (high sensitivity or high resolution). For this study, the high sensitivity mode was used, which still allows accurate extraction of the fragment ion masses at 10–50 ppm accuracy (optimal extraction for the area under curve of the MS/MS profile signals at half peak width).

Fragment Ion Interference Simulations—To generate the background for our simulations, the Saccharomyces cerevisiae protein sequences were downloaded from ensembl.org (release 57_1j). The peptide set resulting from trypsin proteolysis (no missed cleavages) was generated in silico using carbamidomethyl cysteine as fixed modification. We then selected the peptides with theoretical precursor ion charge states 2+ and 3+ and with the monoisotopic and the first ¹³C isotopic masses (+0 and +1 Da) within the mass range of 400 to 1,200 m/z. For each of those precursor ions, the theoretical set of fragment ions was generated (all b and y ions of charge 1+ and 2+), giving rise to transition pairs. This data set contained 111,880 peptides (corresponding to 6,557 proteins) resulting in 194,314 doubly and triply charged precursors (388,781 overall, taking into account the monoisotope and first ¹³C isotope) and in 10,004,504 transitions altogether that thus constituted the background of our simulations. We also prepared a reduced data set that only contained the precursors of peptides that were reported in the PeptideAtlas (Yeast PeptideAtlas 200904 build, also containing the MS-identified modifications and nontryptic peptides). This reduced data set contained 48,087 peptides (corresponding to 3,898 proteins), resulting in 93,875 doubly and triply charged precursors (187,777 overall, taking into account the monoisotope and first ¹³C isotope) and in 5,476,964 transitions altogether that we used as a more realistic proteomic background. The retention times of the peptides were computed using the SSRCalc algorithm (50). To estimate the number of SRM interferences, we generated in silico query assays for all proteotypic peptides of the yeast genome as targets. We considered all singly charged b and y transitions of the monoisotopic 2+ precursor of those peptides as targets and ran them against the computed backgrounds (theoretical yeast digest or PeptideAtlas) and recorded an interference whenever a transition from the background (that did not belong to the query peptide) was within a specified distance of Q1, Q3, and RT from the target queried peptide. For each target peptide, the number of transitions that were interfered with was recorded and later used to compute the statistics. The detailed algorithm for the computation of the product ion interferences will be the subject of a separate study (51). This algorithm essentially expands on the principle of the “unique ion signatures” described by Sherman et al. (29) by taking into account peptide RT as an additional constraint for the calculation of fragment ion interferences. It should be acknowledged that the current algorithm does not simulate the peptide signal intensities. Even if, in theory, the different MS response factors of those peptides could be retrieved from, for example, the PeptideAtlas database, it is unlikely that those response factors can be extrapolated from one sample to another or from one study to the other, because of ion suppression effects during the ionization. However, it was not the aim of these theoretical simulations to perfectly depict the reality, but rather to give an impression about the overall ranking of the different Q1/Q3 scenarios. In this respect, the simulations are valid because upon increasing the background for the fragment ion simulations (from 93,875 to 194,314 precursors), the overall ranking of the scenarios is maintained. This means that those simulations may not capture the exact reality but can be perfectly used as a tool to compare the extent of fragment ion interference for different scenarios, exactly as it is applied here. Finally, a recent study has experimentally quantified the extent of fragment ion interference in a human cell lysate (12), and the results are in good agreement with our simulations overall.

Serial Dilution Samples for the Limit of Detection, Limit of Quantification, and Intrascan Dynamic Range Assessment—All of the isotopically labeled reference peptides were ordered from Thermo or Sigma-Aldrich with amino acid analysis-certified concentrations. The sequences of the 61 peptides used for the limit of detection experiment, as well as their precursor ion masses (defining the swath in which to extract the daughter ions) and fragment ion masses (used to extract the fragment ion chromatograms), are provided in supplemental Table 1. From those 61 peptides, 23 were kept at a constant concentration of 23.5 fmol/μl (47 fmol total amount loaded on column for a 2-μl injection) in all the samples, whereas the other 38 peptides were serially diluted from 23.5 fmol/μl to 45 amol/μl (from 47 fmol to 91 amol total amount loaded on column for a 2-μl injection) in 2-fold steps. The reference peptides were spiked in a 1 μg/μl concentrated ¹⁵N-labeled yeast tryptic digest as proteomic background (¹⁵N-labeled yeast trypsin digest was used as proteomic background to avoid the interference between the b fragment ions from endogenous ¹⁴N yeast peptides and those from the reference peptides). The peptides kept at the constant concentration through all the samples were used to estimate the coefficient of variation throughout the experiment, whereas the other 38 diluted peptides were used to estimate the limit of detection and limit of quantification of the SWATH MS acquisition method. The raw values (nondenoised, nonsmoothed) of the extracted peak areas for each of the fragment ions of the peptides in each of the samples are provided in the supplemental Table 1. For the limit of detection and quantification, a peak group composed of the three extracted fragment ion traces and a signal-to-noise ratio above 3 (limit of detection) or 10 (limit of quantification) for the considered peak were required. For the comparison with the MS1/label-free based analysis, the precursor ion traces were extracted from the survey scans of the SWATH MS acquisition. Because the accumulation time in SWATH MS mode was 100 ms for each fragment ion scan and for the survey scans, the peak areas of the fragment ion traces (extracted from their swath) and of the precursor ion traces (extracted from the survey scan) were directly and fairly comparable. Increasing the acquisition time for the survey scan could have increased the signal of the precursors of interest but would have also equally increased the overall noise/interference signals, therefore not really affecting the detection/quantification of those (supplemental Fig. S7, a5–a8). Also, 100-ms acquisition time for MS1 scans is anyway in the range of what may be experimentally used during a typical shotgun experiment aiming at high numbers of precursor ion selection and identification while still allowing for MS1 label-free quantification (48). For the comparison with shotgun analysis, the same serial dilution samples that had been acquired by SWATH MS were reacquired with a “top 20” DDA method and searched with Mascot for peptide identification.

The peptides used for the intrascan dynamic range experiment have already been used in a different context in our laboratory (49). They consist of the two following sequences: AADITSL*YK* (where L* indicates ¹³C₆,¹⁵N, and K* indicates ¹³C₆,¹⁵N₂) for the doubly isotopically labeled peptide 1 and AADITSLYK* (where K* indicates ¹³C₆,¹⁵N₂) for the singly isotopically labeled peptide 2. For the intrascan dynamic range experiment, the singly isotopically labeled peptide 2 was kept at a constant concentration of 625 fmol/μl (1.25 pmol total amount loaded on column for a 2-μl injection) in all the samples, whereas the doubly isotopically labeled peptide 1 was serially diluted from 625 fmol/μl to 305 amol/μl (from 1.25 pmol to 610 amol total amount loaded on column for a 2-μl injection) in 2-fold steps. Those peptides were spiked in a 1 μg/μl concentrated yeast tryptic digest as proteomic background. The samples were acquired by SWATH MS. The fragment ion traces of the three most intense fragments were extracted from their swath (475–500 m/z in this case), and the precursor ion traces were extracted from the survey scans from the SWATH MS data sets. The raw values (nondenoised and nonsmoothed) of those extracted peak areas in each of the samples are provided in the supplemental Table 2. The fragment ion traces and MS/MS spectra around the y7 fragment at the RT of peptide elution are provided as supplemental Fig. S6.

Data Analysis—As for SRM targeted acquisition, the three to five most intense fragments (of proteotypic peptides, as reported in the spectral libraries) were typically selected to perform the targeted data analysis of the SWATH MS data sets. Because those MS/MS spectra libraries were usually generated on low resolution instruments (i.e., triple quadrupoles or ion traps), the high mass accuracy value of those fragments was recalculated theoretically based on the amino acid sequence of the peptide. Those high mass accuracy fragment ion masses were then used as seeds for extracting ion chromatograms in the right LC-MS/MS swath map (indicated by the precursor ion mass). In the case of “borderline” peptides (i.e., peptides with precursor mass falling at an edge of an isolation window or in the zone of isolation window overlap), the fragment ion traces were extracted in the swath with borders furthest away from the precursor ion mass and/or containing most of the isotopic distribution of the precursor ion. All of the fragment ion chromatograms were extracted and automatically integrated with PeakView (v. 1.1.0.0). The raw peak areas as reported by PeakView were used for all the quantification calculations with no data processing (neither denoising nor smoothing, etc.) of any kind applied to the extracted ion chromatograms.

To assess the detection of a peptide, we used as a first pass the LC validation criteria for the extracted fragment ion traces suggested by Reiter et al. (9): co-elution, peak shape similarity, correlation of the relative intensities with reference spectra, correlation of the relative intensities with those of a spike-in reference peptide, co-elution with spiked-in reference, and peak shape similarity with spiked-in reference. The peptide retention time (predicted, e.g., with SSRCalc or experimental when available in spectral libraries) may be used to reduce the chromatographic space where to look for the targeted peak group (similarly to scheduled versus nonscheduled SRM). This is not absolutely necessary for intense signals, but the gain in identification validation can be important for lower intensity signals (or for noisy fragment ion signals over the LC space). To anticipate when a peptide is expected to elute, a simple retention time re-alignment for each gradient or column can be performed to recalculate the retention time relative to its RT available in the spectral library. This is typically done by using a set of reference peptides (relative to which the retention times of the peptides in the spectral library were recorded), which are spiked into each sample prior to the SWATH MS measurement. Those peptides are used to recalibrate the retention times for each specific run and to help to restrict the extraction of peptides from the library to a reasonable elution time window in each SWATH MS run. In contrast to SRM measurements, the inclusion of such reference peptides therefore does not consume additional data acquisition time.

Finally, several SWATH MS-specific additional criteria may be used to confirm (or invalidate) the peptide identification: confirmation that the extracted fragment ions correspond to monoisotopic signals and verification of the charge state of those fragments in the full MS/MS spectra extracted from the swath around the apex of the candidate peak group, assessment of the mass accuracy (typically 5–10 ppm) of the extracted fragment ions, etc. A step-by-step tutorial describing the manual or automated targeted data analysis of SWATH MS data sets is provided in the supplemental materials.

Database Search of the Reference Peptides of the Dilution Series—The Mascot database search analysis for the reference peptides of the dilution series was performed on Mascot v. 2.4 with a self-compiled database comprising the 61 reference peptides grouped in three artificial proteins based on their abundances as reported in the Weissman list (32). The enzyme selected was trypsin with no missed cleavage. A search tolerance of 50 ppm was specified for the peptide mass tolerance and 0.05 Da for the MS/MS tolerance. The charges of the peptides to search for were set to 2+, 3+, and 4+. The search was set on monoisotopic mass. The instrument was set to ESI-QUAD-TOF. The following modifications were specified for the search: carbamidomethyl cysteines as fixed modification, C-terminal heavy lysine, C-terminal heavy arginine, and oxidized methionine as variable modification.

Diauxic Shift Samples for the Quantification Accuracy Assessment—The diauxic shift samples used in our experiments were the same as those analyzed by SRM in an earlier study (15). The sample set prepared for that study consisted of the tryptic digest of a mixture of (i) a lysate of yeast cells grown in regular ¹⁴N medium and sampled throughout the metabolic shift from fermentation to respiration spiked with (ii) a constant ¹⁵N-labeled yeast lysate background used as an internal standard for the fold change calculations. For our analysis, we reacquired, in SWATH MS mode, samples 1 and 8 prepared for that study, which constitute the start and end points of the diauxic shift (15). The biological triplicates of samples 1 and 8, prepared for the SRM study, were pooled for the SWATH MS reacquisition to reach enough volume for the sample injection. However, this pooling prevented us from comparing the standard deviations directly between the two studies (the SRM analysis contained the biological sample preparation variability from the triplicate analysis). Therefore, we decided not to present the error bars corresponding to the SRM and SWATH MS standard deviations in Fig. 4B, because they actually captured different information. The error bars were, however, reported individually for the SRM and the SWATH MS study in separate plots provided in supplemental Table 5 for the reader to appreciate the low standard deviations achieved by the SWATH MS quantification. The SWATH MS data analysis consisted of extracting the fragment ion traces of the precursors from the corresponding swath and in reporting the peak areas automatically integrated by PeakView for those extracted chromatograms. To avoid introducing any bias in the data analysis, the calculation of the abundance fold changes of the proteins was exactly copied from that of the SRM study (15). In short, for each peptide transition, the ratio of the light over heavy peak areas was calculated individually for each sample (samples 1 and 8); the abundance fold change for each transition was then calculated by dividing each transition ratio of sample 8 by the corresponding ratio of sample 1; the final abundance fold change of a protein was then calculated by averaging the individual abundance fold change of each of its transitions. The raw values (nondenoised and nonsmoothed) of those extracted peak areas in each of the samples are provided in supplemental Tables 4 and 6–8.

To query the SWATH MS data sets for the 60 yeast mitochondrial proteins involved in oxidative phosphorylation of the respiratory chain (as listed in the Kyoto Encyclopedia of Genes and Genomes website, http://www.genome.jp/kegg/), we used 287 proteotypic peptide assays that were available in our spectral libraries. The relatively low success rate for the identification of these mitochondrial proteins (36 of 60 proteins identified) can be explained by the fact that the protocol used for the preparation of those samples was at that time originally devised for the quantification of the metabolic enzymes analyzed by SRM (15) and was therefore not optimized for the recovery of mitochondrial membrane proteins.

The raw (.wiff) files of the diauxic shift samples 1 and 8 acquired in SWATH MS mode may be downloaded from ProteomeCommons.org Tranche using the following hash code: He8q40Zqudc27nm V1fUpqMhPmhVzVVlYqDMNFKcI9dVSZzGkInuXjK9Mg7iBexSZ6eY mGCRkYYp5TgSD6FTqcC5qW2sAAAAAAAAGCg.

RESULTS

We describe a new concept for the accurate, reproducible, high throughput identification and quantification of proteomes by mass spectrometry. It combines a high specificity data-independent LC-MS/MS acquisition method with a targeted data extraction and analysis strategy.

Data-independent Data Acquisition

The acquisition method essentially extends the DIA approach initially described by Venable et al. (18). It consists of recording consecutive high resolution fragment ion spectra of all precursors within a user-defined precursor ion window. This is achieved by stepping the precursor isolation window of a quadrupole-quadrupole TOF instrument in 25-Da increments (defining the swath width) recursively during the entire LC separation (Fig. 1A). At 100-ms accumulation time per swath, the quadrupole-accessible 400–1200 m/z range is covered in 32 steps for a total cycle time of 3.2 s, which is sufficient to reconstruct the 30-s chromatographic peak of each analyte for accurate quantification. The data structure can thus be conceptualized as 32 successive MS2 maps consisting of the composite fragment ion spectra from all the analytes fragmented in each swath (Fig. 1B). Similar to other windowed DIA methods (20, 23), consecutive swaths were acquired with some precursor isolation window overlap to ensure the transfer of the complete isotopic pattern of any given precursor ion in at least one isolation window and to thereby maintain optimal correlation between parent and fragment isotope peaks at any LC time point (supplemental Fig. […]

FIG. 1. SWATH MS data-independent acquisition and targeted data analysis. A, the data-independent acquisition method consists of the consecutive acquisition of high resolution, accurate mass fragment ion spectra during the entire chromatographic elution (retention time) range by repeatedly stepping through 32 discrete precursor isolation windows of 25-Da width (black double arrows) across the 400–1200 m/z range. The series of isolation windows acquired for a given precursor mass range and across the LC is referred to as a “swath” (e.g., series of […]

[…]tion of the fragment ion signals (supplemental Fig. S2). Using computer simulations, we assessed whether the signals in the complex fragment ion maps acquired by SWATH MS were sufficiently specific to support conclusive identification and quantification of peptides. As a benchmark, we used the specificity and accuracy levels of SRM, the gold standard MS quantification method. With the tool “SRM-Collider,” we computed the occurrence of fragment ion interferences for various combinations of precursor isolation window width and fragment ion mass accuracy. This tool extends the principle of the unique ion signatures described by Sherman et al. (29) by taking into account peptide RT as an additional constraint for the calculation of fragment ion interferences. As the basis for the simulations, we computed theoretical fragment ion spectra for 93,875 doubly and triply charged precursors corresponding to the tryptic peptides of 3,898 yeast proteins reported in the PeptideAtlas database (www.peptideatlas.org).
Those represent essentially the complete yeast pro- S1, a–f). This overlap was reduced here to a mere minimum of teome observable by mass spectrometry (30) and constitute 1 Da. This value experimentally matched the almost square therefore a realistic proteomic background. Cumulative plots shape of the fragment ion transmission profile (supplemental depicting the percentage of peptides observable with a given Fig. S1, g and h), which was achieved through specific qua- number of interference-free transitions as a measure for cor- drupole tuning purposely developed for SWATH MS. Finally, rect peptide identification and quantification are shown in Fig. to ensure optimal fragmentation for the broad range of pre- 2A for SRM (0.7- and 1-Da isolation widths for precursor and cursors co-selected within each isolation window, a 15 eV fragment ions, respectively) and SWATH MS (25-Da swath ramping of collision energy was used, centered around the width, 10-ppm fragment ion accuracy) scenarios. A histogram optimal collision energy required to fragment a doubly representing the percentage of peptides with five or more charged precursor centered in the middle of the isolation interference-free transitions is shown in Fig. 2B. Both figures window. show that SWATH MS provides a fragment ion specificity that Like other DIA methods (17–24), SWATH MS performance is comparable with that achieved with standard SRM setups. is directly impacted by the width of the precursor isolation Because the extensive shotgun data sets from yeast pro- window. In principle, large isolation windows are preferable to teome mapping studies possibly underestimate the complex- cycle through a wider precursor mass range with faster cy- ity of real samples, we compared the specificity of SWATH cling rates or with increased dwell times. However, large MS fragment ion maps with the specificity achievable by SRM isolation widths increase the number of precursors concur- in a more complex background. 
The simulations were re- rently fragmented in the respective window, increasing the peated by including all of the doubly and triply charged pre- likelihood of overlap of fragment ions from different precur- cursors (194,314 precursors, corresponding to 6,557 proteins sors (fragment ion interference). The rate of fragment ion (data from ensembl.org) of a complete in silico yeast tryptic interference also depends on the mass accuracy and resolu- digest. As expected, the extent of fragment ion interferences the red double arrows). The cycle time is defined as the time required to return to the acquisition of the same precursor isolation window. Note that the dotted line before the beginning of each cycle depicts the optional acquisition of a high resolution, accurate mass survey (MS1) scan. B, representation of the actual data acquired in one swath (450–475 m/z range) shown here as an MS2 map, with retention time as the abscissa, fragment ion m/z as the ordinate, and ion intensity represented by color intensity. The darker horizontal band visible between 450 and 475 m/z corresponds to residual precursor ions for this swath. The signals co-eluting in the vertical direction are likely fragment ions originating from the same precursor ion. C, the targeted data analysis consists of retrieving the most intense fragment ions of a peptide of interest from a spectral library (list of fragment masses for the N-labeled peptide WIQDADALFGER or the corresponding C-terminal isotopically labeled reference) and extracting those fragment ion traces in the appropriate 700–725 swath using a narrow m/z window (e.g., 10 ppm). These fragment ion traces can be plotted as overlaid extracted ion chromatograms, similarly to SRM transitions. The peak group displaying the best co-eluting characteristics and matching best to the peak group of extracted reference fragment ion traces identifies and quantifies the target peptide. 
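The acquisition scheme of Fig. 1A and the extraction step of Fig. 1C reduce to simple arithmetic, which the following Python sketch illustrates. It is not the authors' software: the function names are hypothetical, the windows are the nominal 25-Da steps (the 1-Da overlap between consecutive swaths is ignored), and the 10-ppm window is assumed to be a symmetric tolerance around the library mass, which the caption does not specify.

```python
def swath_windows(mz_start=400.0, mz_end=1200.0, width=25.0):
    """Nominal precursor isolation windows as (low, high) m/z pairs."""
    count = round((mz_end - mz_start) / width)
    return [(mz_start + i * width, mz_start + (i + 1) * width) for i in range(count)]


def find_swath(precursor_mz, windows):
    """Return the window whose range [low, high) contains the precursor m/z."""
    for low, high in windows:
        if low <= precursor_mz < high:
            return (low, high)
    raise ValueError(f"precursor m/z {precursor_mz} is outside the acquired range")


def ppm_bounds(target_mz, ppm=10.0):
    """m/z bounds of a symmetric +/- ppm tolerance (an assumption of this sketch)."""
    delta = target_mz * ppm * 1e-6
    return target_mz - delta, target_mz + delta


def extract_trace(spectra, target_mz, ppm=10.0):
    """Sum the intensities inside the ppm window for each spectrum of one swath.

    spectra: list of (retention_time, [(mz, intensity), ...]) tuples.
    Returns a list of (retention_time, summed_intensity) points.
    """
    low, high = ppm_bounds(target_mz, ppm)
    return [
        (rt, sum(inten for mz, inten in peaks if low <= mz <= high))
        for rt, peaks in spectra
    ]


windows = swath_windows()
cycle_time_s = len(windows) * 0.100      # 100 ms accumulation per swath
points_per_peak = 30.0 / cycle_time_s    # samples across a 30-s chromatographic peak
print(len(windows), round(cycle_time_s, 1), find_swath(719.318, windows))
```

With the values given in the text, the sketch yields 32 windows, a 3.2-s cycle, and about nine sampling points across a 30-s peak; the 719.318 m/z precursor of WIQDADALFGER falls in the 700–725 swath.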
D, the complete high resolution, accurate mass fragment ion spectra underlying the best candidate peak group can be extracted from the raw data. These spectra can be inspected to confirm that the extracted signals originate from mass accurate monoisotopic fragment ions with the right charge state (e.g., lower panel zooms on the y4 (green box) and y10 (blue box) fragments, with the endogenous and reference peptide fragments annotated with open or closed circles, respectively). They can also be extensively annotated to strengthen the identification of the peptide (top panel).

FIG. 2. Simulated fragment ion interferences for various LC-MS/MS acquisition scenarios. A, fragment ion interference cumulative plots are computed as described under “Materials and Methods” by taking into account fragment ions from doubly charged yeast tryptic peptide precursors against the fragment ions from the doubly and triply charged yeast tryptic peptides reported in PeptideAtlas (www.peptideatlas.org). The distribution of peptides with specific numbers of interference-free transitions is shown for the following simulations (precursor and fragment ion isolation, respectively): 0.7 Da/0.7 Da (open diamonds), 25 Da/10 ppm (black squares), 1 Da/1 Da (open triangles), 2.5 Da/1 Da (crosses), 10 Da/1 Da (asterisks), and 800 Da/10 ppm (open circles). Simulation plots for other background or acquisition scenarios are available in supplemental Fig. S3. B, the fraction of peptides observable with five or more interference-free transitions for the various acquisition scenarios is presented in the histogram with white bars. Accordingly, the shaded bars represent the fraction of peptides having less than four interference-free transitions.

with this more complex background was higher for the different scenarios. However, the relative specificity offered by SWATH MS versus SRM remained qualitatively the same (supplemental Fig. S3).

As a comparison, we checked whether previous DIA methods would also provide sufficient fragment ion specificity to support the identification of peptides using a targeted data analysis strategy. We simulated the fragment ion interferences for various sequential windowed DIA methods on low resolution instruments (scenarios with 2.5-Da/1-Da, or 10-Da/1-Da swath width and fragment ion accuracy, respectively) or for DIA methods on high resolution instruments without isolation window (scenario with 800-Da swath width and 10-ppm fragment ion accuracy). Fig. 2 and supplemental Fig. S3 show that none of the former DIA methods are able to reach the level of fragment ion specificity of SRM or SWATH MS and are therefore not amenable to accurate targeted data mining without prior raw data filtering.

Targeted Data Analysis of SWATH MS Fragment Ion Maps

Using the same rationale used above for the simulation of fragment ion interferences in MS2 maps, we computed the overall precursor ion distribution in the LC-MS1 space. For this, we counted for each precursor the number of doubly and triply charged peptides concurrently coinciding within the 25-Da-wide swath and 20–30-s RT elution segment of that precursor. Using the 93,875 yeast tryptic precursors from the PeptideAtlas database, the simulations indicated that, for 75% of the peptides, more than 20 additional precursors (median 40) were expected to be present in the specified window (supplemental Fig. S4). These numbers illustrate the extent of precursor co-selection, and by inference, the fragment ion spectral complexity that is generated when wide isolation windows are used. These simulations suggest that analyzing such data sets with traditional DIA database search strategies remains highly challenging.

To analyze the SWATH MS data sets, we therefore implemented a data mining strategy that is conceptually similar to targeted mass spectrometry by SRM. However, in contrast to SRM, the signals used for peptide identification and quantification are specified postacquisition and can therefore be flexibly adapted or optimized. The data analysis strategy is schematically illustrated in supplemental Fig. S5. The process starts by selecting, from reference spectral libraries such as SRMAtlas (31), a suitable set of fragment ions from peptides proteotypic for the proteins of interest. In SRM, those fragment ion masses are transition coordinates for the targeted acquisition. In SWATH MS, those fragment ion masses are used to extract ion chromatograms from the acquired data sets that are then combined into an identifying peak group. Fig. 1C provides an example of ion traces for the four most

FIG. 3. Limit of detection and intrascan dynamic range. A, the areas (y axis) of the precursor ion extracted from the survey scan (open squares) and of the most intense fragment ion extracted from the SWATH MS (closed triangles) and SRM (black crosses) quantifications are shown for the different serial dilution experiments (injected amounts of the peptide ELGQSGVDTYLQTK diluted in a yeast tryptic background in the x axis). The Mascot scores of the peptide identified in the same dilution series samples but acquired in DDA mode are shown as open circles. The limits of detection for the different methods are indicated with dotted lines. The complete series of LOD plots and corresponding lists of peak areas for the precursor and fragment ion traces quantified during these dilution series experiments are provided in supplemental Table 1 for the full set of 61 reference peptides.
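As an illustration of how a limit of detection such as those marked in Fig. 3A could be read from a dilution series, here is a minimal Python sketch. It assumes the criterion stated in the text (signal to noise ratio above 3); the function names, the noise estimates, and the intermediate dilution points are hypothetical, and only the 47 fmol and 91 amol end points come from the article.

```python
import math


def limit_of_detection(series, min_sn=3.0):
    """Lowest amount still detected when walking down a dilution series.

    series: list of (amount, signal, noise) tuples in any order, noise > 0.
    Walks from the highest amount downward and stops at the first point
    whose signal-to-noise ratio is not above min_sn.
    Returns None if even the highest amount fails the criterion.
    """
    lod = None
    for amount, signal, noise in sorted(series, reverse=True):
        if signal / noise <= min_sn:
            break
        lod = amount
    return lod


def orders_of_magnitude(highest, lowest):
    """Span between two amounts (or intensities) in powers of ten."""
    return math.log10(highest / lowest)


# Hypothetical dilution points: (amount in fmol, peak signal, noise estimate).
dilution = [
    (47.0, 5.2e5, 150.0),
    (5.9, 6.4e4, 150.0),
    (0.73, 8.1e3, 150.0),
    (0.091, 3.0e2, 150.0),   # S/N = 2, not above the threshold
]
print(limit_of_detection(dilution))
print(round(orders_of_magnitude(47.0, 0.091), 2))
```

Stopping at the first failing point, rather than taking the smallest passing amount anywhere in the series, is a design choice of this sketch that avoids reporting an isolated noisy hit below a gap.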
B, similar quantification plot for the doubly isotopically labeled peptide AADITSLYK serially diluted in a yeast tryptic background is shown here for the most intense fragment ion with closed triangles (“LOD control”). The intrascan dynamic range experiment consists of a dilution series of the same peptide AADITSLYK (open squares, “intrascan diluted”) in the presence of a constant amount of a singly isotopically labeled peptide AADITSLYK (open diamonds, “intrascan constant”), in the same yeast tryptic background. The complete lists of peak areas for the precursor and fragment ion traces quantified during the dilution series and intrascan dynamic range experiments are provided in supplemental Table 2. Screenshots of the quantified fragment ion traces and of the MS/MS spectra (zoomed around the y7 fragment) underlying the peptide peak apex are provided in supplemental Fig. S6 for the sample sets of the intrascan dynamic range experiment.

intense fragments of the endogenous peptide WIQDADALFGER that is proteotypic for yeast protein RIR2. The protein has an expected abundance of 500 copies per cell (32). The traces were extracted from a 15N-labeled yeast tryptic digest data set acquired by SWATH MS, specifically in the swath 700–725 that contained the 719.318 m/z doubly charged precursor. The data show that around the RT of 53.9 min, the extracted ion chromatograms form a peak group that identifies the queried peptide, based on the same criteria commonly used by automated SRM analysis tools (e.g., mProphet (9) or Skyline (33)) such as co-elution of the fragment ion traces, correlation of the relative fragment ion intensities with those of reference spectra, and more. The identification can be further strengthened by checking the co-elution with a reference peptide spiked into the sample (Fig. 1C) or by extensively annotating the full fragment ion spectra implicitly present in the SWATH MS data at that RT (Fig. 1D). As in SRM, the quantification is intrinsically linked to the peptide identification (supplemental Fig. S5) and proceeds by integration of the fragment ion traces across the chromatographic elution of the validated peak group, with the optional use of isotopically labeled references for relative or absolute quantification.

Performance of SWATH MS Coupled to Targeted Data Extraction

Limit of Detection, Limit of Quantification, and Intrascan Dynamic Range: The LOD of the method was assessed by measuring dilution series of 61 reference peptides containing isotopically labeled lysine or arginine C termini, spiked into a background of a 15N-labeled yeast tryptic digest. Among those, 38 peptides were serially diluted, covering a range of 47 fmol to 91 amol, and 23 were kept constant at 47 fmol each. The samples were subjected to SWATH MS acquisition, and the ion traces for the three most intense fragment ions for each reference peptide were extracted and integrated. The resulting dilution plots show a limit of detection (signal to noise ratio above 3) and a limit of quantification (deviation from linearity above 30%) in the amol range for eight of the diluted peptides (Fig. 3A and supplemental Table 1). The coefficient of variation was estimated as 13.7% for the peptides spiked at constant concentrations (supplemental Table 1).

Next, we determined the intrascan dynamic range of the method, i.e., the fold change range separating the highest and lowest signal intensities concurrently observable within the same fragment ion spectrum. For this, an isotopically labeled peptide pair was chosen such that both (co-eluting) precursors were co-selected within the same swath. The samples consisted of a yeast tryptic digest spiked with one peptide at a constant amount of 1.25 pmol loaded on column, whereas the isotopic counterpart was diluted in a stepwise manner (supplemental Table 2 and supplemental Fig. S6). The data were acquired in SWATH MS mode and analyzed as described above. Fig. 3B shows that the diluted peptide species could be detected and quantified linearly through a dynamic range of almost 4 orders of magnitude. It is noteworthy that the signal did not demonstrate saturation even at the highest peptide concentration, suggesting that the dynamic range could be further extended by using higher peptide concentrations. Thus, the sensitivity of the method seems so far limited by the chemical or electronic noise of the measurement itself rather than by intrascan dynamic range considerations.

We then compared the performance of SWATH MS with that of other standard proteomic strategies. For the comparison with DDA, the LOD dilution series samples described above were analyzed on the same MS instrument running in “top 20” shotgun mode. The data were searched with Mascot, and the identification score for the 61 reference peptides was reported on the same plots as those from the SWATH MS-extracted fragment intensities (supplemental Table 1). The results indicate that, for 26 of the 38 diluted peptides, the database searches failed to identify the reference peptides even when those were spiked at concentrations that were 2–10 fold higher than the respective LOD in the SWATH MS data sets. It is noteworthy that all the missing peptide identifications were actually due to nonselected signals for MS/MS sequencing. This experimentally demonstrates that precursor ion detection/picking in the MS1 scans is less reliable than fragment ion signal extraction from the MS/MS scans.

To compare the performance of SWATH MS with that of label-free workflows, we integrated the precursor ion traces extracted from the MS1 scans present in the exact same set of files acquired by SWATH MS for the dilution series samples. For the acquisition of this data set, a survey scan was carried out at the beginning of each swath cycle using the same periodicity (3.2 s) and accumulation time (100 ms) also applied per swath window (Fig. 1A), thus providing the closest quantification comparison possible. The MS1 areas were reported on the same plots as the SWATH MS-extracted fragment intensities (supplemental Table 1). The results show that, in half of the cases (for 19 of the 38 diluted reference peptides), SWATH MS quantification at the fragment ion spectra level offers a 2–8-fold gain in sensitivity compared with the LOD based on precursor ion signals detected in the MS1 maps. Supplemental Fig. S7 provides such an example of a diluted peptide (ANLIPVIAK) whose precursor is only detectable down to 1.5 fmol in the MS1 scans, whereas its fragment ions are still unambiguously identifiable and quantifiable down to 180 amol by targeted data extraction of the MS/MS scans.

Finally, the LOD dilution series were analyzed on our most sensitive triple-quadrupole instrument operating in SRM mode (supplemental Table 3). This comparison showed that SRM was 10-fold more sensitive, placing SWATH MS coupled to targeted data extraction between SRM and MS1/label-free quantification workflows in terms of sensitivity.

Quantification Accuracy of SWATH MS-targeted Data Analysis: Next, we sought to benchmark the quantification accuracy of SWATH MS targeted analysis to that of SRM, the gold standard mass spectrometric quantification method. For this, we reacquired, via SWATH MS, samples 1 and 8 corresponding to the start and end points of a yeast diauxic shift experiment previously analyzed by SRM (15). Those samples consisted of tryptic digests of a mixture of (i) a lysate of yeast cells grown in regular 14N medium and sampled throughout the metabolic shift from fermentation to respiration and (ii) a constant 15N-labeled yeast lysate as internal standard for the fold change calculations. As a first pass analysis, the SWATH MS data set was mined with the exact same set of 476 transitions used to quantify the fold change of 80 peptides (45 metabolic enzymes) in the SRM study (15). From this initial data mining, mProphet automated analysis could validate 64 peptide identifications (1.5% false discovery rate; supplemental Fig. S8). Upon visual inspection of the extracted fragment ion traces, we could confirm the quantification for 40 proteins (72 peptides), whereas 5 proteins (8 peptides) were not convincingly detectable with this initial set of transitions (supplemental Tables 4 and 5).

Unlike SRM data, SWATH MS data sets contain transition signals different from those originally extracted and fragmentation information for other peptides than those originally targeted. Taking advantage of this, we re-extracted, from the exact same two files, additional or alternative peptide fragment ion traces for proteins whose identification and/or quantification was compromised because of fragment ion interferences or low signal to noise ratio during the primary data extraction. This straightforward data reanalysis rescued quantification information for three of the five undetected proteins, by quantifying nine novel peptides (supplemental Table 6), and significantly improved the quantification accuracy for the three proteins displaying the highest standard deviations in the primary analysis (supplemental Table 7). Fig. 4A summarizes the final quantification results and confirms that enzymes from the glycolysis pathway show a slight (maximum 2-fold) down-regulation, whereas those involved in the glyoxylate and citric acid cycles show between 10- and 300-fold up-regulations, consistent with the data of the SRM study. For a more direct comparison with the SRM results, we also plotted the protein fold changes quantified with SWATH MS targeted analysis against those published in the SRM study. The correlation plot (Fig. 4B) shows an excellent linear correlation between the quantification results (slope 0.9, r 0.95) and benchmarks the quantification accuracy obtained by SWATH MS targeted analysis to the level of quality delivered by SRM data acquisition.

FIG. 4. Quantification by SWATH MS of the abundance fold changes of 45 enzymes involved in the yeast central carbon metabolism during a diauxic shift experiment. A, schematic representation of the yeast central carbon metabolism network. The abundance fold changes of the enzymes quantified by SWATH MS (supplemental Tables 4–7) are coded with colors. The box shapes are indicative of the absolute abundances of the proteins determined for yeast in log phase growth (32). Those values therefore constitute an approximation of the absolute abundances of the enzymes at the beginning of the diauxic shift experiment, based on which the fold changes are determined. B, correlation between the abundance fold changes of the same metabolic enzymes quantified by SRM (15) (abscissa) and SWATH MS (ordinate). The linear regression was calculated with the refined SWATH MS quantification values and without taking into account the fold changes for the proteins whose peptides presented light signal intensities below noise levels for the first time point of the diauxic shift (open diamonds), similar to the SRM study (15). Because the standard deviations for the SWATH MS and SRM quantifications are not directly comparable (see “Materials and Methods”), they are provided on separate plots in supplemental Table 5.
The complete lists of peak areas for the quantified fragment ion traces and corresponding protein abundance fold changes are provided in supplemental Tables 4–7.

To demonstrate the effect of the fragment ion mass accuracy and resolution on the quantification performance, we artificially relaxed the resolution of the SWATH MS measurements, postacquisition, to mimic either a data-independent acquisition on a high resolution instrument but without isolation window or a windowed acquisition on a low resolution instrument (simulating the conditions of MSE/AIF (19, 21) or DIA (18) data sets, respectively, see Table 1). This was achieved in silico either by recombining the swaths prior to fragment ion chromatogram extraction at 10-ppm mass accuracy or by extracting the swath data at 1-Da mass accuracy, respectively. The mProphet analysis results (supplemental Fig. S9) show that neither of those low specificity acquisition methods can match the number of identifications and quantification accuracy levels achieved by SWATH MS, especially for the proteins of low abundance.

Extending the Set of Quantified Proteins from SWATH MS Data Sets

SWATH MS data sets implicitly contain a permanent fragment ion spectral record for all precursors within the mass and hydrophobicity range covered by specific LC-MS/MS acquisition conditions. This allows, in principle, for probing the data sets in silico for any new protein of interest suggested by a first pass biological review of the data, a situation that is common for systems biology studies. To illustrate this capability, the diauxic shift data sets were queried for 60 yeast mitochondrial proteins (287 peptides) involved in oxidative phosphorylation of the respiratory chain. These were not covered in the initial SRM study but were a posteriori considered relevant in the context of the switch from fermentation to respiration that occurs upon the diauxic shift. The data reanalysis consisted of extracting, from the same diauxic shift files, fragment ion traces of those targeted peptides for which we had assay records in our yeast spectral libraries. From the list of mitochondrial proteins, we could confidently quantify the abundance fold change for 36 proteins (103 peptides), 19 of which were membrane-associated proteins from the respiratory chain (Fig. 5 and supplemental Table 8). As for the previous analysis, the abundance fold change was measurable for proteins spanning a wide range of protein abundances (from 395 to 8.8E copies/cell (32)).

FIG. 5. Extended quantification by SWATH MS of the abundance fold changes of mitochondrial enzymes during a diauxic shift experiment. Schematic representation of the respiratory chain and oxidative phosphorylation networks inspired by the Kyoto Encyclopedia of Genes and Genomes pathway representation (47). The abundance fold changes of the enzymes quantified by SWATH MS are coded with colors. The box shapes are indicative of the absolute abundances of the proteins with the same notices as those mentioned in Fig. 4. The complete list of peptides and of their fragment ions used to quantify those proteins, as well as the peak areas and corresponding protein abundance fold changes, are provided in supplemental Table 7.

Identification of Post-translational Modifications

It is noteworthy that peptide modifications may also appear serendipitously as a result of the targeted data extraction of SWATH MS data sets. When the fragment ion traces used to query a peptide are shared with modified forms of that peptide and when those are extracted in the same swath, multiple peak groups matching the original query can be observed. Fig. 6 illustrates such a case for the 14N-labeled (light) and 15N-labeled (heavy) forms of the endogenous peptide MIEIMLPVFDAPQNLVEQAK (proteotypic for protein PDC1), queried in the yeast diauxic shift sample 8 (late time point). Additional, nonshared fragment ions can then be re-extracted to distinguish which peak group corresponds to the nonmodified or modified peptides, respectively (supplemental Fig. S9). In cases where the modified peptide is fragmented in a different swath, the shared fragment ion masses may still be used to specifically query for the modified peptide form in that swath. These data illustrate the potential of SWATH MS targeted data extraction for unambiguous modification site assignment by extracting specific fragment ions characteristic of the modified peptide sequence. This opens completely novel opportunities to discover (and quantify) unanticipated modified peptide species from DIA data sets by a strategy that does not suffer from the combinatorial explosion of the search space usually experienced with traditional post-translational modification database search approaches.

FIG. 6. Application of SWATH MS targeted analysis to identify peptide modifications. The six most intense fragment ion traces for the 14N-labeled (light) and 15N-labeled (heavy) forms of the peptide MIEIMLPVFDAPQNLVEQAK, extracted from the swath 750–775, are shown for the yeast diauxic shift sample y8 (late time point). None of the classical SRM criteria (fragment co-elution, light-heavy peptide co-elution, relative intensities of the fragment ions) can discriminate the three candidate peak groups found here. By extracting additional, nonshared fragment ion traces, the identification of the peptide can be confirmed, and the site of the oxidized methionine modification can be unambiguously assigned onto the peptide sequence (supplemental Fig. S10).

DISCUSSION

Among the various MS-based proteomic approaches, SRM is generally recognized as providing the most accurate and reproducible quantification results. The high degree of reproducibility is granted by the consistent recording, across the LC, of the intensities of predefined target fragment ions. This allows consistent tracking of the abundance of specific peptides of interest across multiple samples. At present, however, SRM suffers from relatively slow analysis rates and lacks the capability to dynamically refine or expand the measured peptides/proteins for extensive proteome investigations. To alleviate most limitations of targeted data acquisition, we propose here a targeted data analysis strategy that brings the consistent and accurate quantification capabilities of SRM to a level of extensive proteome coverage by mining the complete fragment ion records generated during data-independent acquisition.

Not all DIA methods may be appropriate for targeted data extraction. To reach the quantification accuracy of SRM with targeted data extraction, the LC-MS/MS acquisition has to provide fragment ion data of a level of specificity that is comparable with that of SRM. Based on our fragment ion interference simulations (Fig. 2), we adopted a sequential window DIA method operating with 25-Da isolation width. On a fast, high resolution MS instrument, this setup allows documentation, in a single injection, of highly specific and time-resolved fragment ion data for all the precursors within the 400–1200 m/z mass and the monitored LC range (Fig. 1). The data thus generated constitute a series of extensive fragment ion maps ideally suited for proteome-wide investigation by targeted data analysis. DIA acquisition using consecutive swaths is not novel per se (18, 20). However, its rationally designed implementation on a fast, high resolution MS instrument provides, for the first time for a DIA method, the level of data quality necessary for targeted data extraction. This acquisition method is now commercially available on the ABSciex 5600 TripleTOF instrument under the SWATH MS denomination.

It should be noted that this SWATH MS setup (recording 32 swaths of 25 Da at 100-ms dwell time) is only one of many acquisition sets that can be applied. Like other mass spectrometric methods, SWATH MS operates within a space of interdependent parameters, including dwell time, duty cycle, and precursor isolation window width, that affect the limit of detection, signal specificity, dynamic range, and quantification accuracy. Depending on the biological application or sample complexity, other parameters, including windows of variable widths throughout the LC gradient, might prove more efficient. Also, fragment ion specificities similar to those achieved by SWATH MS may very well be reached by other DIA methods, upon higher resolution of co-eluting analytes (e.g., using multidimensional protein identification technology (MudPIT) (18), ultrahigh pressure liquid chromatography (UPLC) (19), or ion mobility shift), although the gain in fragment ion specificity offered by extensive fractionations was recently questioned (12).

To mine the fragment ion maps recorded during SWATH MS acquisition, we devised a targeted data extraction strategy that conceptually transposes, to the data analysis, principles originating from SRM targeted acquisition. This targeted data analysis strategy differs fundamentally from the traditional search approaches described so far to analyze DIA data sets. Specifically, this type of analysis does not rely on precursor ion mass detection nor involve MS/MS spectra matching of any kind (neither using traditional database searching tools nor spectral matching algorithms). Instead, it consists of extracting, from the SWATH MS data sets, several fragment ion chromatograms for each peptide of interest. Collectively, these trace groups identify the targeted peptide, as in SRM analysis (Fig. 1C). Because both the peptide identification and quantification are performed at the MS/MS level, without the precursor ion signal having to be explicitly detected in the survey scans, this strategy allows extensive exploration of the multiplexed MS/MS DIA data sets to a level that was not possible with the traditional clustering/database approaches.

This targeted extraction strategy, like SRM, depends on spectral libraries as prior knowledge, to guide the selection of the optimal set of fragment ion signals. For several species, proteome-wide reference spectral libraries have been completed and will be made public in the near future. These libraries are S. cerevisiae, human, and Mycobacterium tuberculosis. Given that robust and high throughput methods for the generation of such libraries have been developed (e.g., by systematically recording MS/MS reference spectra of

analytes in complex samples because the detection and quantification is based on fragment ion signals without the explicit need to detect the precursor ion in a survey scan above noise.

The intrascan dynamic range of the method was also experimentally assessed and was shown to cover almost 4 orders of magnitude (Fig. 3B).
Such extent of identification chemically synthesized proteotypic peptides (34)), we antici- (and quantification) of co-eluting peptides spanning 4 logs of pate that proteomes of additional species will be equally concentration, reliably detected here with targeted data ex- mapped out in the near future. Alternatively, spectral libraries traction (supplemental Fig. S6), may be more challenging to may be generated for any sample by extensive DDA analysis achieve with traditional DIA analysis approaches relying on using the same instrument. To increase the reliability of such fragment ion preclustering and/or regular database searches. libraries, consensus spectra can be generated from repeated To our knowledge, this is indeed the first attempt to objec- observations of the same peptide using freely available tools tively evaluate the intrascan dynamic range of peptide iden- (35–37). The use of reference spectra as a priori information tification/quantification for a DIA approach, even though this guiding the targeted extraction of DIA data sets may be less parameter is of utmost importance for proteome analyses, in error-prone than approaches relying on clustering the frag- particular if wide precursor isolation windows are being used. ment and precursor ions based on their LC elution profiles. It is noteworthy that the most abundant precursor actually Indeed, targeted data extraction can identify and quantify two limits the dynamic range only for its specific isolation window exactly co-eluting peptides (e.g., light and heavy labeled pep- and therefore does not affect the detection sensitivity achiev- tide forms), even if they are present at vastly different abun- able simultaneously in other swaths. Thus, for SWATH MS, an dance levels (Fig. 
3B), a situation that challenges clustering even greater dynamic range may be anticipated throughout approaches (26) and requires recursive search implementa- the 400–1200 m/z range at each time point and across the tions to deconvolute the multiplexed spectra (38). LC-MS range as a whole. A wide intrascan dynamic range To evaluate the limit of detection of the method, a set of achievable in flow-through instruments like the quadrupole- isotopically labeled serial dilution experiments was performed quadrupole TOF instrument used in this study might be diffi- and showed that SWATH MS acquisition coupled to targeted cult to achieve with trapping instruments. Their limited ion data analysis could identify and quantify peptides down to the trapping capacity restricts the number of peptide species that hundred amol range (Fig. 3). Even though the method in its can be concurrently analyzed without compromising perform- current setup was slightly less sensitive than SRM, it remains ance through space charging. On quadrupole-quadrupole to be determined whether the systematic optimization of the TOF instruments, the ions are transferred through a quadru- SWATH MS acquisition parameter sets, e.g., the use of dy- pole to the collision cell and to the TOF analyzer, irrespective namically adjusted window widths and increased dwell times, of the number or abundance of co-selected precursors, a can further improve the LOD of the method. Generally, per- feature that is critical for reaching a high intrascan dynamic formance comparisons of methods are problematic if the range with DIA methods using large isolation windows that comparisons include too many variables such as different produce high ion fluxes. Also, an optimal “square shape” for samples or instrument types, instrument settings, etc. 
We the ion transmission efficiency (as achieved here by decou- therefore compared data acquired by SWATH MS with data pling the DC and RF voltages of the isolation quadrupole; generated by DDA and by MS1 quantification using aliquots of supplemental Figs. S1) might be difficult to maintain through- the same sample measured on the same ABSciex 5600 Trip- out the entire isolation window width on current trapping leTOF instrument. Overall, SWATH MS outperformed the two devices and may require larger overlaps between adjacent other methods for the consistent detection and quantification swaths to ensure consistent quantification of the analytes of low abundance precursors, especially if complex samples transmitted at the border of the isolation windows. Therefore, were analyzed (Fig. 3 and supplemental Figs. S6 and S7 and whereas the principles of data-independent acquisition with supplemental Tables 1 and 2). This result corroborates obser- swaths can conceivably be implemented on different types of vations from previous DIA reports (18, 20, 22, 24) and can be mass spectrometers, it appears that the characteristics of explained by an increased signal to noise ratio in the fragment flow-through systems like quadrupole-quadrupole TOFs are ion maps compared with the survey scans. This also empha- currently the best match for the method. sizes that unbiased acquisition methods such as SRM and More importantly, we evaluated the quantification repro- DIA are particularly well suited for the detection of low level ducibility achievable by SWATH MS coupled to targeted data analysis and its potentials for proteome quantification for biology. Comparing SRM- and SWATH MS-derived quantita- P. Picotti, et al., submitted for publication. tive values obtained from the same isotope labeled samples R. Aebersold, R. Moritz, et al., manuscript in preparation. (two yeast diauxic shift samples previously analyzed by SRM O. Schubert, J. Mouritsen, et al., manuscript in preparation. 
10.1074/mcp.O111.016717–14 Molecular & Cellular Proteomics 11.6 Targeted Data Extraction for Proteome Analysis (15)), both methods showed highly correlated values (Fig. 4B). are neither relevant nor accessible to quadrupole resolution Overall, SWATH MS coupled to targeted data extraction al- used in SRM acquisition, they are instrumental to adequately lowed consistent quantification of proteins spanning a wide mine SWATH MS data sets. Therefore, a more complete and range of concentrations, e.g., 125–10 copies/cell (Fig. 4A, specific targeted data analysis pipeline is required before box shapes). Unlike SRM data, SWATH MS data sets are attempting exhaustive qualitative and quantitative proteome permanent records of the fragment ion spectra of a sample characterization of SWATH MS data sets. that can be re-examined in silico without the need for further The concept of SWATH MS acquisition and targeted data data acquisition. This characteristic, specific to DIA data sets, analysis should be easily extendable to other classes of opens new possibilities to rescue missing quantification infor- biomolecules such as metabolites, lipids, and more that are mation and to improve the accuracy of initial quantification also frequently studied by LC-MS/MS and for which fragment results simply through iterative targeted data reanalysis, as ion spectral libraries have been developed (41–46). Also, the demonstrated here for several metabolic enzymes (supple- possibility to re-examine patterns in the SWATH MS data sets mental Tables 5–7). It has been discussed that, for SRM enables new opportunities for finding modified residues and measurements, interference of contaminating transitions, in- search for the presence of previously unexpected analytes complete tryptic cleavage, or possible modifications of a pep- (Fig. 6). tide or other such artifacts may impede the accuracy of quan- In summary, we report a method for qualitative and quan- tification (39, 40). 
The optimization of fragment ion sets for titative proteome probing of a sample in a single LC-MS/MS each targeted peptides by the iterative SWATH MS data injection. This is achieved by the combination of a sequential analysis offers practical solutions to these important issues. windowed DIA method, generating exhaustive high specificity Interfering transitions can be detected and eliminated using fragment ion map records, coupled with a postacquisition outlier detection algorithms, and the data set can be queried targeted data analysis strategy. This method permits quanti- for other peptides from the targeted protein or for alternate fication of (at least) as many compounds as those typically peptides, e.g., derived by unspecific or partial cleavage or identified by regular shotgun proteomics with the accuracy modified peptides covering the same segment of a protein. and reproducibility of SRM across many samples. The Once detected, such instances can be eliminated or taken method also provides new possibilities for data analysis, al- into account to achieve higher quality data (supplemental lowing quantification refinement and dynamic protein probing Table 7). by iteratively re-mining the once-and-forever acquired data The possibility of iteratively searching the SWATH MS data sets. sets also supports ad libitum queries for protein sets. Al- though the diauxic shift samples used in this study were not Acknowledgments—We acknowledge Christine Carapito (CNRS, originally intended for the recovery of mitochondrial mem- Strasbourg, France) for early contributions in evaluating the potentials of SWATH MS. We thank Paola Picotti (Institute of Biochemistry, ETH brane proteins (15), we could confidently quantify and assess Zu¨ rich) for providing the diauxic shift samples originating from an the fold changes for 36 proteins involved in the oxidative earlier study (15). We thank Uwe Sauer and Ana Paula Oliveira (Insti- phosphorylation and respiratory networks (Fig. 5). 
Those pro- tute of Molecular Systems Biology, ETH Zu¨ rich) for suggesting the set teins were not covered in the initial analysis and would have of respiratory chain proteins additionally quantified in the diauxic shift required new targeted data acquisition of the samples by samples. We thank Lyle Burton (ABSciex) for active development of SRM. With the targeted data analysis strategy, the new pro- the PeakView software. L. C. G., P. N., and R. A. designed the study. S. T. implemented and tein set can simply be re-extracted in silico from the existing TM developed the acquisition method on the ABSciex 5600 TripleTOF SWATH MS data files, without the need to reinject the sample. instrument and performed the data acquisitions. H. R. computed the This dynamic extension of the search space applied to theoretical simulations of fragment ion interferences. N. S. performed SWATH MS data sets is expected to be particularly attractive the comparative measurement of the AQUA dilution series by for systems biology studies where new query hypotheses are SRM. L. R. helped implementing the SWATH MS analysis in mProphet. L. C. G. and P. N. performed the SWATH MS data analysis. generated from mathematical models based on prior data R. A. and R. B. supervised the study. analysis. Although it is in principle possible to probe SWATH MS data sets for the whole proteome of an organism at once, * This work was supported by ABSciex; European Union FP7 Pros- it is beyond the scope of this article to describe the exhaustive pects Grant 201648; SystemsX.ch, the Swiss initiative for systems quantification of all the yeast proteins detectable in those biology via the projects YeastX and PhosphonetX; ERC Proteomics diauxic shift samples by SWATH MS. Indeed, although the v3.0 Grant 233226; and European Union FP7 “Unicellsys” Grant 201142. The costs of publication of this article were defrayed in part data are already analyzable with mProphet or Skyline, none of by the payment of page charges. 
This article must therefore be hereby the currently available SRM analysis tools can so far fully marked “advertisement” in accordance with 18 U.S.C. Section 1734 exploit the information potential contained in SWATH MS data solely to indicate this fact. sets. For example, no SRM analysis pipeline takes into ac- □ S This article contains supplemental material. count the mass accuracy of the fragment ions, nor their iso- ** To whom correspondence should be addressed. Tel.: 41-44-633- 31-70; Fax: 41-44-633-10-51; E-mail: [email protected]. topic distribution or charge state. Although those parameters Molecular & Cellular Proteomics 11.6 10.1074/mcp.O111.016717–15 Targeted Data Extraction for Proteome Analysis REFERENCES Miller, S. I., and Goodlett, D. R. (2009) Precursor acquisition independent from ion count: How to dive deeper into the proteomics ocean. Anal. 1. Aebersold, R., and Mann, M. (2003) Mass spectrometry-based proteomics. Chem. 81, 6481–6488 Nature 422, 198–207 21. Geiger, T., Cox, J., and Mann, M. (2010) Proteomics on an Orbitrap bench- 2. MacCoss, M. J., and Matthews, D. L. (2005) Teaching a new dog old tricks. top mass spectrometer using all ion fragmentation. Mol. Cell. Proteomics Anal. Chem. 77, 295A–302A 9, 2252–2261 3. Han, X., Aslanian, A., and Yates, J. R., 3rd (2008) Mass spectrometry for 22. Bern, M., Finney, G., Hoopmann, M. R., Merrihew, G., Toth, M. J., and proteomics. Curr. Opin. Chem. Biol. 12, 483–490 MacCoss, M. J. (2010) Deconvolution of mixture spectra from ion-trap 4. Walther, T. C., and Mann, M. (2010) Mass spectrometry-based proteomics data-independent-acquisition tandem mass spectrometry. Anal. Chem. in cell biology. J. Cell Biol. 190, 491–500 82, 833–841 5. Domon, B., and Aebersold, R. (2006) Mass spectrometry and protein anal- 23. Carvalho, P. C., Han, X., Xu, T., Cociorva, D., Carvalho Mda, G., Barbosa, ysis. Science 312, 212–217 V. C., and Yates, J. R., 3rd (2010) XDIA: Improving on the label-free 6. Kapp, E., and Schutz, F. 
(2007) Overview of tandem mass spectrometry data-independent analysis. Bioinformatics 26, 847–848 (MS/MS) database search algorithms, in Current Protocols in Protein 24. Panchaud, A., Jung, S., Shaffer, S. A., Aitchison, J. D., and Goodlett, D. R. Science, Chapter 25, pp. 25.2.1–25.2.19, John Wiley & Sons, Inc, Hobo- (2011) Faster, quantitative, and accurate precursor acquisition independ- ken, New Jersey, USA ent from ion count. Anal. Chem. 83, 2250–2257 7. Nesvizhskii, A. I. (2007) Protein identification by tandem mass spectrometry 25. Wong, J. W., Schwahn, A. B., and Downard, K. M. (2009) ETISEQ: An and sequence database searching. Methods Mol. Biol. 367, 87–119 algorithm for automated elution time ion sequencing of concurrently 8. Lange, V., Picotti, P., Domon, B., and Aebersold, R. (2008) Selected reac- fragmented peptides for mass spectrometry-based proteomics. BMC tion monitoring for quantitative proteomics: a tutorial. Mol Syst. Biol. Bioinformatics 10:244, 1–10 4:222, 1–14 26. Geromanos, S. J., Vissers, J. P., Silva, J. C., Dorschel, C. A., Li, G. Z., 9. Reiter, L., Rinner, O., Picotti, P., Hu¨ ttenhain, R., Beck, M., Brusniak, M. Y., Gorenstein, M. V., Bateman, R. H., and Langridge, J. I. (2009) The Hengartner, M. O., and Aebersold, R. (2011) mProphet: Automated data detection, correlation, and comparison of peptide precursor and product processing and statistical validation for large-scale SRM experiments. ions from data independent LC-MS with data dependant LC-MS/MS. Nat Methods 8, 430–435 Proteomics 9, 1683–1695 10. Domon, B., and Aebersold, R. (2010) Options and considerations when 27. Li, G. Z., Vissers, J. P., Silva, J. C., Golick, D., Gorenstein, M. V., and selecting a quantitative proteomics strategy. Nat. Biotechnol. 28, Geromanos, S. J. (2009) Database searching and accounting of multi- 710–721 plexed precursor and product ion spectra from the data independent 11. Liu, H., Sadygov, R. G., and Yates, J. 
R., 3rd (2004) A model for random analysis of simple and complex peptide mixtures. Proteomics 9, sampling and estimation of relative protein abundance in shotgun pro- 1696–1719 teomics. Anal. Chem. 76, 4193–4201 28. Blackburn, K., Mbeunkui, F., Mitra, S. K., Mentzel, T., and Goshe, M. B. 12. Michalski, A., Cox, J., and Mann, M. (2011) More than 100,000 detectable (2010) Improving protein and proteome coverage through data-indepen- peptide species elute in single shotgun proteomics runs but the majority dent multiplexed peptide fragmentation. J. Proteome Res. 9, 3621–3637 is inaccessible to data-dependent LC-MS/MS. J. Proteome Res. 10, 29. Sherman, J., McKay, M. J., Ashman, K., and Molloy, M. P. (2009) Unique 1785–1793 ion signature mass spectrometry, a deterministic method to assign pep- 13. Addona, T. A., Abbatiello, S. E., Schilling, B., Skates, S. J., Mani, D. R., tide identity. Mol. Cell. Proteomics 8, 2051–2062 Bunk, D. M., Spiegelman, C. H., Zimmerman, L. J., Ham, A. J., Kesh- 30. de Godoy, L. M., Olsen, J. V., Cox, J., Nielsen, M. L., Hubner, N. C., ishian, H., Hall, S. C., Allen, S., Blackman, R. K., Borchers, C. H., Buck, Fro¨ hlich, F., Walther, T. C., and Mann, M. (2008) Comprehensive mass- C., Cardasis, H. L., Cusack, M. P., Dodder, N. G., Gibson, B. W., Held, spectrometry-based proteome quantification of haploid versus diploid J. M., Hiltke, T., Jackson, A., Johansen, E. B., Kinsinger, C. R., Li, J., yeast. Nature 455, 1251–1254 Mesri, M., Neubert, T. A., Niles, R. K., Pulsipher, T. C., Ransohoff, D., 31. Picotti, P., Lam, H., Campbell, D., Deutsch, E. W., Mirzaei, H., Ranish, J., Rodriguez, H., Rudnick, P. A., Smith, D., Tabb, D. L., Tegeler, T. J., Domon, B., and Aebersold, R. (2008) A database of mass spectrometric Variyath, A. M., Vega-Montoto, L. J., Wahlander, A., Waldemarson, S., assays for the yeast proteome. Nat. Methods 5, 913–914 Wang, M., Whiteaker, J. R., Zhao, L., Anderson, N. L., Fisher, S. J., 32. Ghaemmaghami, S., Huh, W. 
K., Bower, K., Howson, R. W., Belle, A., Liebler, D. C., Paulovich, A. G., Regnier, F. E., Tempst, P., and Carr, S. A. Dephoure, N., O’Shea, E. K., and Weissman, J. S. (2003) Global analysis (2009) Multi-site assessment of the precision and reproducibility of mul- of protein expression in yeast. Nature 425, 737–741 tiple reaction monitoring-based measurements of proteins in plasma. 33. MacLean, B., Tomazela, D. M., Shulman, N., Chambers, M., Finney, G. L., Nat. Biotechnol. 27, 633–641 Frewen, B., Kern, R., Tabb, D. L., Liebler, D. C., and MacCoss, M. J. 14. Cima, I., Schiess, R., Wild, P., Kaelin, M., Schu¨ ffler, P., Lange, V., Picotti, P., (2010) Skyline: An open source document editor for creating and ana- Ossola, R., Templeton, A., Schubert, O., Fuchs, T., Leippold, T., Wyler, lyzing targeted proteomics experiments. Bioinformatics 26, 966–968 S., Zehetner, J., Jochum, W., Buhmann, J., Cerny, T., Moch, H., Gil- 34. Picotti, P., Rinner, O., Stallmach, R., Dautel, F., Farrah, T., Domon, B., lessen, S., Aebersold, R., and Krek, W. (2011) Cancer genetics-guided Wenschuh, H., and Aebersold, R. (2010) High-throughput generation of discovery of serum biomarker signatures for diagnosis and prognosis of selected reaction-monitoring assays for proteins and proteomes. Nat. prostate cancer. Proc. Natl. Acad. Sci. U.S.A. 108, 3342–3347 Methods 7, 43–46 15. Picotti, P., Bodenmiller, B., Mueller, L. N., Domon, B., and Aebersold, R. 35. Craig, R., Cortens, J. C., Fenyo, D., and Beavis, R. C. (2006) Using anno- (2009) Full dynamic range proteome analysis of S. cerevisiae by targeted tated peptide mass spectrum libraries for protein identification. J. Pro- proteomics. Cell 138, 795–806 teome Res. 5, 1843–1849 16. Kiyonami, R., Schoen, A., Prakash, A., Peterman, S., Zabrouskov, V., 36. Frewen, B. E., Merrihew, G. E., Wu, C. C., Noble, W. S., and MacCoss, M. J. Picotti, P., Aebersold, R., Huhmer, A., and Domon, B. 
(2011) Increased (2006) Analysis of peptide MS/MS spectra from large-scale proteomics selectivity, analytical precision, and throughput in targeted proteomics. experiments using spectrum libraries. Anal. Chem. 78, 5678–5684 Mol. Cell. Proteomics 10, M110.002931 17. Purvine, S., Eppel, J. T., Yi, E. C., and Goodlett, D. R. (2003) Shotgun 37. Lam, H., Deutsch, E. W., Eddes, J. S., Eng, J. K., King, N., Stein, S. E., and collision-induced dissociation of peptides using a time of flight mass Aebersold, R. (2007) Development and validation of a spectral library analyzer. Proteomics 3, 847–850 searching method for peptide identification from MS/MS. Proteomics 7, 18. Venable, J. D., Dong, M. Q., Wohlschlegel, J., Dillin, A., and Yates, J. R. 655–667 (2004) Automated approach for quantitative analysis of complex peptide 38. Huang, X., Liu, M., Nold, M. J., Tian, C., Fu, K., Zheng, J., Geromanos, S. J., mixtures from tandem mass spectra. Nat. Methods 1, 39–45 and Ding, S. J. (2011) Software for quantitative proteomic analysis using stable isotope labeling and data independent acquisition. Anal. Chem. 19. Plumb, R. S., Johnson, K. A., Rainville, P., Smith, B. W., Wilson, I. D., 83, 6971–6979 Castro-Perez, J. M., and Nicholson, J. K. (2006) UPLC/MS(E): A new 39. Duncan, M. W., Yergey, A. L., and Patterson, S. D. (2009) Quantifying approach for generating molecular fragment information for biomarker structure elucidation. Rapid Commun. Mass Spectrom. 20, 1989–1994 proteins by mass spectrometry: The selectivity of SRM is only part of the 20. Panchaud, A., Scherl, A., Shaffer, S. A., von Haller, P. D., Kulasekara, H. D., problem. Proteomics 9, 1124–1127 10.1074/mcp.O111.016717–16 Molecular & Cellular Proteomics 11.6 Targeted Data Extraction for Proteome Analysis 40. Sherman, J., McKay, M. J., Ashman, K., and Molloy, M. P. 
(2009) How (2010) Detection and identification of 700 drugs by multi-target screening specific is my SRM?: The issue of precursor and product ion redun- with a 3200 Q TRAP LC-MS/MS system and library searching. Anal. dancy. Proteomics 9, 1120–1123 Bioanal. Chem. 396, 2425–2434 41. Schmelzer, K., Fahy, E., Subramaniam, S., and Dennis, E. A. (2007) The 47. Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., and Hirakawa, M. (2010) Lipid Maps Initiative in Lipidomics, po. 171–183, Elsevier Science Pub- KEGG for representation and analysis of molecular networks involving lishers B.V., Amsterdam diseases and drugs. Nucleic Acids Res. 38, D355–D360 42. Blanksby, S. J., and Mitchell, T. W. (2010) Advances in mass spectrometry 48. Andrews, G. L., Simons, B. L., Young, J. B., Hawkridge, A. M., and Mud- for lipidomics. Annu. Rev. Anal. Chem. 3, 433–465 diman, D. C. (2011) Performance characteristics of a new hybrid qua- 43. Smith, C. A., O’Maille, G., Want, E. J., Qin, C., Trauger, S. A., Brandon, drupole time-of-flight tandem mass spectrometer (TripleTOF 5600). Anal. T. R., Custodio, D. E., Abagyan, R., and Siuzdak, G. (2005) METLIN: A Chem. 83, 5442–5446 metabolite mass spectral database. Ther. Drug Monit. 27, 747–751 49. Wepf, A., Glatter, T., Schmidt, A., Aebersold, R., and Gstaiger, M. (2009) 44. Horai, H., Arita, M., Kanaya, S., Nihei, Y., Ikeda, T., Suwa, K., Ojima, Y., Quantitative interaction proteomics using mass spectrometry. Nat. Tanaka, K., Tanaka, S., Aoshima, K., Oda, Y., Kakazu, Y., Kusano, M., Methods 6, 203–205 Tohge, T., Matsuda, F., Sawada, Y., Hirai, M. Y., Nakanishi, H., Ikeda, K., 50. Krokhin, O. V., Craig, R., Spicer, V., Ens, W., Standing, K. G., Beavis, R. C., Akimoto, N., Maoka, T., Takahashi, H., Ara, T., Sakurai, N., Suzuki, H., and Wilkins, J. A. 
(2004) An improved model for prediction of retention Shibata, D., Neumann, S., Iida, T., Tanaka, K., Funatsu, K., Matsuura, F., times of tryptic peptides in ion pair reversed-phase HPLC: Its application Soga, T., Taguchi, R., Saito, K., and Nishioka, T. (2010) MassBank: A to protein peptide mapping by off-line HPLC-MALDI MS. Mol. Cell. public repository for sharing mass spectral data for life sciences. J. Mass Proteomics 3, 908–919 Spectrom. 45, 703–714 51. Rost, H. L., Malmstrom, L., and Ruedi Aebersold, R. (2012) A computational 45. Dresen, S., Gergov, M., Politi, L., Halter, C., and Weinmann, W. (2009) tool to detect and avoid redundancy in selected reaction monitoring. ESI-MS/MS library of 1,253 compounds for application in forensic and Mol. Cell. Proteomics, mcp.M111.013045. First Published on April 24, clinical toxicology. Anal. Bioanal. Chem. 395, 2521–2526 46. Dresen, S., Ferreiro´ s, N., Gnann, H., Zimmermann, R., and Weinmann, W. 2012, doi:10.1074/mcp.M111.013045 Molecular & Cellular Proteomics 11.6 10.1074/mcp.O111.016717–17 http://www.deepdyve.com/assets/images/DeepDyve-Logo-lg.png Molecular & Cellular Proteomics American Society for Biochemistry and Molecular Biology

Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis *


References (54)

Publisher
American Society for Biochemistry and Molecular Biology
Copyright
Copyright © 2012 Elsevier Inc.
ISSN
1535-9476
eISSN
1535-9484
DOI
10.1074/mcp.o111.016717

Abstract

Technological Innovation and Resources
© 2012 by The American Society for Biochemistry and Molecular Biology, Inc. This paper is available on line at http://www.mcponline.org

Targeted Data Extraction of the MS/MS Spectra Generated by Data-independent Acquisition: A New Concept for Consistent and Accurate Proteome Analysis*

Ludovic C. Gillet‡, Pedro Navarro‡, Stephen Tate§, Hannes Röst‡, Nathalie Selevsek‡, Lukas Reiter‡¶, Ron Bonner§, and Ruedi Aebersold‡**

From the ‡Department of Biology, Institute of Molecular Systems Biology, Eidgenössische Technische Hochschule Zürich (ETH Zürich), 8093 Zürich, Switzerland, §ABSciex, Concord, L4K 4V8 Ontario, Canada, ¶Biognosys AG, 8952 Schlieren, Switzerland, and the Faculty of Science, University of Zürich, 8057 Zürich, Switzerland

Received December 19, 2011, and in revised form, January 18, 2012. Published, MCP Papers in Press, January 18, 2012, DOI 10.1074/mcp.O111.016717. This is an open access article under the CC BY license.

Most proteomic studies use liquid chromatography coupled to tandem mass spectrometry to identify and quantify the peptides generated by the proteolysis of a biological sample. However, with the current methods it remains challenging to rapidly, consistently, reproducibly, accurately, and sensitively detect and quantify large fractions of proteomes across multiple samples. Here we present a new strategy that systematically queries sample sets for the presence and quantity of essentially any protein of interest. It consists of using the information available in fragment ion spectral libraries to mine the complete fragment ion maps generated using a data-independent acquisition method. For this study, the data were acquired on a fast, high resolution quadrupole-quadrupole time-of-flight (TOF) instrument by repeatedly cycling through 32 consecutive 25-Da precursor isolation windows (swaths). This SWATH MS acquisition setup generates, in a single sample injection, time-resolved fragment ion spectra for all the analytes detectable within the 400–1200 m/z precursor range and the user-defined retention time window. We show that suitable combinations of fragment ions extracted from these data sets are sufficiently specific to confidently identify query peptides over a dynamic range of 4 orders of magnitude, even if the precursors of the queried peptides are not detectable in the survey scans. We also show that queried peptides are quantified with a consistency and accuracy comparable with that of selected reaction monitoring, the gold standard proteomic quantification method. Moreover, targeted data extraction enables ad libitum quantification refinement and dynamic extension of protein probing by iterative re-mining of the once-and-forever acquired data sets. This combination of unbiased, broad range precursor ion fragmentation and targeted data extraction alleviates most constraints of present proteomic methods and should be equally applicable to the comprehensive analysis of other classes of analytes, beyond proteomics. Molecular & Cellular Proteomics 11: 10.1074/mcp.O111.016717, 1–17, 2012.

The abbreviations used are: LC-MS/MS, liquid chromatography coupled to tandem mass spectrometry; DDA, data-dependent acquisition; DIA, data-independent acquisition; SRM, selected reaction monitoring; RT, retention time; LOD, limit of detection.

Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is considered the method of choice for the identification and quantification of proteins and proteomes (1–4) and for the analysis of metabolites, lipids, glycans, and many other types of (bio)molecules. For proteomics, two main LC-MS/MS strategies have been used thus far. They have in common that the sample proteins are converted by proteolysis into peptides, which are then separated by (capillary) liquid chromatography. They differ in the mass spectrometric method used. The first and most widely used strategy is known as shotgun or discovery proteomics. For this method, the MS instrument is operated in data-dependent acquisition (DDA) mode, where fragment ion (MS2) spectra for selected precursor ions detectable in a survey (MS1) scan are generated (5). The resulting fragment ion spectra are then assigned to their corresponding peptide sequences by sequence database searching (6, 7). The second main strategy is referred to as targeted proteomics. There, the MS instrument is operated in selected reaction monitoring (SRM) (also called multiple reaction monitoring) mode. With this method, a sample is queried for the presence and quantity of a limited set of peptides that have to be specified prior to data acquisition. SRM does not require the explicit detection of the targeted precursors but proceeds by the acquisition, sequentially across the LC retention time domain, of predefined pairs of precursor and product ion masses, called transitions, several of which constitute a definitive assay for the detection of a peptide in a complex sample (8). Data analysis in targeted proteomics essentially consists of computing the likelihood that a group of transition signal traces are derived from the targeted peptide (9). Both methods have different and largely complementary preferred uses and performance profiles that have been extensively discussed elsewhere (10). Specifically, shotgun proteomics is the method of choice for discovering the maximal number of proteins from one or a few samples. It does, however, have limited quantification capabilities on large sample sets because of stochastic and irreproducible precursor ion selection (11) and under-sampling (12). In contrast, targeted proteomics is well suited for the reproducible detection and accurate quantification of sets of specific proteins in many samples as is the case in biomarker or systems biology studies (13–15). At present, however, the method is limited to the measurements of a few thousand transitions per LC-MS/MS run (16). It therefore lacks the throughput to routinely quantify large fractions of a proteome.

To alleviate the limitations of either method, strategies have been developed that rely on neither detection nor knowledge of the precursor ions to trigger acquisition of fragment ion spectra. Those methods operate via unbiased "data-independent acquisition" (DIA), in the cyclic recording, throughout the LC time range, of consecutive survey scans and fragment ion spectra for all the precursors contained in predetermined isolation windows. Various implementations of DIA methods have already been described using isolation windows of various widths, ranging from the complete m/z range to few Daltons (17–24) (Table I). Using such scans, the link between the fragment ions and the precursors from which they originate is lost, complicating the analysis of the acquired data sets. Also, large selection window widths increase the number of concurrently fragmented precursors and therefore the complexity of the acquired composite fragment ion spectra. To date, the composite spectra generated by DIA methods have been principally analyzed with the standard database searching tools developed for DDA, either by searching the composite MS2 spectra directly (18, 20) or by searching pseudo MS2 spectra reconstituted postacquisition based on the co-elution profiles of precursor ions (from the survey scans) and of their potentially corresponding fragment ions (22, 25–28).

TABLE I. LC time-resolved data-independent acquisition setups: description and current performance profiles. The numbers reported here correspond to the percentages of peptides observable with at least five interference-free transitions in a yeast background, as reported in Fig. 2B.

Here, we report an alternative approach to proteome quantification that combines a high specificity DIA method with a novel targeted data extraction strategy to mine the resulting fragment ion data sets. For the data acquisition, we implement the sequential isolation window acquisition principle introduced by former DIA studies (18, 20) on a high resolution MS instrument. This time- and mass-segmented acquisition method generates, in a single injection, fragment ion spectra of all precursor ions within a user-defined precursor RT and m/z space and records the ensemble of these fragment ion spectra as complex fragment ion maps. Using computer simulations we show that the resulting maps achieve the highest fragment ion specificity of any DIA method described to date. We term this acquisition strategy "SWATH MS," in reference to the swaths that are conceptually referred to designate the series of isolation windows acquired for a given precursor mass range across the LC.

To analyze the high specificity, multiplexed data sets generated by SWATH MS, we developed a novel data analysis strategy that fundamentally differs from the database search approaches used so far to identify peptides from DIA data sets.
It consists of using a targeted data extraction strategy to query the acquired fragment ion maps for the presence and quantity of specific peptides of interest, using a priori information contained in spectral libraries. Practically, the fragment ion signals, their relative intensities, chromatographic concurrence, and other information accessible from a spectral library for each targeted peptide are used to mine the DIA fragment ion maps for constellations of signals that precisely correlate with the known coordinates of a targeted peptide, thus uniquely identifying the peptide in the map. The extraction of fragment ion traces from data-independently acquired sample sets has been reported for the quantification of formerly identified peptides (18); however, this strategy has never been purposely used to systematically search and identify peptides from the fragment ion maps of DIA data sets. Indeed, it is only with the increasing availability of proteome-wide spectral libraries that this targeted data extraction strategy becomes largely applicable to mine the acquired data sets for peptides never identified thus far with regular shotgun proteomics approaches.

We show that the combination of high specificity fragment ion maps and targeted data analysis using information from spectral libraries of complete organisms offers unprecedented possibilities for the qualitative and quantitative probing of proteomes. This approach should be applicable beyond proteomics to other “omics” measurements, including metabolomics and lipidomics, or to forensics or biomedical analytics fields, which require accurate quantitative analysis of as many analytes as possible from a single LC-MS/MS sample injection.

MATERIALS AND METHODS

LC-MS Sample Acquisition—A commercial 5600 TripleTOF™ (ABSciex, Concord, Canada) was used for all the experiments. The instrument was coupled with an Eksigent 1D Nano LC system (Eksigent, Dublin, CA) for the stable isotope dilution experiments or with an Eksigent NanoLC-2DPlus with nanoFlex cHiPLC system for the diauxic shift sample acquisition. The same solvents were used on both LC systems, with solvent A being composed of 0.1% (v/v) formic acid in water and solvent B comprising 95% (v/v) acetonitrile with 0.1% (v/v) formic acid. The serial dilution experiments were performed with a customer-packed emitter, which was created using a laser puller to an orifice of 4 μm and packed with 3-μm Zorbax C18 column using a pressure bomb. The samples were loaded directly onto this column from the nano LC system at a flow rate of 500 nl/min. The loaded material was eluted from this column in a linear gradient of 5% solvent B to 30% solvent B over 90 min. The column was regenerated by washing at 90% solvent B for 10 min and re-equilibrated at 5% solvent B for 10 min. The diauxic shift sample acquisitions were performed using a “trap and elute” configuration on the nanoFlex system. The trap column (200 μm × 0.5 mm) and the analytical column (75 μm × 15 cm) were packed with 3 μm ChromXP C18 medium. The samples were loaded at a flow rate of 2 μl/min for 10 min and eluted from the analytical column at a flow rate of 300 nl/min in a linear gradient of 5% solvent B to 35% solvent B in 155 min. The column was regenerated by washing at 80% solvent B for 10 min and re-equilibrated at 5% solvent B for 10 min.

For standard data-dependent analysis experiments, the mass spectrometer was operated in a manner where a 250-ms survey scan (TOF-MS) was collected, from which the top 20 ions were selected for automated MS/MS in subsequent experiments where each MS/MS event consisted of a 50-ms scan. The selection criteria for parent ions included intensity, where ions had to be greater than 150 counts/s with a charge state greater than 1 and were not present on the dynamic exclusion list. Once an ion had been fragmented by MS/MS, its mass and isotopes were excluded for a period of 15 s. Ions were isolated using a quadrupole resolution of 0.7 Da and fragmented in the collision cell using collision energy ramped from 15 to 45 eV within the 50-ms accumulation time. In the instances where there were fewer than 20 parent ions that met the selection criteria, those ions that did were subjected to longer accumulation times to maintain a constant total cycle time of 1.25 s.

For SWATH MS-based experiments, the mass spectrometer was operated in a looped product ion mode. In this mode, the instrument was specifically tuned to allow a quadrupole resolution of 25 Da/mass selection. The stability of the mass selection was maintained by the operation of the Radio Frequency (RF) and Direct Current (DC) voltages on the isolation quadrupole in an independent manner. Using an isolation width of 26 Da (25 Da of optimal ion transmission efficiency + 1 Da for the window overlap), a set of 32 overlapping windows was constructed covering the mass range 400–1200 Da. Consecutive swaths need to be acquired with some precursor isolation window overlap to ensure the transfer of the complete isotopic pattern of any given precursor ion in at least one isolation window and thereby to maintain optimal correlation between parent and fragment isotope peaks at any LC time point (supplemental Fig. S1, a–f). This overlap was reduced to a minimum of 1 Da, which experimentally matched the almost square shape of the fragment ion transmission profile achieved through the specific quadrupole tuning developed for SWATH MS (supplemental Fig. S1, g and h). The window setups used for these runs were as follows: Experiment 1: MS1 scan (see below); Experiment 2: 400–426; Experiment 3: 425–451… Experiment 33: 1175–1201. Those isolation windows of 26-Da width (25 Da + 1 Da) are the “nominal” windows used to compute the RF/DC voltages used to drive the isolation quadrupole during the acquisition. However, because the isolation windows are only “almost square shapes” (supplemental Fig. S1, g and h), 0.3–0.5 Da of ion transmission can be estimated as being “lost” on either side of the windows. The “100% efficient” transmission of precursor ions therefore effectively happens over only 25 Da. In other words, the “effective” isolation windows can be considered as being 400.5–425.5, 425.5–450.5, etc. (plus the potential overlap left from the nominal window transmission). The collision energy for each window was determined based on the appropriate collision energy for a 2+ ion centered upon the window with a spread of 15 eV. This ensured optimal fragmentation for the broad range of precursors co-selected within the isolation windows. An accumulation time of 100 ms was used for each fragment ion scan and for the (optional) survey scans acquired at the beginning of each cycle. This results in a total cycle time of 3.3 s (3.2 s total for stepping through the 32 isolation windows + 0.1 s for the optional survey scan). The mass resolution was between 15,000 and 30,000 for the MS/MS scans, depending on the mode used to record the SWATH MS data sets (high sensitivity or high resolution). For this study, the high sensitivity mode was used, which still allows accurate extraction of the fragment ion masses at 10–50 ppm accuracy (optimal extraction for the area under curve of the MS/MS profile signals at half peak width).

Fragment Ion Interference Simulations—To generate the background for our simulations, the Saccharomyces cerevisiae protein sequences were downloaded from ensembl.org (release 57_1j). The peptide set resulting from trypsin proteolysis (no missed cleavages) was generated in silico using carbamidomethyl cysteine as fixed modification. We then selected the peptides with theoretical precursor ion charge states 2+ and 3+ and with the monoisotopic and the first ¹³C isotopic masses (+0 and +1 Da) within the mass range of 400 to 1,200 m/z. For each of those precursor ions, the theoretical set of fragment ions was generated (all b and y ions of charge 1+ and 2+), giving rise to transition pairs. This data set contained 111,880 peptides (corresponding to 6,557 proteins) resulting in 194,314 doubly and triply charged precursors (388,781 overall, taking into account the monoisotope and first ¹³C isotope) and in 10,004,504 transitions altogether, which thus constituted the background of our simulations. We also prepared a reduced data set that only contained the precursors of peptides that were reported in the PeptideAtlas (Yeast PeptideAtlas 200904 build, also containing the MS-identified modifications and nontryptic peptides). This reduced data set contained 48,087 peptides (corresponding to 3,898 proteins), resulting in 93,875 doubly and triply charged precursors (187,777 overall, taking into account the monoisotope and first ¹³C isotope) and in 5,476,964 transitions altogether that we used as a more realistic proteomic background. The retention times of the peptides were computed using the SSRCalc algorithm (50). To estimate the number of SRM interferences, we generated in silico query assays for all proteotypic peptides of the yeast genome as targets. We considered all singly charged b and y transitions of the monoisotopic 2+ precursor of those peptides as targets and ran them against the computed backgrounds (theoretical yeast digest or PeptideAtlas) and recorded an interference whenever a transition from the background (that did not belong to the query peptide) was within a specified distance of Q1, Q3, and RT from the target queried peptide. For each target peptide, the number of transitions that were interfered with was recorded and later used to compute the statistics. The detailed algorithm for the computation of the product ion interferences will be the subject of a separate study (51). This algorithm essentially expands on the principle of the “unique ion signatures” described by Sherman et al. (29) by taking into account peptide RT as an additional constraint for the calculation of fragment ion interferences. It should be acknowledged that the current algorithm does not simulate the peptide signal intensities. Even if, in theory, the different MS response factors of those peptides could be retrieved from, for example, the PeptideAtlas database, it is unlikely that those response factors can be extrapolated from one sample to another or from one study to the other, because of ion suppression effects during the ionization. However, it was not the aim of these theoretical simulations to perfectly depict the reality, but rather to give an impression about the overall ranking of the different Q1/Q3 scenarios. In this respect, the simulations are valid because upon increasing the background for the fragment ion simulations (from 93,875 to 194,314 precursors), the overall ranking of the scenarios is maintained. This means that those simulations may not capture the exact reality but can be used as a tool to compare the extent of fragment ion interference for different scenarios, exactly as it is applied here. Finally, a recent study has experimentally quantified the extent of fragment ion interference in a human cell lysate (12), and the results are in good agreement with our simulations overall.

Serial Dilution Samples for the Limit of Detection, Limit of Quantification, and Intrascan Dynamic Range Assessment—All of the isotopically labeled reference peptides were ordered from Thermo or Sigma-Aldrich with amino acid analysis-certified concentrations. The sequences of the 61 peptides used for the limit of detection experiment, as well as their precursor ion masses (defining the swath in which to extract the daughter ions) and fragment ion masses (used to extract the fragment ion chromatograms), are provided in supplemental Table 1. From those 61 peptides, 23 were kept at a constant concentration of 23.5 fmol/μl (47 fmol total amount loaded on column for a 2-μl injection) in all the samples, whereas the other 38 peptides were serially diluted from 23.5 fmol/μl to 45 amol/μl (from 47 fmol to 91 amol total amount loaded on column for a 2-μl injection) in 2-fold steps. The reference peptides were spiked in a 1 μg/μl concentrated ¹⁵N-labeled yeast tryptic digest as proteomic background (¹⁵N-labeled yeast trypsin digest was used as proteomic background to avoid the interference between the b fragment ions from endogenous ¹⁴N yeast peptides and those from the reference peptides). The peptides kept at the constant concentration through all the samples were used to estimate the coefficient of variation throughout the experiment, whereas the other 38 diluted peptides were used to estimate the limit of detection and limit of quantification of the SWATH MS acquisition method. The raw values (nondenoised, nonsmoothed) of the extracted peak areas for each of the fragment ions of the peptides in each of the samples are provided in supplemental Table 1. For the limit of detection and quantification, a peak group composed of the three extracted fragment ion traces as well as a signal to noise ratio above 3 (respectively 10) for the considered peak were required. For the comparison with the MS1/label-free based analysis, the precursor ion traces were extracted from the survey scans of the SWATH MS acquisition. Because the accumulation time in SWATH MS mode was 100 ms for each fragment ion scan and for the survey scans, the peak areas of the fragment ion traces (extracted from their swath) and of the precursor ion traces (extracted from the survey scan) were directly and fairly comparable. Increasing the acquisition time for the survey scan could have increased the signal of the precursors of interest but would have also equally increased the overall noise/interference signals, therefore not really affecting the detection/quantification of those (supplemental Fig. S7, a5–a8). Also, a 100-ms acquisition time for MS1 scans is in the range of what may be experimentally used during a typical shotgun experiment aiming at high numbers of precursor ion selection and identification while still allowing for MS1 label-free quantification (48). For the comparison with shotgun analysis, the same serial dilution samples that had been acquired by SWATH MS were reacquired with a “top 20” DDA method and searched with Mascot for peptide identification. The peptides used for the intrascan dynamic range experiment have already been used in a different context in our laboratory (49). They consist of the two following sequences: AADITSL*YK* (where L* indicates ¹³C,¹⁵N, and K* indicates ¹³C₆,¹⁵N₂) for the doubly isotopically labeled peptide 1 and AADITSLYK* (where K* indicates ¹³C₆,¹⁵N₂) for the singly isotopically labeled peptide 2. For the intrascan dynamic range experiment, the singly isotopically labeled peptide 2 was kept at a constant concentration of 625 fmol/μl (1.25 pmol total amount loaded on column for a 2-μl injection) in all the samples, whereas the doubly isotopically labeled peptide 1 was serially diluted from 625 fmol/μl to 305 amol/μl (from 1.25 pmol to 610 amol total amount loaded on column for a 2-μl injection) in 2-fold steps. Those peptides were spiked in a 1 μg/μl concentrated yeast tryptic digest as proteomic background. The samples were acquired by SWATH MS. The fragment ion traces of the three most intense fragments were extracted from their swath (475–500 m/z in this case), and the precursor ion traces were extracted from the survey scans from the SWATH MS data sets. The raw values (nondenoised and nonsmoothed) of those extracted peak areas in each of the samples are provided in supplemental Table 2. The fragment ion traces and MS/MS spectra around the y7 fragment at the RT of the peptides' elution are provided as supplemental Fig. S6.

Data Analysis—As for SRM targeted acquisition, the three to five most intense fragments (of proteotypic peptides, as reported in the spectral libraries) were typically selected to perform the targeted data analysis of the SWATH MS data sets. Because those MS/MS spectral libraries were usually generated on low resolution instruments (i.e., triple quadrupoles or ion traps), the high mass accuracy value of those fragments was recalculated theoretically based on the amino acid sequence of the peptide.
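As a rough illustration of the theoretical recalculation mentioned above, the following Python sketch computes monoisotopic m/z values for the b and y ions of an unmodified peptide from its amino acid sequence. The function names and the restriction to unmodified residues (no carbamidomethylation, no heavy labels) are assumptions made for this example; the sketch is not the procedure or software used in this study.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # monoisotopic mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)


def fragment_mz(sequence, charge=1):
    """Return theoretical m/z lists for b1..b(n-1) and y1..y(n-1).

    Hypothetical helper: unmodified residues only.
    """
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    b_ions, y_ions = [], []
    for i in range(1, len(masses)):
        b_neutral = sum(masses[:i])             # N-terminal residues
        y_neutral = sum(masses[-i:]) + WATER    # C-terminal residues + H2O
        b_ions.append((b_neutral + charge * PROTON) / charge)
        y_ions.append((y_neutral + charge * PROTON) / charge)
    return {"b": b_ions, "y": y_ions}


def precursor_mz(sequence, charge=2):
    """Return the theoretical m/z of the intact, unmodified peptide."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    ions = fragment_mz("AADITSLYK")
    print("y ions:", [round(mz, 4) for mz in ions["y"]])
    print("2+ precursor:", round(precursor_mz("AADITSLYK"), 4))
```

For the unlabeled sequence AADITSLYK, this gives a doubly charged precursor near 491.27 m/z, which falls in the same 475–500 m/z swath that the text reports for the labeled forms of this peptide.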
Those high mass accuracy fragment ion masses were then used as seeds for extracting ion chromatograms in the right LC-MS/MS swath map (indicated by the precursor ion mass). In the case of “borderline” peptides (i.e., peptides with precursor mass falling at an edge of an isolation window or in the zone of isolation window overlap), the fragment ion traces were extracted in the swath with borders furthest away from the precursor ion mass and/or containing most of the isotopic distribution of the precursor ion. All of the fragment ion chromatograms were extracted and automatically integrated with PeakView (v. 1.1.0.0). The raw peak areas as reported by PeakView were used for all the quantification calculations with no data processing (neither denoising nor smoothing, etc.) of any kind applied to the extracted ion chromatograms.

To assess the detection of a peptide, we used as a first pass the LC validation criteria for the extracted fragment ion traces suggested by Reiter et al. (9): co-elution, peak shape similarity, correlation of the relative intensities with reference spectra, correlation of the relative intensities with those of a spike-in reference peptide, co-elution with spiked-in reference, and peak shape similarity with spiked-in reference. The peptide retention time (predicted, e.g., with SSRCalc or experimental when available in spectral libraries) may be used to reduce the chromatographic space in which to look for the targeted peak group (similarly to scheduled versus nonscheduled SRM). This is not absolutely necessary for intense signals, but the gain in identification validation can be important for lower intensity signals (or for noisy fragment ion signals over the LC space). To anticipate when a peptide is expected to elute, a simple retention time re-alignment for each gradient or column can be performed to recalculate the retention time relative to its RT available in the spectral library. This is typically done by using a set of reference peptides (relative to which the retention times of the peptides of the spectral library were recorded), which are spiked into each sample prior to the SWATH MS measurement. Those peptides are used to recalibrate the retention times for each specific run and to help to restrict the extraction of peptides from the library to a reasonable elution time window in each SWATH MS run. In contrast to SRM measurements, the inclusion of such reference peptides therefore does not consume additional data acquisition time.

Finally, several SWATH MS-specific additional criteria may be used to confirm (or invalidate) the peptide identification: confirmation that the extracted fragment ions correspond to monoisotopic signals and verification of the charge state of those fragments in the full MS/MS spectra extracted from the swath around the apex of the candidate peak group, assessment of the mass accuracy (typically 5–10 ppm) of the extracted fragment ions, etc. A step-by-step tutorial describing the manual or automated targeted data analysis of SWATH MS data sets is provided in the supplemental materials.

Database Search of the Reference Peptides of the Dilution Series—The Mascot database search analysis for the reference peptides of the dilution series was performed on Mascot v. 2.4 with a self-compiled database comprising the 61 reference peptides grouped in three artificial proteins based on their abundances as reported in the Weissman list (32). The enzyme selected was trypsin with no missed cleavage. A search tolerance of 50 ppm was specified for the peptide mass tolerance and 0.05 Da for the MS/MS tolerance. The charges of the peptides to search for were set to 2+, 3+, and 4+. The search was set on monoisotopic mass. The instrument was set to ESI-QUAD-TOF. The following modifications were specified for the search: carbamidomethyl cysteines as fixed modification, C-terminal heavy lysine, C-terminal heavy arginine, and oxidized methionine as variable modification.

Diauxic Shift Samples for the Quantification Accuracy Assessment—The diauxic shift samples used in our experiments were the same as those analyzed by SRM in an earlier study (15). The sample set prepared for that study consisted of the tryptic digest of a mixture of (i) a lysate of yeast cells grown in regular ¹⁴N medium and sampled throughout the metabolic shift from fermentation to respiration spiked with (ii) a constant ¹⁵N-labeled yeast lysate background used as an internal standard for the fold change calculations. For our analysis, we reacquired, in SWATH MS mode, samples 1 and 8 prepared for that study, which constitute the start and end points of the diauxic shift (15). The biological triplicates of samples 1 and 8, prepared for the SRM study, were pooled for the SWATH MS reacquisition to reach enough volume for the sample injection. However, this pooling prevented us from comparing the standard deviations directly between the two studies (the SRM analysis contained the biological sample preparation variability from the triplicate analysis). Therefore, we decided not to present the error bars corresponding to the SRM and SWATH MS standard deviations in Fig. 4B, because they actually captured different information. The error bars were, however, reported individually for the SRM and the SWATH MS study in separate plots provided in supplemental Table 5 for the reader to appreciate the low standard deviations achieved by the SWATH MS quantification. The SWATH MS data analysis consisted of extracting the fragment ion traces of the precursors from the corresponding swath and reporting the peak areas automatically integrated by PeakView for those extracted chromatograms. To avoid introducing any bias in the data analysis, the calculation of the abundance fold changes of the proteins was exactly copied from that of the SRM study (15). In short, for each peptide transition, the ratio of the light over heavy peak areas was calculated individually for each sample (samples 1 and 8); the abundance fold change for each transition was then calculated by dividing each transition ratio of sample 8 by the corresponding ratio of sample 1; the final abundance fold change of a protein was then calculated by averaging the individual abundance fold changes of each of its transitions. The raw values (nondenoised and nonsmoothed) of those extracted peak areas in each of the samples are provided in supplemental Tables 4 and 6–8.

To query the SWATH MS data sets for the 60 yeast mitochondrial proteins involved in oxidative phosphorylation of the respiratory chain (as listed in the Kyoto Encyclopedia of Genes and Genomes website, http://www.genome.jp/kegg/), we used 287 proteotypic peptide assays that were available in our spectral libraries. The relatively low success rate for the identification of these mitochondrial proteins (36 of 60 proteins identified) can be explained by the fact that the protocol used for the preparation of those samples was originally devised for the quantification of the metabolic enzymes analyzed by SRM (15) and was therefore not optimized for the recovery of membrane mitochondrial proteins.

The raw (.wiff) files of the diauxic shift samples 1 and 8 acquired in SWATH MS mode may be downloaded from ProteomeCommons.org Tranche using the following hash code: He8q40Zqudc27nm V1fUpqMhPmhVzVVlYqDMNFKcI9dVSZzGkInuXjK9Mg7iBexSZ6eY mGCRkYYp5TgSD6FTqcC5qW2sAAAAAAAAGCg.

RESULTS

We describe a new concept for the accurate, reproducible, high throughput identification and quantification of proteomes by mass spectrometry. It combines a high specificity data-independent LC-MS/MS acquisition method with a targeted data extraction and analysis strategy.

Data-independent Data Acquisition

The acquisition method essentially extends the DIA approach initially described by Venable et al. (18). It consists of recording consecutive high resolution fragment ion spectra of all precursors within a user-defined precursor ion window. This is achieved by stepping the precursor isolation window of a quadrupole-quadrupole TOF instrument in 25-Da increments (defining the swath width) recursively during the entire LC separation (Fig. 1A). At 100-ms accumulation time per swath, the quadrupole-accessible 400–1200 m/z range is covered in 32 steps for a total cycle time of 3.2 s, which is sufficient to reconstruct the 30-s chromatographic peak of each analyte for accurate quantification. The data structure can thus be conceptualized as 32 successive MS2 maps consisting of the composite fragment ion spectra from all the analytes fragmented in each swath (Fig. 1B). Similar to other windowed DIA methods (20, 23), consecutive swaths were acquired with some precursor isolation window overlap to ensure the transfer of the complete isotopic pattern of any given precursor ion in at least one isolation window and to thereby maintain optimal correlation between parent and fragment isotope peaks at any LC time point (supplemental Fig. S1, a–f). This overlap was reduced here to a mere minimum of 1 Da. This value experimentally matched the almost square shape of the fragment ion transmission profile (supplemental Fig. S1, g and h), which was achieved through specific quadrupole tuning purposely developed for SWATH MS. Finally, to ensure optimal fragmentation for the broad range of precursors co-selected within each isolation window, a 15 eV ramping of collision energy was used, centered around the optimal collision energy required to fragment a doubly charged precursor centered in the middle of the isolation window.

FIG. 1. SWATH MS data-independent acquisition and targeted data analysis. A, the data-independent acquisition method consists of the consecutive acquisition of high resolution, accurate mass fragment ion spectra during the entire chromatographic elution (retention time) range by repeatedly stepping through 32 discrete precursor isolation windows of 25-Da width (black double arrows) across the 400–1200 m/z range. The series of isolation windows acquired for a given precursor mass range and across the LC is referred to as a “swath” (e.g., series of the red double arrows). The cycle time is defined as the time required to return to the acquisition of the same precursor isolation window. Note that the dotted line before the beginning of each cycle depicts the optional acquisition of a high resolution, accurate mass survey (MS1) scan. B, representation of the actual data acquired in one swath (450–475 m/z range) shown here as an MS2 map, with retention time as the abscissa, fragment ion m/z as the ordinate, and ion intensity represented by color intensity. The darker horizontal band visible between 450 and 475 m/z corresponds to residual precursor ions for this swath. The signals co-eluting in the vertical direction are likely fragment ions originating from the same precursor ion. C, the targeted data analysis consists of retrieving the most intense fragment ions of a peptide of interest from a spectral library (list of fragment masses for the ¹⁵N-labeled peptide WIQDADALFGER or the corresponding C-terminal isotopically labeled reference) and extracting those fragment ion traces in the appropriate 700–725 swath using a narrow m/z window (e.g., 10 ppm). These fragment ion traces can be plotted as overlaid extracted ion chromatograms, similarly to SRM transitions. The peak group displaying the best co-eluting characteristics and matching best to the peak group of extracted reference fragment ion traces identifies and quantifies the target peptide. D, the complete high resolution, accurate mass fragment ion spectra underlying the best candidate peak group can be extracted from the raw data. These spectra can be inspected to confirm that the extracted signals originate from mass accurate monoisotopic fragment ions with the right charge state (e.g., lower panel zooms on the y4 (green box) and y10 (blue box) fragments, with the endogenous and reference peptide fragments annotated with open or closed circles, respectively). They can also be extensively annotated to strengthen the identification of the peptide (top panel).

Like other DIA methods (17–24), SWATH MS performance is directly impacted by the width of the precursor isolation window. In principle, large isolation windows are preferable to cycle through a wider precursor mass range with faster cycling rates or with increased dwell times. However, large isolation widths increase the number of precursors concurrently fragmented in the respective window, increasing the likelihood of overlap of fragment ions from different precursors (fragment ion interference). The rate of fragment ion interference also depends on the mass accuracy and resolution of the fragment ion signals (supplemental Fig. S2). Using computer simulations, we assessed whether the signals in the complex fragment ion maps acquired by SWATH MS were sufficiently specific to support conclusive identification and quantification of peptides. As a benchmark, we used the specificity and accuracy levels of SRM, the gold standard MS quantification method. With the tool “SRM-Collider,” we computed the occurrence of fragment ion interferences for various combinations of precursor isolation window width and fragment ion mass accuracy. This tool extends the principle of the unique ion signatures described by Sherman et al. (29) by taking into account peptide RT as an additional constraint for the calculation of fragment ion interferences. As the basis for the simulations, we computed theoretical fragment ion spectra for 93,875 doubly and triply charged precursors corresponding to the tryptic peptides of 3,898 yeast proteins reported in the PeptideAtlas database (www.peptideatlas.org). Those represent essentially the complete yeast proteome observable by mass spectrometry (30) and therefore constitute a realistic proteomic background. Cumulative plots depicting the percentage of peptides observable with a given number of interference-free transitions as a measure for correct peptide identification and quantification are shown in Fig. 2A for SRM (0.7- and 1-Da isolation widths for precursor and fragment ions, respectively) and SWATH MS (25-Da swath width, 10-ppm fragment ion accuracy) scenarios. A histogram representing the percentage of peptides with five or more interference-free transitions is shown in Fig. 2B. Both figures show that SWATH MS provides a fragment ion specificity that is comparable with that achieved with standard SRM setups.

Because the extensive shotgun data sets from yeast proteome mapping studies possibly underestimate the complexity of real samples, we compared the specificity of SWATH MS fragment ion maps with the specificity achievable by SRM in a more complex background. The simulations were repeated by including all of the doubly and triply charged precursors (194,314 precursors, corresponding to 6,557 proteins; data from ensembl.org) of a complete in silico yeast tryptic digest. As expected, the extent of fragment ion interferences with this more complex background was higher for the different scenarios.

Da-wide swath and 20–30-s RT elution segment of that pre-

FIG. 2. Simulated fragment ion interferences for various LC-MS/MS acquisition scenarios. A, fragment ion interference cumulative plots are computed as described under “Materials and Methods” by taking into account fragment ions from doubly charged yeast tryptic peptide precursors against the fragment ions from the doubly and triply charged yeast tryptic peptides reported in PeptideAtlas (www.peptideatlas.org). The distributions of peptides with specific numbers of interference-free transitions are shown for the following simulations (precursor and fragment ion isolation, respectively): 0.7 Da/0.7 Da (open diamonds), 25 Da/10 ppm (black squares), 1 Da/1 Da (open triangles), 2.5 Da/1 Da (crosses), 10 Da/1 Da (asterisks), and 800 Da/10 ppm (open circles). Simulation plots for other background or acquisition scenarios are available in supplemental Fig. S3. B, the fraction of peptides observable with five or more interference-free transitions for the various acquisition scenarios is presented in the histogram with white bars. Accordingly, the shaded bars represent the fraction of peptides having less than four interference-free transitions.
However, the relative specificity offered by cursor. Using the 93,875 yeast tryptic precursors from the SWATH MS versus SRM remained qualitatively the same PeptideAtlas database, the simulations indicated that, for (supplemental Fig. S3). 75% of the peptides, more than 20 additional precursors As a comparison, we checked whether previous DIA meth- (median  40) were expected to be present in the specified ods would also provide sufficient fragment ion specificity to window (supplemental Fig. S4). These numbers illustrate the support the identification of peptides using a targeting data extent of precursor co-selection, and by inference, the frag- analysis strategy. We simulated the fragment ion interfer- ment ion spectral complexity that is generated when wide ences for various sequential windowed DIA methods on low isolation windows are used. These simulations suggest that resolution instruments (scenarios with 2.5-Da/1-Da, or 10-Da/ analyzing such data sets with traditional DIA database search 1-Da swath width and fragment ion accuracy, respectively) or strategies remains highly challenging. for DIA methods on high resolution instruments without iso- To analyze the SWATH MS data sets, we therefore imple- lation window (scenario with 800-Da swath width and 10-ppm mented a data mining strategy that is conceptually similar to fragment ion accuracy). Fig. 2 and supplemental Fig. S3 show targeted mass spectrometry by SRM. However, in contrast to that none of the former DIA methods are able to reach the SRM, the signals used for peptides identification and quanti- level of fragment ion specificity of SRM or SWATH MS and are fication are specified postacquisition and can therefore be therefore not amenable to accurate targeted data mining with- flexibly adapted or optimized. The data analysis strategy is out prior raw data filtering. schematically illustrated in supplemental Fig. S5. 
The pro- cess starts by selecting, from reference spectral libraries such Targeted Data Analysis of SWATH MS Fragment Ion as SRMAtlas (31), a suitable set of fragment ions from pep- Maps tides proteotypic for the proteins of interest. In SRM, those fragment ion masses are transition coordinates for the tar- Using the same rationale used above for the simulation of fragment ion interferences in MS2 maps, we computed the geted acquisition. In SWATH MS, those fragment ion masses overall precursor ion distribution in the LC-MS1 space. For are used to extract ion chromatograms from the acquired data this, we counted for each precursor the number of doubly and sets that are then combined into an identifying peak group. triply charged peptides concurrently coinciding within the 25- Fig. 1C provides an example of ion traces for the four most 10.1074/mcp.O111.016717–8 Molecular & Cellular Proteomics 11.6 Targeted Data Extraction for Proteome Analysis FIG.3. Limit of detection and intrascan dynamic range. A, the areas (y axis) of the precursor ion extracted from the survey scan (open squares) and of the most intense fragment ion extracted from the SWATH MS (closed triangles) and SRM (black crosses) quantifications are shown for the different serial dilution experiments (injected amounts of the peptide ELGQSGVDTYLQTK diluted in a yeast tryptic background in the x axis). The Mascot scores of the peptide identified in the same dilution series samples but acquired in DDA mode are shown as open circles. The limits of detection for the different methods are indicated with dotted lines. The complete series of LOD plots and corresponding lists of peak areas for the precursor and fragment ion traces quantified during these dilution series experiments are provided in supplemental Table 1 for the full set of 61 reference peptides. 
B, similar quantification plot for the doubly isotopically labeled peptide AADITSLYK serially diluted in a yeast tryptic background is shown here for the most intense fragment ion with closed triangles (“LOD control”). The intrascan dynamic range experiment consists of a dilution series of the same peptide AADITSLYK (open squares, “intrascan diluted”) in the presence of a constant amount of a singly isotopically labeled peptide AADITSLYK (open diamonds, “intrascan constant”), in the same yeast tryptic background. The complete lists of peak areas for the precursor and fragment ion traces quantified during the dilution series and intrascan dynamic range experiments are provided in supplemental Table 2. Screenshots of the quantified fragment ion traces and of the MS/MS spectra (zoomed around the y7 fragment) underlying the peptide peak apex are provided in supplemental Fig. S6 for the sample sets of the intrascan dynamic range experiment. intense fragments of the endogenous peptide WIQDADALF- Performance of SWATH MS Coupled to Targeted Data GER that is proteotypic for yeast protein RIR2. The protein Extraction has an expected abundance of 500 copies per cell (32). The Limit of Detection, Limit of Quantification, and Intrascan traces were extracted from a N-labeled yeast tryptic digest Dynamic Range—The LOD of the method was assessed by data set acquired by SWATH MS, specifically in the swath measuring dilution series of 61 reference peptides containing 700–725 that contained the 719.318 m/z doubly charged isotopically labeled lysine or arginine C termini, spiked into a precursor. The data show that around the RT of 53.9 min, the background of a N-labeled yeast tryptic digest. 
Among extracted ion chromatograms form a peak group that identi- those, 38 peptides were serially diluted, covering a range of fies the queried peptide, based on the same criteria com- 47 fmol to 91 amol, and 23 were kept constant at 47 fmol monly used by automated SRM analysis tools (e.g., mProphet each. The samples were subjected to SWATH MS acquisition, (9) or Skyline (33)) such as co-elution of the fragment ions and the ion traces for the three most intense fragment ions for traces, correlation of the relative fragment ion intensities with each reference peptides were extracted and integrated. The those of reference spectra, and more. The identification can resulting dilution plots show a limit of detection (signal to noise be further strengthened by checking the co-elution with a ratio above 3) and a limit of quantification (deviation from line- reference peptide spiked into the sample (Fig. 1C)orby arity above 30%) in the amol range for eight of the diluted extensively annotating the full fragment ion spectra implicitly peptides (Fig. 3A and supplemental Table 1). The coefficient of present in the SWATH MS data at that RT (Fig. 1D). As in variance was estimated as 13.7% for the peptides spiked at SRM, the quantification is intrinsically linked to the peptide constant concentrations (supplemental Table 1). identification (supplemental Fig. S5) and proceeds by integra- Next, we determined the intrascan dynamic range of the tion of the fragment ions traces across the chromatographic method, i.e., the fold change range separating the highest and elution of the validated peak group, with the optional use of isotopically labeled references for relative or absolute lowest signal intensities concurrently observable within a quantification. same fragment ion spectrum. 
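As a rough illustration of the criteria quoted in this section, the sketch below applies the stated thresholds (signal to noise ratio above 3 for the LOD, deviation from linearity of at most 30% for the LOQ) to a dilution series and expresses a dynamic range in orders of magnitude. It is not the authors' procedure. The function names, the use of the most concentrated point as the linearity reference, the rule of scanning downward until the first failing point, and all numbers are assumptions made for the example.

```python
import math

def lowest_passing(amounts, passes):
    """Scan from the highest amount downward and return the lowest amount
    reached before the first failing point (None if the top point fails)."""
    best = None
    for amount, ok in sorted(zip(amounts, passes), reverse=True):
        if not ok:
            break
        best = amount
    return best

def limit_of_detection(amounts, areas, noise_area, min_sn=3.0):
    """Lowest amount whose signal to noise ratio exceeds min_sn."""
    return lowest_passing(amounts, [a / noise_area > min_sn for a in areas])

def limit_of_quantification(amounts, areas, max_dev=0.30):
    """Lowest amount whose area stays within max_dev of a proportional
    response. The response factor is taken from the most concentrated
    point, which is an assumption of this sketch."""
    top_amount, top_area = max(zip(amounts, areas))
    factor = top_area / top_amount
    passes = [abs(area - factor * amt) / (factor * amt) <= max_dev
              for amt, area in zip(amounts, areas)]
    return lowest_passing(amounts, passes)

def dynamic_range_orders(highest, lowest):
    """Fold change between two signals, in orders of magnitude."""
    return math.log10(highest / lowest)

# Hypothetical dilution series (arbitrary units, not data from the paper).
amounts = [100.0, 10.0, 1.0, 0.1]
areas = [1000.0, 100.0, 14.0, 2.0]
noise_area = 1.0

print(limit_of_detection(amounts, areas, noise_area))
print(limit_of_quantification(amounts, areas))
print(dynamic_range_orders(1250.0, 0.125))
```

In this toy series the LOD falls one dilution step below the LOQ, because the third point is still detectable but deviates by 40% from the proportional response.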
For this, an isotopically labeled peptide pair was chosen such that both (co-eluting) precursors were co-selected within the same swath. The samples consisted of a yeast tryptic digest spiked with one peptide at a constant amount of 1.25 pmol loaded on column, whereas the isotopic counterpart was diluted in a stepwise manner (supplemental Table 2 and supplemental Fig. S6). The data were acquired in SWATH MS mode and analyzed as described above. Fig. 3B shows that the diluted peptide species could be detected and quantified linearly through a dynamic range of almost 4 orders of magnitude. It is noteworthy that the signal did not demonstrate saturation even at the highest peptide concentration, suggesting that dynamic range could be further extended by using higher peptide concentrations. Thus, the sensitivity of the method seems so far limited by the chemical or electronic noise of the measurement itself rather than by intrascan dynamic range considerations.

We then compared the performance of SWATH MS with that of other standard proteomic strategies. For the comparison with DDA, the LOD dilution series samples described above were analyzed on the same MS instrument running in "top 20" shotgun mode. The data were searched with Mascot, and the identification score for the 61 reference peptides was reported on the same plots as those from the SWATH MS-extracted fragment intensities (supplemental Table 1). The results indicate that, for 26 of the 38 diluted peptides, the database searches failed to identify the reference peptides even when those were spiked at concentrations that were 2–10-fold higher than the respective LOD in the SWATH MS data sets. It is noteworthy that all the missing peptide identifications were actually due to nonselected signals for MS/MS sequencing. This experimentally demonstrates that precursor ion detection/picking in the MS1 scans is less reliable than fragment ion signal extraction from the MS/MS scans.

To compare the performance of SWATH MS with that of label-free workflows, we integrated the precursor ion traces extracted from the MS1 scans present in the exact same set of files acquired by SWATH MS for the dilution series samples. For the acquisition of this data set, a survey scan was carried out at the beginning of each swath cycle using the same periodicity (3.2 s) and accumulation time (100 ms) also applied per swath window (Fig. 1A), thus providing the closest quantification comparison possible. The MS1 areas were reported on the same plots as the SWATH MS-extracted fragment intensities (supplemental Table 1). The results show that, in half of the cases (for 19 of the 38 diluted reference peptides), SWATH MS quantification at the fragment ion spectra level offers a 2–8-fold gain in sensitivity compared with the LOD based on precursor ion signals detected in the MS1 maps. Supplemental Fig. S7 provides such an example of diluted peptide (ANLIPVIAK) whose precursor is only detectable until 1.5 fmol in the MS1 scans, whereas its fragment ions are still unambiguously identifiable and quantifiable down to 180 amol by targeted data extraction of the MS/MS scans.

Finally, the LOD dilution series were analyzed on our most sensitive triple-quadrupole instrument operating in SRM mode (supplemental Table 3). This comparison showed that SRM was 10-fold more sensitive, placing SWATH MS coupled to targeted data extraction between SRM and MS1/label-free quantification workflows in terms of sensitivity.

Quantification Accuracy of SWATH MS-targeted Data Analysis—Next, we sought to benchmark the quantification accuracy of SWATH MS targeted analysis to that of SRM, the gold standard mass spectrometric quantification method. For this, we reacquired, via SWATH MS, samples 1 and 8 corresponding to the start and end points of a yeast diauxic shift experiment previously analyzed by SRM (15). Those samples consisted of tryptic digests of a mixture of (i) a lysate of yeast cells grown in regular ¹⁴N medium and sampled throughout the metabolic shift from fermentation to respiration and (ii) a constant ¹⁵N-labeled yeast lysate as internal standard for the fold change calculations. As a first pass analysis, the SWATH MS data set was mined with the exact same set of 476 transitions used to quantify the fold change of 80 peptides (45 metabolic enzymes) in the SRM study (15). From this initial data mining, mProphet automated analysis could validate 64 peptide identifications (1.5% false discovery rate; supplemental Fig. S8). Upon visual inspection of the extracted fragment ion traces, we could confirm the quantification for 40 proteins (72 peptides), whereas 5 proteins (8 peptides) were not convincingly detectable with this initial set of transitions (supplemental Tables 4 and 5).

FIG. 4. Quantification by SWATH MS of the abundance fold changes of 45 enzymes involved in the yeast central carbon metabolism during a diauxic shift experiment. A, schematic representation of the yeast central carbon metabolism network. The abundance fold changes of the enzymes quantified by SWATH MS (supplemental Tables 4–7) are coded with colors. The box shapes are indicative of the absolute abundances of the proteins determined for yeast in log phase growth (32). Those values constitute therefore an approximation of the absolute abundances of the enzymes at the beginning of the diauxic shift experiment, and based on which the fold changes are determined. B, correlation between the abundance fold changes of the same metabolic enzymes quantified by SRM (15) (abscissa) and SWATH MS (ordinate). The linear regression was calculated with the refined SWATH MS quantification values and without taking into account the fold changes for the proteins whose peptides presented light signal intensities below noise levels for the first time point of the diauxic shift (open diamonds), similar to the SRM study (15). Because the standard deviations for the SWATH MS and SRM quantifications are not directly comparable (see "Materials and Methods"), they are provided on separate plots in supplemental Table 5.

Unlike SRM data, SWATH MS data sets contain transition signals different from those originally extracted and fragmentation information for other peptides than those originally targeted. Taking advantage of this, we re-extracted, from the exact same two files, additional or alternative peptide fragment ion traces for proteins whose identification and/or quantification was compromised because of fragment ion interferences or low signal to noise ratio during the primary data extraction. This straightforward data reanalysis rescued quantification information for three of the five undetected proteins, by quantifying nine novel peptides (supplemental Table 6) and significantly improved the quantification accuracy for the three proteins displaying the highest standard deviations in the primary analysis (supplemental Table 7). Fig. 4A summarizes the final quantification results and confirms that enzymes from the glycolysis pathway show a slight (maximum 2-fold) down-regulation, whereas those involved in the glyoxylate and citric acid cycles show between 10- and 300-fold up-regulations, consistent with the data of the SRM study. For a more direct comparison with the SRM results, we also plotted the protein fold changes quantified with SWATH MS targeted analysis against those published in the SRM study. The correlation plot (Fig. 4B) shows an excellent linear correlation between the quantification results (slope 0.9, r 0.95) and benchmarks the quantification accuracy obtained by SWATH MS targeted analysis to the level of quality delivered by SRM data acquisition.
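The re-extraction described above relies on the same operation as the initial query: choose the swath whose isolation window contains the precursor, then sum the signal found within a narrow m/z window around each fragment ion of interest at every retention time. A minimal sketch follows, assuming contiguous 25-Da windows from 400 to 1200 m/z (the 1-Da overlap is ignored), a symmetric tolerance of 10 ppm on either side of the fragment m/z, and spectra held as plain Python tuples. The function names and the toy fragment m/z values are invented for illustration and correspond neither to the authors' software nor to real fragment masses.

```python
from typing import List, Sequence, Tuple

# One MS2 spectrum of a swath: (RT in min, m/z values, intensities).
Spectrum = Tuple[float, Sequence[float], Sequence[float]]

def swath_windows(start: float = 400.0, end: float = 1200.0,
                  width: float = 25.0) -> List[Tuple[float, float]]:
    """Contiguous precursor isolation windows covering [start, end)."""
    n = round((end - start) / width)
    return [(start + i * width, start + (i + 1) * width) for i in range(n)]

def swath_for_precursor(precursor_mz: float) -> Tuple[float, float]:
    """Return the window that contains the precursor m/z."""
    for low, high in swath_windows():
        if low <= precursor_mz < high:
            return (low, high)
    raise ValueError("precursor m/z outside the covered range")

def extract_trace(spectra: Sequence[Spectrum], fragment_mz: float,
                  ppm: float = 10.0) -> List[Tuple[float, float]]:
    """Extracted ion chromatogram: summed intensity within the ppm
    tolerance of fragment_mz, one point per spectrum."""
    tol = fragment_mz * ppm * 1e-6
    trace = []
    for rt, mzs, intensities in spectra:
        signal = sum(i for m, i in zip(mzs, intensities)
                     if abs(m - fragment_mz) <= tol)
        trace.append((rt, float(signal)))
    return trace

windows = swath_windows()
cycle_time_s = len(windows) * 0.1  # 100-ms accumulation per swath

# Toy spectra standing in for one swath (hypothetical values).
toy_swath = [
    (53.8, [500.000, 500.004, 600.000], [10.0, 5.0, 7.0]),
    (53.9, [500.001, 500.020], [40.0, 9.0]),
    (54.0, [499.990], [3.0]),
]
trace = extract_trace(toy_swath, 500.0)
print(swath_for_precursor(719.318), cycle_time_s, trace)
```

Several such traces, one per library fragment ion, would then be overlaid and scored as a peak group. With a cycle time of 3.2 s, a 30-s chromatographic peak is sampled roughly nine times in each trace.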
For Fig. 4, the complete lists of peak areas for the quantified fragment ion traces and corresponding protein abundance fold changes are provided in supplemental Tables 4–7.

To demonstrate the effect of the fragment ion mass accuracy and resolution on the quantification performance, we artificially relaxed the resolution of the SWATH MS measurements, postacquisition, to mimic either a data-independent acquisition on a high resolution instrument but without isolation window or a windowed acquisition on a low resolution instrument (simulating the conditions of MSᴱ/AIF (19, 21) or DIA (18) data sets, respectively, see Table 1). This was achieved in silico either by recombining the swaths prior to fragment ion chromatogram extraction at 10-ppm mass accuracy or by extracting the swaths data at 1-Da mass accuracy, respectively. The mProphet analysis results (supplemental Fig. S9) show that neither of those low specificity acquisition methods can match the number of identifications and quantification accuracy levels achieved by SWATH MS, especially for the proteins of low abundance.

Extending the Set of Quantified Proteins from SWATH MS Data Sets

SWATH MS data sets implicitly contain a permanent fragment ion spectral record for all precursors within the mass and hydrophobicity range covered by specific LC-MS/MS acquisition conditions. This allows, in principle, for probing the data sets in silico for any new protein of interest suggested by a first pass biological review of the data, a situation that is common for systems biology studies. To illustrate this capability, the diauxic shift data sets were queried for 60 yeast mitochondrial proteins (287 peptides) involved in oxidative phosphorylation of the respiratory chain. These were not covered in the initial SRM study but were a posteriori considered relevant in the context of the switch from fermentation to respiration that occurs upon the diauxic shift. The data reanalysis consisted of extracting, from the same diauxic shift files, fragment ion traces of those targeted peptides for which we had assay records in our yeast spectral libraries. From the list of mitochondrial proteins, we could confidently quantify the abundance fold change for 36 proteins (103 peptides), 19 of which were membrane-associated proteins from the respiratory chain (Fig. 5 and supplemental Table 8). As for the previous analysis, the abundance fold change was measurable for proteins spanning a wide range of protein abundances (from 395 to 8.8E copies/cell (32)).

FIG. 5. Extended quantification by SWATH MS of the abundance fold changes of mitochondrial enzymes during a diauxic shift experiment. Schematic representation of the respiratory chain and oxidative phosphorylation networks inspired by the Kyoto Encyclopedia of Genes and Genomes pathway representation (47). The abundance fold changes of the enzymes quantified by SWATH MS are coded with colors. The box shapes are indicative of the absolute abundances of the proteins with the same notices as those mentioned in Fig. 4. The complete list of peptides and of their fragment ions used to quantify those proteins, as well as the peak areas and corresponding protein abundance fold changes are provided in supplemental Table 7.

Identification of Post-translational Modifications

It is noteworthy that peptide modifications may also appear serendipitously as a result of the targeted data extraction of SWATH MS data sets. When the fragment ion traces used to query a peptide are shared with modified forms of that peptide and when those are extracted in the same swath, multiple peak groups matching the original query can be observed. Fig. 6 illustrates such a case for the ¹⁴N-labeled (light) and ¹⁵N-labeled (heavy) forms of the endogenous peptide MIEIMLPVFDAPQNLVEQAK (proteotypic for protein PDC1), queried in the yeast diauxic shift sample 8 (late time point). Additional, nonshared fragment ions can then be re-extracted to distinguish which peak group corresponds to the nonmodified or modified peptides, respectively (supplemental Fig. S9). In cases where the modified peptide is fragmented in a different swath, the shared fragment ion masses may still be used to specifically query for the modified peptide form in that swath. These data illustrate the potential of SWATH MS targeted data extraction for unambiguous modification site assignment by extracting specific fragment ions characteristic of the modified peptide sequence. This opens completely novel opportunities to discover (and quantify) unanticipated modified peptide species from DIA data sets by a strategy that does not suffer from the combinatorial explosion of the search space usually experienced with traditional post-translational modification database search approaches.

FIG. 6. Application of SWATH MS targeted analysis to identify peptide modifications. The six most intense fragment ion traces for the ¹⁴N-labeled (light) and ¹⁵N-labeled (heavy) forms of the peptide MIEIMLPVFDAPQNLVEQAK, extracted from the swath 750–775, are shown for the yeast diauxic shift sample y8 (late time point). None of the classical SRM criteria (fragments co-elution, light-heavy peptide co-elution, relative intensities of the fragment ions) can discriminate the three candidate peak groups found here. By extracting additional, nonshared fragment ion traces, the identification of the peptide can be confirmed, and the site of the oxidized methionine modification can be unambiguously assigned onto the peptide sequence (supplemental Fig. S10).

DISCUSSION

Among the various MS-based proteomic approaches, SRM is generally recognized as providing the most accurate and reproducible quantification results. The high degree of reproducibility is granted by the consistent recording, across the LC, of the intensities of predefined target fragment ions. This allows consistent tracking of the abundance of specific peptides of interest across multiple samples. At present, however, SRM suffers from relatively slow analysis rates and lacks the capability to dynamically refine or expand the measured peptides/proteins for extensive proteome investigations. To alleviate most limitations of targeted data acquisition, we propose here a targeted data analysis strategy that brings the consistent and accurate quantification capabilities of SRM to a level of extensive proteome coverage by mining the complete fragment ion records generated during data-independent acquisition.

Not all DIA methods may be appropriate for targeted data extraction. To reach the quantification accuracy of SRM with targeted data extraction, the LC-MS/MS acquisition has to provide fragment ion data of a level of specificity that is comparable with that of SRM. Based on our fragment ion interference simulations (Fig. 2), we adopted a sequential window DIA method operating with 25-Da isolation width. On a fast, high resolution MS instrument, this setup allows documentation, in a single injection, of highly specific and time-resolved fragment ion data for all the precursors within the 400–1200 m/z mass and the monitored LC range (Fig. 1). The data thus generated constitute a series of extensive fragment ion maps ideally suited for proteome-wide investigation by targeted data analysis. DIA acquisition using consecutive swaths is not novel per se (18, 20). However, its rationally designed implementation on a fast, high resolution MS instrument provides, for the first time for a DIA method, the level of data quality necessary for targeted data extraction. This acquisition method is now commercially available on the ABSciex 5600 TripleTOF instrument under the SWATH MS denomination.

It should be noted that this SWATH MS setup (recording 32 swaths of 25 Da at 100-ms dwell time) is only one of many acquisition sets that can be applied. Like other mass spectrometric methods, SWATH MS operates within a space of interdependent parameters, including dwell time, duty cycle, and precursor isolation window width that affect the limit of detection, signal specificity, dynamic range, and quantification accuracy. Depending on the biological application or sample complexity, other parameters, including windows of variable widths throughout the LC gradient, might prove more efficient. Also, fragment ion specificities similar to those achieved by SWATH MS may very well be reached by other DIA methods, upon higher resolution of co-eluting analytes (e.g., using multidimensional protein identification technology (MudPIT) (18), ultrahigh pressure liquid chromatography (UPLC) (19), or ion mobility shift), although the gain in fragment ion specificity offered by extensive fractionations was recently questioned (12).

To mine the fragment ion maps recorded during SWATH MS acquisition, we devised a targeted data extraction strategy that conceptually transposes, to the data analysis, principles originating from SRM targeted acquisition. This targeted data analysis strategy differs fundamentally from the traditional search approaches described so far to analyze DIA data sets. Specifically, this type of analysis does not rely on precursor ion mass detection nor involve MS/MS spectra matching of any kind (neither using traditional database searching tools nor spectral matching algorithms). Instead, it consists of extracting, from the SWATH MS data sets, several fragment ion chromatograms for each peptide of interest. Collectively, these trace groups identify the targeted peptide, as in SRM analysis (Fig. 1C). Because both the peptide identification and quantification are performed at the MS/MS level, without the precursor ion signal having to be explicitly detected in the survey scans, this strategy allows extensive exploration of the multiplexed MS/MS DIA data sets to a level that was not possible with the traditional clustering/database approaches.

This targeted extraction strategy, like SRM, depends on spectral libraries as prior knowledge, to guide the selection of the optimal set of fragment ion signals. For several species, proteome-wide reference spectral libraries have been completed and will be made public in the near future. These libraries are S. cerevisiae (P. Picotti, et al., submitted for publication), human (R. Aebersold, R. Moritz, et al., manuscript in preparation), and Mycobacterium tuberculosis (O. Schubert, J. Mouritsen, et al., manuscript in preparation). Given that robust and high throughput methods for the generation of such libraries have been developed (e.g., by systematically recording MS/MS reference spectra of chemically synthesized proteotypic peptides (34)), we anticipate that proteomes of additional species will be equally mapped out in the near future. Alternatively, spectral libraries may be generated for any sample by extensive DDA analysis using the same instrument. To increase the reliability of such libraries, consensus spectra can be generated from repeated observations of the same peptide using freely available tools (35–37). The use of reference spectra as a priori information guiding the targeted extraction of DIA data sets may be less error-prone than approaches relying on clustering the fragment and precursor ions based on their LC elution profiles. Indeed, targeted data extraction can identify and quantify two exactly co-eluting peptides (e.g., light and heavy labeled peptide forms), even if they are present at vastly different abundance levels (Fig. 3B), a situation that challenges clustering approaches (26) and requires recursive search implementations to deconvolute the multiplexed spectra (38).

To evaluate the limit of detection of the method, a set of isotopically labeled serial dilution experiments was performed and showed that SWATH MS acquisition coupled to targeted data analysis could identify and quantify peptides down to the hundred amol range (Fig. 3). Even though the method in its current setup was slightly less sensitive than SRM, it remains to be determined whether the systematic optimization of the SWATH MS acquisition parameter sets, e.g., the use of dynamically adjusted window widths and increased dwell times, can further improve the LOD of the method. Generally, performance comparisons of methods are problematic if the comparisons include too many variables such as different samples or instrument types, instrument settings, etc. We therefore compared data acquired by SWATH MS with data generated by DDA and by MS1 quantification using aliquots of the same sample measured on the same ABSciex 5600 TripleTOF instrument. Overall, SWATH MS outperformed the two other methods for the consistent detection and quantification of low abundance precursors, especially if complex samples were analyzed (Fig. 3 and supplemental Figs. S6 and S7 and supplemental Tables 1 and 2). This result corroborates observations from previous DIA reports (18, 20, 22, 24) and can be explained by an increased signal to noise ratio in the fragment ion maps compared with the survey scans. This also emphasizes that unbiased acquisition methods such as SRM and DIA are particularly well suited for the detection of low level analytes in complex samples because the detection and quantification is based on fragment ion signals without the explicit need to detect the precursor ion in a survey scan above noise.

The intrascan dynamic range of the method was also experimentally assessed and was shown to cover almost 4 orders of magnitude (Fig. 3B). Such extent of identification (and quantification) of co-eluting peptides spanning 4 logs of concentration, reliably detected here with targeted data extraction (supplemental Fig. S6), may be more challenging to achieve with traditional DIA analysis approaches relying on fragment ion preclustering and/or regular database searches. To our knowledge, this is indeed the first attempt to objectively evaluate the intrascan dynamic range of peptide identification/quantification for a DIA approach, even though this parameter is of utmost importance for proteome analyses, in particular if wide precursor isolation windows are being used. It is noteworthy that the most abundant precursor actually limits the dynamic range only for its specific isolation window and therefore does not affect the detection sensitivity achievable simultaneously in other swaths. Thus, for SWATH MS, an even greater dynamic range may be anticipated throughout the 400–1200 m/z range at each time point and across the LC-MS range as a whole. A wide intrascan dynamic range achievable in flow-through instruments like the quadrupole-quadrupole TOF instrument used in this study might be difficult to achieve with trapping instruments. Their limited ion trapping capacity restricts the number of peptide species that can be concurrently analyzed without compromising performance through space charging. On quadrupole-quadrupole TOF instruments, the ions are transferred through a quadrupole to the collision cell and to the TOF analyzer, irrespective of the number or abundance of co-selected precursors, a feature that is critical for reaching a high intrascan dynamic range with DIA methods using large isolation windows that produce high ion fluxes. Also, an optimal "square shape" for the ion transmission efficiency (as achieved here by decoupling the DC and RF voltages of the isolation quadrupole; supplemental Fig. S1) might be difficult to maintain throughout the entire isolation window width on current trapping devices and may require larger overlaps between adjacent swaths to ensure consistent quantification of the analytes transmitted at the border of the isolation windows. Therefore, whereas the principles of data-independent acquisition with swaths can conceivably be implemented on different types of mass spectrometers, it appears that the characteristics of flow-through systems like quadrupole-quadrupole TOFs are currently the best match for the method.

More importantly, we evaluated the quantification reproducibility achievable by SWATH MS coupled to targeted data analysis and its potential for proteome quantification for biology. Comparing SRM- and SWATH MS-derived quantitative values obtained from the same isotope labeled samples (two yeast diauxic shift samples previously analyzed by SRM
(15)), both methods showed highly correlated values (Fig. 4B). Overall, SWATH MS coupled to targeted data extraction allowed consistent quantification of proteins spanning a wide range of concentrations, e.g., 125–10 copies/cell (Fig. 4A, box shapes). Unlike SRM data, SWATH MS data sets are permanent records of the fragment ion spectra of a sample that can be re-examined in silico without the need for further data acquisition. This characteristic, specific to DIA data sets, opens new possibilities to rescue missing quantification information and to improve the accuracy of initial quantification results simply through iterative targeted data reanalysis, as demonstrated here for several metabolic enzymes (supplemental Tables 5–7). It has been discussed that, for SRM measurements, interference of contaminating transitions, incomplete tryptic cleavage, or possible modifications of a peptide or other such artifacts may impede the accuracy of quantification (39, 40). The optimization of fragment ion sets for each targeted peptide by the iterative SWATH MS data analysis offers practical solutions to these important issues. Interfering transitions can be detected and eliminated using outlier detection algorithms, and the data set can be queried for other peptides from the targeted protein or for alternate peptides, e.g., derived by unspecific or partial cleavage or modified peptides covering the same segment of a protein. Once detected, such instances can be eliminated or taken into account to achieve higher quality data (supplemental Table 7).

The possibility of iteratively searching the SWATH MS data sets also supports ad libitum queries for protein sets. Although the diauxic shift samples used in this study were not originally intended for the recovery of mitochondrial membrane proteins (15), we could confidently quantify and assess the fold changes for 36 proteins involved in the oxidative phosphorylation and respiratory networks (Fig. 5). Those proteins were not covered in the initial analysis and would have required new targeted data acquisition of the samples by SRM. With the targeted data analysis strategy, the new protein set can simply be re-extracted in silico from the existing SWATH MS data files, without the need to reinject the sample. This dynamic extension of the search space applied to SWATH MS data sets is expected to be particularly attractive for systems biology studies where new query hypotheses are generated from mathematical models based on prior data analysis. Although it is in principle possible to probe SWATH MS data sets for the whole proteome of an organism at once, it is beyond the scope of this article to describe the exhaustive quantification of all the yeast proteins detectable in those diauxic shift samples by SWATH MS. Indeed, although the data are already analyzable with mProphet or Skyline, none of the currently available SRM analysis tools can so far fully exploit the information potential contained in SWATH MS data sets. For example, no SRM analysis pipeline takes into account the mass accuracy of the fragment ions, nor their isotopic distribution or charge state. Although those parameters are neither relevant nor accessible to quadrupole resolution used in SRM acquisition, they are instrumental to adequately mine SWATH MS data sets. Therefore, a more complete and specific targeted data analysis pipeline is required before attempting exhaustive qualitative and quantitative proteome characterization of SWATH MS data sets.

The concept of SWATH MS acquisition and targeted data analysis should be easily extendable to other classes of biomolecules such as metabolites, lipids, and more that are also frequently studied by LC-MS/MS and for which fragment ion spectral libraries have been developed (41–46). Also, the possibility to re-examine patterns in the SWATH MS data sets enables new opportunities for finding modified residues and search for the presence of previously unexpected analytes (Fig. 6).

In summary, we report a method for qualitative and quantitative proteome probing of a sample in a single LC-MS/MS injection. This is achieved by the combination of a sequential windowed DIA method, generating exhaustive high specificity fragment ion map records, coupled with a postacquisition targeted data analysis strategy. This method permits quantification of (at least) as many compounds as those typically identified by regular shotgun proteomics with the accuracy and reproducibility of SRM across many samples. The method also provides new possibilities for data analysis, allowing quantification refinement and dynamic protein probing by iteratively re-mining the once-and-forever acquired data sets.

Acknowledgments—We acknowledge Christine Carapito (CNRS, Strasbourg, France) for early contributions in evaluating the potentials of SWATH MS. We thank Paola Picotti (Institute of Biochemistry, ETH Zürich) for providing the diauxic shift samples originating from an earlier study (15). We thank Uwe Sauer and Ana Paula Oliveira (Institute of Molecular Systems Biology, ETH Zürich) for suggesting the set of respiratory chain proteins additionally quantified in the diauxic shift samples. We thank Lyle Burton (ABSciex) for active development of the PeakView software.

L. C. G., P. N., and R. A. designed the study. S. T. implemented and developed the acquisition method on the ABSciex 5600 TripleTOF™ instrument and performed the data acquisitions. H. R. computed the theoretical simulations of fragment ion interferences. N. S. performed the comparative measurement of the AQUA dilution series by SRM. L. R. helped implementing the SWATH MS analysis in mProphet. L. C. G. and P. N. performed the SWATH MS data analysis. R. A. and R. B. supervised the study.

* This work was supported by ABSciex; European Union FP7 Prospects Grant 201648; SystemsX.ch, the Swiss initiative for systems biology via the projects YeastX and PhosphonetX; ERC Proteomics v3.0 Grant 233226; and European Union FP7 “Unicellsys” Grant 201142. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

This article contains supplemental material.

** To whom correspondence should be addressed. Tel.: 41-44-633-31-70; Fax: 41-44-633-10-51; E-mail: [email protected].

REFERENCES

1. Aebersold, R., and Mann, M. (2003) Mass spectrometry-based proteomics. Nature 422, 198–207
2. MacCoss, M. J., and Matthews, D. L. (2005) Teaching a new dog old tricks. Anal. Chem. 77, 295A–302A
3. Han, X., Aslanian, A., and Yates, J. R., 3rd (2008) Mass spectrometry for proteomics. Curr. Opin. Chem. Biol. 12, 483–490
4. Walther, T. C., and Mann, M. (2010) Mass spectrometry-based proteomics in cell biology. J. Cell Biol. 190, 491–500
5. Domon, B., and Aebersold, R. (2006) Mass spectrometry and protein analysis. Science 312, 212–217
6. Kapp, E., and Schutz, F. (2007) Overview of tandem mass spectrometry (MS/MS) database search algorithms, in Current Protocols in Protein Science, Chapter 25, pp. 25.2.1–25.2.19, John Wiley & Sons, Inc, Hoboken, New Jersey, USA
7. Nesvizhskii, A. I. (2007) Protein identification by tandem mass spectrometry and sequence database searching. Methods Mol. Biol. 367, 87–119
8. Lange, V., Picotti, P., Domon, B., and Aebersold, R. (2008) Selected reaction monitoring for quantitative proteomics: a tutorial. Mol. Syst. Biol. 4:222, 1–14
9. Reiter, L., Rinner, O., Picotti, P., Hüttenhain, R., Beck, M., Brusniak, M. Y., Hengartner, M. O., and Aebersold, R. (2011) mProphet: Automated data processing and statistical validation for large-scale SRM experiments. Nat. Methods 8, 430–435
10. Domon, B., and Aebersold, R. (2010) Options and considerations when selecting a quantitative proteomics strategy. Nat. Biotechnol. 28, 710–721
11. Liu, H., Sadygov, R. G., and Yates, J. R., 3rd (2004) A model for random sampling and estimation of relative protein abundance in shotgun proteomics. Anal. Chem. 76, 4193–4201
12. Michalski, A., Cox, J., and Mann, M. (2011) More than 100,000 detectable peptide species elute in single shotgun proteomics runs but the majority is inaccessible to data-dependent LC-MS/MS. J. Proteome Res. 10, 1785–1793
13. Addona, T. A., Abbatiello, S. E., Schilling, B., Skates, S. J., Mani, D. R., Bunk, D. M., Spiegelman, C. H., Zimmerman, L. J., Ham, A. J., Keshishian, H., Hall, S. C., Allen, S., Blackman, R. K., Borchers, C. H., Buck, C., Cardasis, H. L., Cusack, M. P., Dodder, N. G., Gibson, B. W., Held, J. M., Hiltke, T., Jackson, A., Johansen, E. B., Kinsinger, C. R., Li, J., Mesri, M., Neubert, T. A., Niles, R. K., Pulsipher, T. C., Ransohoff, D., Rodriguez, H., Rudnick, P. A., Smith, D., Tabb, D. L., Tegeler, T. J., Variyath, A. M., Vega-Montoto, L. J., Wahlander, A., Waldemarson, S., Wang, M., Whiteaker, J. R., Zhao, L., Anderson, N. L., Fisher, S. J., Liebler, D. C., Paulovich, A. G., Regnier, F. E., Tempst, P., and Carr, S. A. (2009) Multi-site assessment of the precision and reproducibility of multiple reaction monitoring-based measurements of proteins in plasma. Nat. Biotechnol. 27, 633–641
14. Cima, I., Schiess, R., Wild, P., Kaelin, M., Schüffler, P., Lange, V., Picotti, P., Ossola, R., Templeton, A., Schubert, O., Fuchs, T., Leippold, T., Wyler, S., Zehetner, J., Jochum, W., Buhmann, J., Cerny, T., Moch, H., Gillessen, S., Aebersold, R., and Krek, W. (2011) Cancer genetics-guided discovery of serum biomarker signatures for diagnosis and prognosis of prostate cancer. Proc. Natl. Acad. Sci. U.S.A. 108, 3342–3347
15. Picotti, P., Bodenmiller, B., Mueller, L. N., Domon, B., and Aebersold, R. (2009) Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics. Cell 138, 795–806
16. Kiyonami, R., Schoen, A., Prakash, A., Peterman, S., Zabrouskov, V., Picotti, P., Aebersold, R., Huhmer, A., and Domon, B. (2011) Increased selectivity, analytical precision, and throughput in targeted proteomics. Mol. Cell. Proteomics 10, M110.002931
17. Purvine, S., Eppel, J. T., Yi, E. C., and Goodlett, D. R. (2003) Shotgun collision-induced dissociation of peptides using a time of flight mass analyzer. Proteomics 3, 847–850
18. Venable, J. D., Dong, M. Q., Wohlschlegel, J., Dillin, A., and Yates, J. R. (2004) Automated approach for quantitative analysis of complex peptide mixtures from tandem mass spectra. Nat. Methods 1, 39–45
19. Plumb, R. S., Johnson, K. A., Rainville, P., Smith, B. W., Wilson, I. D., Castro-Perez, J. M., and Nicholson, J. K. (2006) UPLC/MS(E): A new approach for generating molecular fragment information for biomarker structure elucidation. Rapid Commun. Mass Spectrom. 20, 1989–1994
20. Panchaud, A., Scherl, A., Shaffer, S. A., von Haller, P. D., Kulasekara, H. D., Miller, S. I., and Goodlett, D. R. (2009) Precursor acquisition independent from ion count: How to dive deeper into the proteomics ocean. Anal. Chem. 81, 6481–6488
21. Geiger, T., Cox, J., and Mann, M. (2010) Proteomics on an Orbitrap benchtop mass spectrometer using all ion fragmentation. Mol. Cell. Proteomics 9, 2252–2261
22. Bern, M., Finney, G., Hoopmann, M. R., Merrihew, G., Toth, M. J., and MacCoss, M. J. (2010) Deconvolution of mixture spectra from ion-trap data-independent-acquisition tandem mass spectrometry. Anal. Chem. 82, 833–841
23. Carvalho, P. C., Han, X., Xu, T., Cociorva, D., Carvalho Mda, G., Barbosa, V. C., and Yates, J. R., 3rd (2010) XDIA: Improving on the label-free data-independent analysis. Bioinformatics 26, 847–848
24. Panchaud, A., Jung, S., Shaffer, S. A., Aitchison, J. D., and Goodlett, D. R. (2011) Faster, quantitative, and accurate precursor acquisition independent from ion count. Anal. Chem. 83, 2250–2257
25. Wong, J. W., Schwahn, A. B., and Downard, K. M. (2009) ETISEQ: An algorithm for automated elution time ion sequencing of concurrently fragmented peptides for mass spectrometry-based proteomics. BMC Bioinformatics 10:244, 1–10
26. Geromanos, S. J., Vissers, J. P., Silva, J. C., Dorschel, C. A., Li, G. Z., Gorenstein, M. V., Bateman, R. H., and Langridge, J. I. (2009) The detection, correlation, and comparison of peptide precursor and product ions from data independent LC-MS with data dependant LC-MS/MS. Proteomics 9, 1683–1695
27. Li, G. Z., Vissers, J. P., Silva, J. C., Golick, D., Gorenstein, M. V., and Geromanos, S. J. (2009) Database searching and accounting of multiplexed precursor and product ion spectra from the data independent analysis of simple and complex peptide mixtures. Proteomics 9, 1696–1719
28. Blackburn, K., Mbeunkui, F., Mitra, S. K., Mentzel, T., and Goshe, M. B. (2010) Improving protein and proteome coverage through data-independent multiplexed peptide fragmentation. J. Proteome Res. 9, 3621–3637
29. Sherman, J., McKay, M. J., Ashman, K., and Molloy, M. P. (2009) Unique ion signature mass spectrometry, a deterministic method to assign peptide identity. Mol. Cell. Proteomics 8, 2051–2062
30. de Godoy, L. M., Olsen, J. V., Cox, J., Nielsen, M. L., Hubner, N. C., Fröhlich, F., Walther, T. C., and Mann, M. (2008) Comprehensive mass-spectrometry-based proteome quantification of haploid versus diploid yeast. Nature 455, 1251–1254
31. Picotti, P., Lam, H., Campbell, D., Deutsch, E. W., Mirzaei, H., Ranish, J., Domon, B., and Aebersold, R. (2008) A database of mass spectrometric assays for the yeast proteome. Nat. Methods 5, 913–914
32. Ghaemmaghami, S., Huh, W. K., Bower, K., Howson, R. W., Belle, A., Dephoure, N., O’Shea, E. K., and Weissman, J. S. (2003) Global analysis of protein expression in yeast. Nature 425, 737–741
33. MacLean, B., Tomazela, D. M., Shulman, N., Chambers, M., Finney, G. L., Frewen, B., Kern, R., Tabb, D. L., Liebler, D. C., and MacCoss, M. J. (2010) Skyline: An open source document editor for creating and analyzing targeted proteomics experiments. Bioinformatics 26, 966–968
34. Picotti, P., Rinner, O., Stallmach, R., Dautel, F., Farrah, T., Domon, B., Wenschuh, H., and Aebersold, R. (2010) High-throughput generation of selected reaction-monitoring assays for proteins and proteomes. Nat. Methods 7, 43–46
35. Craig, R., Cortens, J. C., Fenyo, D., and Beavis, R. C. (2006) Using annotated peptide mass spectrum libraries for protein identification. J. Proteome Res. 5, 1843–1849
36. Frewen, B. E., Merrihew, G. E., Wu, C. C., Noble, W. S., and MacCoss, M. J. (2006) Analysis of peptide MS/MS spectra from large-scale proteomics experiments using spectrum libraries. Anal. Chem. 78, 5678–5684
37. Lam, H., Deutsch, E. W., Eddes, J. S., Eng, J. K., King, N., Stein, S. E., and Aebersold, R. (2007) Development and validation of a spectral library searching method for peptide identification from MS/MS. Proteomics 7, 655–667
38. Huang, X., Liu, M., Nold, M. J., Tian, C., Fu, K., Zheng, J., Geromanos, S. J., and Ding, S. J. (2011) Software for quantitative proteomic analysis using stable isotope labeling and data independent acquisition. Anal. Chem. 83, 6971–6979
39. Duncan, M. W., Yergey, A. L., and Patterson, S. D. (2009) Quantifying proteins by mass spectrometry: The selectivity of SRM is only part of the problem. Proteomics 9, 1124–1127
40. Sherman, J., McKay, M. J., Ashman, K., and Molloy, M. P. (2009) How specific is my SRM?: The issue of precursor and product ion redundancy. Proteomics 9, 1120–1123
41. Schmelzer, K., Fahy, E., Subramaniam, S., and Dennis, E. A. (2007) The Lipid Maps Initiative in Lipidomics, pp. 171–183, Elsevier Science Publishers B.V., Amsterdam
42. Blanksby, S. J., and Mitchell, T. W. (2010) Advances in mass spectrometry for lipidomics. Annu. Rev. Anal. Chem. 3, 433–465
43. Smith, C. A., O’Maille, G., Want, E. J., Qin, C., Trauger, S. A., Brandon, T. R., Custodio, D. E., Abagyan, R., and Siuzdak, G. (2005) METLIN: A metabolite mass spectral database. Ther. Drug Monit. 27, 747–751
44. Horai, H., Arita, M., Kanaya, S., Nihei, Y., Ikeda, T., Suwa, K., Ojima, Y., Tanaka, K., Tanaka, S., Aoshima, K., Oda, Y., Kakazu, Y., Kusano, M., Tohge, T., Matsuda, F., Sawada, Y., Hirai, M. Y., Nakanishi, H., Ikeda, K., Akimoto, N., Maoka, T., Takahashi, H., Ara, T., Sakurai, N., Suzuki, H., Shibata, D., Neumann, S., Iida, T., Tanaka, K., Funatsu, K., Matsuura, F., Soga, T., Taguchi, R., Saito, K., and Nishioka, T. (2010) MassBank: A public repository for sharing mass spectral data for life sciences. J. Mass Spectrom. 45, 703–714
45. Dresen, S., Gergov, M., Politi, L., Halter, C., and Weinmann, W. (2009) ESI-MS/MS library of 1,253 compounds for application in forensic and clinical toxicology. Anal. Bioanal. Chem. 395, 2521–2526
46. Dresen, S., Ferreirós, N., Gnann, H., Zimmermann, R., and Weinmann, W. (2010) Detection and identification of 700 drugs by multi-target screening with a 3200 Q TRAP LC-MS/MS system and library searching. Anal. Bioanal. Chem. 396, 2425–2434
47. Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., and Hirakawa, M. (2010) KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Res. 38, D355–D360
48. Andrews, G. L., Simons, B. L., Young, J. B., Hawkridge, A. M., and Muddiman, D. C. (2011) Performance characteristics of a new hybrid quadrupole time-of-flight tandem mass spectrometer (TripleTOF 5600). Anal. Chem. 83, 5442–5446
49. Wepf, A., Glatter, T., Schmidt, A., Aebersold, R., and Gstaiger, M. (2009) Quantitative interaction proteomics using mass spectrometry. Nat. Methods 6, 203–205
50. Krokhin, O. V., Craig, R., Spicer, V., Ens, W., Standing, K. G., Beavis, R. C., and Wilkins, J. A. (2004) An improved model for prediction of retention times of tryptic peptides in ion pair reversed-phase HPLC: Its application to protein peptide mapping by off-line HPLC-MALDI MS. Mol. Cell. Proteomics 3, 908–919
51. Rost, H. L., Malmstrom, L., and Aebersold, R. (2012) A computational tool to detect and avoid redundancy in selected reaction monitoring. Mol. Cell. Proteomics, mcp.M111.013045. First Published on April 24, 2012, doi:10.1074/mcp.M111.013045
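
The discussion notes that interfering transitions can be detected and eliminated using outlier detection algorithms, but it does not say which test is applied. As a rough illustration only, the following Python sketch flags fragment ions whose observed-to-library intensity ratio departs strongly from the median ratio of the peptide's other fragments. The median-absolute-deviation rule, the function name `flag_interfered_transitions`, the fragment labels, and the threshold of 3 are all assumptions made for this example; they are not taken from mProphet, Skyline, or PeakView.

```python
from statistics import median


def flag_interfered_transitions(library, observed, threshold=3.0):
    """Return fragment ions whose observed/library ratio is a robust outlier.

    library, observed: dicts mapping a fragment ion label to an intensity
    (hypothetical input format chosen for this sketch).
    threshold: cutoff in robust z-score units (assumed default, not from the paper).
    """
    # Only fragments present in both the library and the extracted data are compared.
    shared = [ion for ion in library if ion in observed and library[ion] > 0]
    if not shared:
        return []
    ratios = {ion: observed[ion] / library[ion] for ion in shared}
    center = median(ratios.values())
    # Median absolute deviation, scaled so that it approximates a standard
    # deviation for normally distributed ratios.
    mad = 1.4826 * median(abs(r - center) for r in ratios.values())
    if mad == 0:
        # At least half of the ratios are identical; nothing is flagged in this sketch.
        return []
    return [ion for ion in shared if abs(ratios[ion] - center) / mad > threshold]


# Illustrative (made-up) intensities: b3 carries a large contaminating signal.
library = {"y4": 100, "y5": 80, "y6": 60, "y7": 40, "b3": 20}
observed = {"y4": 1000, "y5": 820, "y6": 590, "y7": 410, "b3": 900}
print(flag_interfered_transitions(library, observed))  # ['b3']
```

Fragments flagged this way could then be left out before the remaining fragment ion intensities are summed for quantification, which is one simple reading of the iterative refinement described in the discussion.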

Journal

Molecular & Cellular Proteomics, American Society for Biochemistry and Molecular Biology

Published: Jun 1, 2012
