2009. "Transport Functions Dominate the SAR11 Metaproteome at Low-Nutrient Extremes in the Sargasso Sea." The ISME Journal 3(1):93-105. Abstract The northwestern Sargasso Sea is part of the North Atlantic subtropical oceanic gyre that is characterized as seasonally oligotrophic with pronounced stratification in the summer and autumn. Essentially a marine desert, the biological productivity of this region is reduced during stratified periods as a result of low concentrations of phosphorus and nitrogen in the euphotic zone. To better understand the mechanisms of microbial survival in this oligotrophic environment, we used capillary LC-tandem mass spectrometry to study the composition of microbial proteomes in surface samples collected in September 2005. A total of 2279 peptides that mapped to 236 SAR11 proteins, and 3208 peptides that mapped to 404 Synechococcus proteins, were detected. Mass spectra from SAR11 periplasmic binding proteins accounted for a disproportionately large fraction of the peptides detected, consistent with observations that these extremely small cells devote a large proportion of their volume to periplasm. Abundances were highest for periplasmic substrate-binding proteins for phosphate, amino acids, phosphonate, sugars, and spermidine. Although the data showed that a large fraction of microbial protein synthesis in the Sargasso Sea is devoted to inorganic and organic nutrient acquisition, the proteomes of both SAR11 and Synechococcus also indicated that these populations were actively growing. Our findings support the view that competition for multiple nutrients in oligotrophic systems is extreme but sufficient to sustain microbial community activity.
2009. "Comparative systems biology across an evolutionary gradient within the Shewanella genus." Proceedings of the National Academy of Sciences of the United States of America 106(37):15909-15914. doi:10.1073/pnas.0902000106 Abstract To what extent genotypic differences translate to phenotypic variation remains a poorly understood issue of paramount importance for several cornerstone concepts of microbiology such as the species definition. Here, we take advantage of the completed genomic sequences, expressed proteomic profiles, and physiological studies of ten closely related Shewanella organisms to provide quantitative insights into this issue. Our analyses revealed that, despite the extensive horizontal gene transfer characterizing these genomes, the genotypic and phenotypic similarities among the organisms were generally predictable from their evolutionary relatedness. The power of the predictions depended, however, on the degree of ecological specialization of the organisms evaluated. Using the unprecedented genetic gradient formed by these genomes, we were able to isolate the effect of ecology from the effect of evolutionary divergence and rank the different cellular functions in terms of their rates of evolution. Our ranking also revealed that whole-cell protein expression differences among these organisms when grown under identical conditions were relatively larger than differences at the genome level, suggesting that similarity in gene regulation and expression should constitute another important parameter for (new) species description. Collectively, our results provide important new information towards beginning a system level understanding of bacterial species and genera.
2008. "A Support Vector Machine model for the prediction of proteotypic peptides for accurate mass and time proteomics." Bioinformatics 24(13):1503-1509. doi:10.1093/bioinformatics/btn218 Abstract Motivation: The standard approach to identifying peptides based on accurate mass and elution time (AMT) compares these profiles obtained from a high resolution mass spectrometer to a database of peptides previously identified from tandem mass spectrometry (MS/MS) studies. It would be advantageous, with respect to both accuracy and cost, to only search for those peptides that are detectable by MS (proteotypic). Results: We present a Support Vector Machine (SVM) model that uses a simple descriptor space based on 35 properties of amino acid content, charge, hydrophilicity, and polarity for the quantitative prediction of proteotypic peptides. Using three independently derived AMT databases (Shewanella oneidensis, Salmonella typhimurium, Yersinia pestis) for training and validation within and across species, the SVM resulted in an average accuracy measure of ~0.8 with a standard deviation of less than 0.025. Furthermore, we demonstrate that these results are achievable with a small set of 12 variables and can achieve high proteome coverage. Availability: http://omics.pnl.gov/software/STEPP.php
2008. "Proteomic Analysis of Stationary Phase in the Marine Bacterium 'Candidatus Pelagibacter ubique'." Applied and Environmental Microbiology 74(13):4091-4100. doi:10.1128/AEM.00599-08 Abstract Candidatus Pelagibacter ubique, an abundant marine alphaproteobacterium, subsists in nature at low ambient nutrient concentrations and may often be exposed to nutrient limitation, but its genome revealed no evidence of global regulatory adaptations to stationary phase. We used high-resolution capillary liquid chromatography (LC) coupled online to an LTQ mass spectrometer to build an Accurate Mass and Time (AMT) tag library, and employed the AMT tag approach to quantitatively examine proteome differences between exponentially growing and stationary phase Cand. P. ubique cells cultivated in a seawater medium. The AMT tag library represented 72% of the predicted protein coding genes. Stationary phase protein abundance increased for OsmC, which mitigates oxidative damage, and for molecular chaperones, enzymes involved in methionine and cysteine biosynthesis, proteins involved in rho-dependent transcription termination, and the signal transduction enzymes CheY-FisH and ChvG. Our findings indicate that Cand. P. ubique responds adaptively to stationary phase by increasing the abundance of a suite of proteins that contribute to homeostasis, but does not undergo major proteome remodeling. We speculate that this limited response may enable Cand. P. ubique to cope with ambient conditions in which nutrients are often insufficient for short periods, and the ability to resume growth overrides the capacity for long term survival afforded by more comprehensive global stationary phase responses.
2008. "Identification of Mobile Elements and Pseudogenes in the Shewanella oneidensis MR-1 Genome." Applied and Environmental Microbiology 74(10):3257-3265. Abstract Shewanella oneidensis MR-1 is the first of 22 different Shewanella spp. whose genomes have been or are being sequenced and thus serves as the model organism for studying the functional repertoire of the Shewanella genus. The original MR-1 genome annotation revealed a large number of transposase genes and pseudogenes, indicating that many of the genome’s functions may be decaying. Comparative analyses of the sequenced Shewanella strains suggest that 209 genes in MR-1 have in-frame stop codons, frameshifts, or interruptions and/or are truncated and that 65 of the original pseudogene predictions were erroneous. Among the decaying functions are that of one of three chemotaxis clusters, type I pilus production, starch utilization, and nitrite respiration. Many of the mutations could be attributed to members of 41 different types of insertion sequence (IS) elements and three types of miniature inverted-repeat transposable elements identified here for the first time. The high copy numbers of individual mobile elements (up to 71) are expected to promote large-scale genome recombination events, as evidenced by the displacement of the algA promoter. The ability of MR-1 to acquire foreign genes via reactions catalyzed by both the integron integrase and the ISSod25-encoded integrases is suggested by the presence of attC sites and genes whose sequences are characteristic of other species downstream of each site. This large number of mobile elements and multiple potential sites for integrase-mediated acquisition of foreign DNA indicate that the MR-1 genome is exceptionally dynamic, with many functions and regulatory control points in the process of decay or reinvention.
2008. "Comparative proteogenomics: combining mass spectrometry and comparative genomics to analyze multiple genomes." Genome Research 18(7):1133-1142. Abstract While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages and cleavage of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.
2008. "Role of the global transcriptional regulator PrrA in Rhodobacter sphaeroides 2.4.1: combined transcriptome and proteome analysis." Journal of Bacteriology 190(14):4831-4848. doi:10.1128/JB.00301-08 Abstract The PrrBA two-component regulatory system is a major global regulator in Rhodobacter sphaeroides 2.4.1. In this study we have compared the transcriptome and proteome profiles of the wild type (WT) and mutant PrrA2 cells grown anaerobically, in the dark, with DMSO as electron acceptor. Approximately 25% of the genes present in the genome are PrrA-regulated, at the transcriptional level, either directly or indirectly, by ≥ 2-fold relative to wild type. The genes affected are widespread throughout all COG functional categories, with previously unsuspected “metabolic” genes affected when in the PrrA mutant background. PrrA was found to act both as an activator and a repressor of transcription, with more genes being repressed in the presence of PrrA (9:5 ratio). An analysis of the genes encoding the 1,536 peptides detected through our chromatographic study, which corresponds to 36% coverage of the genome, revealed that approximately 20% of the genes encoding these proteins were positively regulated, whereas approximately 32% were negatively regulated by PrrA, which is in excellent agreement with the percentages obtained for the whole genomic transcriptome profile. In addition, comparison of the transcriptome and proteome mean parameter values chosen between WT and PrrA2 showed good qualitative agreement, indicating that transcript regulation paralleled the corresponding protein abundance, although not one for one. The microarray analysis was validated by direct mRNA measurement of randomly selected, both positively and negatively regulated genes. lacZ transcriptional and kan translational fusions enabled us to map putative PrrA binding sites, as well as revealing potential gene targets for indirect regulation by PrrA.
2008. "The Influence of Cultivation Methods on Shewanella oneidensis Physiology and Proteome Expression." Archives of Microbiology 189(4):313-324. doi:10.1007/s00203-007-0321-y Abstract High-throughput analyses that are central to microbial systems biology and ecophysiology research benefit from highly homogeneous and physiologically well-defined cell cultures. While attention has focused on the technical variation associated with high-throughput technologies, biological variation introduced as a function of cell cultivation methods has been overlooked. This study evaluated the impact of cultivation methods, controlled batch or continuous culture in bioreactors versus shake flasks, on the reproducibility of global proteome measurements in Shewanella oneidensis MR-1. Variability in dissolved oxygen concentration and consumption rate, metabolite profiles, and proteome was greater in shake flask than controlled batch or chemostat cultures. Proteins indicative of suboxic and anaerobic growth (e.g., fumarate reductase and decaheme c-type cytochromes) were more abundant in cells from shake flasks compared to bioreactor cultures, a finding consistent with data demonstrating that “aerobic” flask cultures were O2 deficient due to poor mass transfer kinetics. The work described herein establishes the necessity of controlled cultivation for ensuring highly reproducible and homogenous microbial cultures. By decreasing cell to cell metabolic variability, higher quality samples will allow for the interpretive accuracy necessary for drawing conclusions relevant to microbial systems biology research.
2008. "A Computational Strategy to Analyze Label-Free Temporal Bottom-up Proteomics Data." Journal of Proteome Research 7(7):2595-2604. doi:10.1021/pr0704837 Abstract Motivation: Biological systems are in a continual state of flux, which necessitates an understanding of the dynamic nature of protein abundances. The study of protein abundance dynamics has become feasible with recent improvements in mass spectrometry-based quantitative proteomics. However, a number of challenges still remain related to how best to extract biological information from dynamic proteomics data; for example, challenges related to extraneous variability, missing abundance values, and the identification of significant temporal patterns. Results: This article describes a strategy that addresses the aforementioned issues for the analysis of temporal bottom-up proteomics data. The core strategy for the data analysis algorithms and subsequent data interpretation was formulated to take advantage of the temporal properties of the data. The analysis procedure presented herein was applied to data from a Rhodobacter sphaeroides 2.4.1 time-course study. The results were in close agreement with existing knowledge about R. sphaeroides, therefore demonstrating the utility of this analytical strategy.
2008. "Proteome of Geobacter sulfurreducens grown with Fe(III) oxide or Fe(III) citrate as the electron acceptor." Biochimica et Biophysica Acta--Proteins and Proteomics 1784(12):1935-1941. doi:10.1016/j.bbapap.2008.06.011 Abstract Fe(III) oxides are the most abundant source of reducible Fe(III) by microorganisms in most soils and sediments, yet few studies on the physiology of Fe(III)-reducing microorganisms during growth on Fe(III) oxide have been conducted because of the technical difficulties in working with cell growth and harvest in the presence of Fe(III) oxides. Geobacter sulfurreducens is a representative of the Geobacter species that predominate in a variety of subsurface environments in which Fe(III) oxide is important. In order to better understand the physiology of Geobacter species during growth on Fe(III) oxide, the proteome of G. sulfurreducens grown on Fe(III) oxide was compared with the proteome of cells grown with soluble Fe(III) citrate. Two-dimensional polyacrylamide gel electrophoresis (2-D PAGE) revealed 19 proteins that were more abundant during growth on Fe(III) oxide than on soluble Fe(III). These included proteins related to protein synthesis, electron transfer and energy production, oxidative stress, protein folding, outer membrane proteins, nitrogen metabolism and hypothetical proteins. Further analysis of the proteome with the accurate mass and time (AMT) tag method revealed additional proteins associated with growth on Fe(III) oxide. These included the outer-membrane c-type cytochromes OmcS and OmcG, which genetic studies have suggested are required for Fe(III) oxide reduction. Furthermore, several other cytochromes, as yet unstudied, were detected to be significantly upregulated during growth on Fe(III) oxide and other proteins of unknown function were more abundant during growth on Fe(III) oxide than on soluble Fe(III). PilA, the structural protein for pili, which is required for Fe(III) oxide reduction, and other pilin-associated proteins were also more abundant during growth on Fe(III) oxide. Confirmation of the differential expression of proteins known to be important in Fe(III) oxide reduction was observed, and an additional number of previously unidentified proteins were found with significant abundance in the cells grown under conditions of Fe(III) oxide reduction.
2008. "Comparative Bacterial Proteomics: Analysis of the Core Genome Concept." PLoS One 3(2):e1542. Abstract Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.
2008. "Proteogenomics: the needs and roles to be filled by proteomics in genome annotation." Briefings in Functional Genomics and Proteomics 7(1):50-62. Abstract While genome sequencing efforts reveal the basic building blocks of life, a genome sequence alone is insufficient for elucidating biological function. Genome annotation – the process of identifying genes and assigning function to each gene in a genome sequence – provides the means to elucidate biological function from sequence. Current state-of-the-art high throughput genome annotation uses a combination of comparative (sequence similarity data) and non-comparative (ab initio gene prediction algorithms) methods to identify open reading frames in genome sequences. Because approaches used to validate the presence of these open reading frames are typically based on the information derived from the annotated genomes, they cannot independently and unequivocally determine whether a predicted open reading frame is translated into a protein. With the ability to directly measure peptides arising from expressed proteins, high throughput liquid chromatography-tandem mass spectrometry-based proteomics approaches can be used to verify coding regions of a genomic sequence. Here, we highlight several ways in which high throughput tandem mass spectrometry-based proteomics can improve the quality of genome annotations and suggest that it could be efficiently applied during the initial gene calling process so that the improvements are propagated through the subsequent functional annotation process.
2007. "Proteomic characterization of the Rhodobacter sphaeroides 2.4.1 photosynthetic membrane: Identification of New Proteins." Journal of Bacteriology 189(20):7464-7474. doi:10.1128/JB.00946-07 Abstract The intracytoplasmic membrane (ICM) system develops, upon induction, as a structure dedicated to the major events of bacterial photosynthesis, including harvesting light energy, primary charge separation, and electron transport. In this study, multi-chromatographic methods coupled with Fourier transform ion cyclotron resonance (FTICR) mass spectrometry, combined with subcellular fractionation, were applied to an investigation of the supramolecular composition of the native photosynthetic membrane of Rhodobacter sphaeroides 2.4.1. A complete proteomic profile of the intracytoplasmic membranes was obtained and the results showed that the intracytoplasmic membranes are mainly composed of four photosynthetic membrane protein complexes, including light harvesting complexes I and II, the reaction center and cytochrome bc1, as well as two new membrane protein components, an unknown protein (RSP1760) and a possible alkane hydroxylase. Proteins necessary for various cellular functions, such as ATP synthesis, respiratory components, ABC transporters, protein translocation, and other proteins with unknown functions were also identified in association with the intracytoplasmic membranes. This study opens a new perspective on the characterization and understanding of the photosynthetic supramolecular complexes of R. sphaeroides, and their internal interactions as well as interactions with other proteins inside or outside the intracytoplasmic membranes.
2007. "Applying a Targeted Label-free Approach using LC-MS AMT Tags to Evaluate Changes in Protein Phosphorylation Following Phosphatase Inhibition." Journal of Proteome Research 6(11):4489-4497. doi:10.1021/pr070068e Abstract To identify phosphoproteins regulated by the phosphoprotein phosphatase (PPP) family of S/T phosphatases, we performed a large-scale characterization of changes in protein phosphorylation on extracts from HeLa cells treated with or without calyculin A, a potent PPP enzyme inhibitor. A label-free comparative phosphoproteomics approach using immobilized metal ion affinity chromatography and targeted tandem mass spectrometry was employed to discover and identify signatures based upon distinctive changes in abundance. Overall, 232 proteins were identified as either direct or indirect targets for PPP enzyme regulation. Most of the present identifications represent novel PPP enzyme targets at the level of both phosphorylation site and protein. These include phosphorylation sites within signaling proteins such as p120 Catenin, A Kinase Anchoring Protein 8, JunB, and Type II Phosphatidyl Inositol 4 Kinase. These data can be used to define underlying signaling pathways and events regulated by the PPP family of S/T phosphatases.
2007. "Protein Composition of the Vaccinia Virus Mature Virion." Virology 358(1):233-247. Abstract The protein content of vaccinia virus mature virions, purified by rate zonal and isopycnic centrifugation and solubilized by SDS or a solution of urea and thiourea, was determined by the accurate mass and time tag technology which uses both tandem mass spectrometry and Fourier transform-ion cyclotron resonance mass spectrometry to detect tryptic peptides separated by high-resolution liquid chromatography. Eighty vaccinia virus-encoded proteins representing 37% of the 218 genes annotated in the complete genome sequence were detected in at least three analyses. Ten proteins accounted for approximately 80% of the mass, while the least abundant proteins made up 1% or less of the mass. Thirteen identified proteins were not previously reported as components of virions. On the other hand, 8 previously described virion proteins were not detected here, presumably due to technical reasons including small size and hydrophobicity. In addition to vaccinia virus-encoded proteins, 24 host proteins omitting isoforms were detected. The most abundant of these were cytoskeletal proteins, heat shock proteins, and proteins involved in translation.
2007. "Whole proteome analysis of post-translational modifications: applications of mass-spectrometry for proteogenomic annotation." Genome Research 17(9):1362-1377. doi:10.1101/gr.6427907 Abstract While bacterial genome annotations have significantly improved in recent years, techniques for bacterial proteome annotation (including post-translational chemical modifications, signal peptides, proteolytic events, etc.) are still in their infancy. At the same time, the number of sequenced bacterial genomes is rising sharply, far outpacing our ability to validate the predicted genes, let alone annotate bacterial proteomes. In this study, we use tandem mass spectrometry (MS/MS) to annotate the proteome of Shewanella oneidensis MR-1, an important microbe for bioremediation. In particular, we provide the first comprehensive map of post-translational modifications in a bacterial genome, including a large number of chemical modifications, signal peptide cleavages and cleavage of N-terminal methionine residues. We also detect multiple genes that were missed or assigned incorrect start positions by gene prediction programs and suggest corrections to improve the gene annotation. This study demonstrates that complementing every genome sequencing project by an MS/MS project would significantly improve both genome and proteome annotations for a reasonable cost.
2007. "Enrichment of Functional Redox Reactive Proteins and Identification by Mass Spectrometry Results in Several Terminal Fe(III)-reducing Candidate Proteins in Shewanella oneidensis MR-1." Journal of Microbiological Methods 68(2):367-375. Abstract Identification of the proteins directly involved in microbial metal-reduction is important to understanding the biochemistry involved in heavy metal reduction/immobilization and the ultimate cleanup of DOE contaminated sites. Although previous strategies for the identification of these proteins have traditionally required laborious protein purification/characterization of metal-reducing capability, activity is often lost before the final purification step, thus creating a significant knowledge gap. In the current study, subcellular fractions of S. oneidensis MR-1 were enriched for Fe(III)-NTA reducing proteins in a single step using several orthogonal column matrices. The protein content of eluted fractions that demonstrated activity was determined by ultra high pressure liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). A comparison of the proteins identified from active fractions in all separations produced 30 proteins that may act as the terminal electron-accepting protein for Fe(III)-reduction. These include MtrA, MtrB, MtrC and OmcA as well as a number of other proteins not previously associated with Fe(III)-reduction. This is the first report of such an approach where the laborious procedures for protein purification are not required for identification of metal-reducing proteins. Such work provides the basis for a similar approach with other cultured organisms as well as analysis of sediment and groundwater samples from biostimulation efforts at contaminated sites.
2006. "AMT Tag Approach to Proteomic Characterization of Deinococcus Radiodurans and Shewanella Oneidensis." In Microbial Proteomics, Methods of Biochemical Analysis, vol. 49, ed. I. Humphery-Smith and M. Hecker, pp. 113-134. John Wiley & Sons, Inc., Hoboken, NJ. Abstract Biology is transitioning from a largely qualitative, mostly descriptive science to a quantitative and ultimately predictive science. Advances in high throughput DNA sequencing have made increasing numbers of genome sequences available and enabled a “systems” level analysis of complex biological organisms. The ability to quantitatively measure the array of proteins, also termed the proteome, in prokaryotic cells and communities of cells is key to understanding microbial systems. This chapter focuses on the utility of the AMT tag mass spectrometric approach used to characterize the proteomes of two microbes, Deinococcus radiodurans and Shewanella oneidensis MR-1.
2006. "Phosphoproteome Profiling of Human Skin Fibroblast Cells in Response to Low- and High-Dose Irradiation." Journal of Proteome Research 5(5):1252-1260. Abstract The biological effect of low-dose radiation is currently not well understood. A hallmark of the response to radiation is the phosphorylation of proteins involved in DNA repair, DNA damage signaling, and cell cycle checkpoint control, which is important in prompt cellular response. The objective of the work presented here was to explore the phosphoproteome of normal human skin fibroblast (HSF) cells to reveal differences between low- and high-dose irradiation responses at the protein phosphorylation level. Several techniques were combined for analysis of the HSF cell phosphoproteome following low- and high-doses of irradiation: Trizol extract of proteins, methylation of the enzyme digest (peptides), enrichment of phosphopeptides with immobilized metal affinity chromatography (IMAC), nanoflow reversed-phase HPLC (nano-LC)/electrospray ionization, and tandem mass spectrometry. More than 95% of the peptides identified after IMAC enrichment were phosphopeptides. Among the 493 unique phosphopeptides, 232 were singly phosphorylated, 220 were doubly phosphorylated, and 41 were triply phosphorylated, indicating the overall effectiveness of the IMAC technique to enrich both singly and multiply phosphorylated peptides. Over 700 phosphorylation sites were assigned to a total of 346 proteins, many of which are known or proposed to be highly relevant to a plethora of fundamental biological processes. The profile for proteins identified from the low-dose (2 cGy) irradiated HSF cells was shown to be different from the profile obtained for proteins irradiated at the high-dose (4 Gy). This type of fundamental information regarding radiation-response to cellular events at the molecular level provides a mechanistic basis for identifying relevant molecular markers that can be used in future to better evaluate human health risks at low doses of irradiation and to develop low dose radiation counter measurements.
2006. "Using size exclusion chromatography-RPLC and RPLC-CIEF as two-dimensional separation strategies for protein profiling." Electrophoresis 27(13):2722-2733. doi:10.1002/elps.200600037 Abstract Bottom-up proteomics (analyzing peptides that result from protein digestion) has demonstrated capability for broad proteome coverage and good throughput. However, due to incomplete sequence coverage, this approach is not ideally suited to the study of modified proteins. The modification complement of a protein can best be elucidated by analyzing the intact protein. Two-dimensional gel electrophoresis, typically coupled with the analysis of peptides that result from in-gel digestion, is the most frequently applied protein separation technique in MS-based proteomics. As an alternative, numerous column-based liquid phase techniques, which are generally more amenable to automation, are being investigated. In this work, the combination of size exclusion chromatography (SEC) fractionation with reversed-phase liquid chromatography (RPLC)-Fourier-transform ion cyclotron resonance (FTICR)-mass spectrometry (MS) is compared with the combination of RPLC fractionation with capillary isoelectric focusing (CIEF)-FTICR-MS for the analysis of the Shewanella oneidensis proteome. SEC-RPLC-FTICR-MS allowed the detection of 297 proteins, as opposed to 166 using RPLC-CIEF-FTICR-MS, indicating that approaches based on LC-MS provide better coverage. However, there were significant differences in the sets of proteins detected and both approaches provide a basis for accurately quantifying changes in protein and modified protein abundances.
2006. "Improved peptide elution time prediction for reversed-phase liquid chromatography-MS by incorporating peptide sequence information." Analytical Chemistry 78(14):5026-5039. doi:10.1021/ac060143p Abstract We describe an improved artificial neural network (ANN)-based method for predicting peptide retention times in reversed phase liquid chromatography. In addition to the peptide amino acid composition, this study investigated several other peptide descriptors to improve the predictive capability, such as peptide length, sequence, hydrophobicity and hydrophobic moment, and nearest neighbor amino acid, as well as peptide predicted structural configurations (i.e., helix, sheet, coil). An ANN architecture that consisted of 1052 input nodes, 24 hidden nodes, and 1 output node was used to fully consider the amino acid residue sequence in each peptide. The network was trained using ~345,000 non-redundant peptides identified from a total of 12,059 LC-MS/MS analyses of more than 20 different organisms, and the predictive capability of the model was tested using 1303 confidently identified peptides that were not included in the training set. The model demonstrated an average elution time precision of ~1.5% and was able to distinguish among isomeric peptides based upon the inclusion of peptide sequence information. The prediction power represents a significant improvement over our earlier report (Petritis et al., Anal. Chem. 2003, 75, 1039-1048) and other previously reported models.
2006. "Proteomic approaches to bacterial differentiation ." Journal of Microbiological Methods 67(3):473-486. doi:10.1016/j.mimet.2006.04.024 Abstract While genomic approaches have been applied to the detection and identification of individual bacteria within microbial communities, analogous proteomics approaches have been effectively precluded due to the inherent complexity. An in silico assessment of peptides derived from artificial simple and complex communities was performed to evaluate the effect of proteome complexity on species detection. Detection and validation of predicted peptides initially identified as distinctive within the simple community was experimentally performed using a mass spectrometry-based proteomics approach. An assessment of peptide distinctiveness and the potential for mapping to a particular bacterium within a community was made throughout each step of the study. A second assessment performed in silico of peptide distinctiveness for a complex community of 25 microorganisms was also conducted. The experimental data for a simple community, and the in silico data for a complex community revealed that it is feasible to predict, observe, and quantify distinctive peptides from one organism in the presence of at least a 100-fold greater abundance of another, thus yielding putative markers for the identification of a bacterium of interest. This work represents a first step towards quantitative proteomic characterization of complex microbial communities.
2006. "Proteomic approaches to bacterial differentiation." Journal of Microbiological Methods 67(3):473-486. Abstract While genomic approaches have been applied for the detection and identification of individual bacteria within microbial communities, analogous proteomics approaches have been effectively precluded due to their inherent complexity. An in silico assessment of peptides that could potentially be present in the proteomes of artificial simple and complex communities was performed to evaluate the effect of proteome complexity on species detection. A mass spectrometry-based proteomics approach was employed to experimentally detect and validate the predicted tryptic peptides initially identified as distinctive within the simple community. An assessment of peptide distinctiveness and the potential for mapping to a particular bacterium within a community was made throughout each step of the study. A second in silico assessment of peptide distinctiveness for a complex community of 25 microorganisms was conducted to investigate the levels of instrumental performance that would be required to experimentally detect these peptides, as well as how performance varied with complexity (e.g., the number of different microorganisms). The experimental data for a simple community showed that it is feasible to predict, observe, and to quantify distinctive peptides from one organism in the presence of at least a 100-fold greater abundance of another, thus yielding putative markers for identifying a bacterium of interest. This work represents a first step towards quantitative proteomic characterization of complex microbial communities and the possible development of community wide markers of perturbations to such communities.
2006. "Biomarker Candidate Identification in Yersinia Pestis Using Organism-Wide Semiquantitative Proteomics ." Journal of Proteome Research 5(11):3008-3017. Abstract Yersinia pestis, the causative agent of plague, is listed by the CDC as a level A select pathogen. To better enable detection, intervention and treatment of Y. pestis infections, it is necessary to understand its protein expression under conditions that promote or inhibit virulence. To this end, we have utilized a novel combination of the accurate mass and time tag methodology of mass spectrometry and clustering analysis using OmniViz™ to compare the protein abundance changes of 992 identified proteins under four growth conditions. Temperature and Ca2+ concentration were used to trigger virulence associated protein expression fundamental to the low calcium response. High-resolution liquid chromatography and electrospray ionization mass spectrometry were utilized to determine protein identity and abundance on the genome-wide level. The cluster analyses revealed, in a rapid visual platform, the reproducibility of the current method as well as relevant protein abundance changes of expected and novel proteins relating to a specific growth condition and sub-cellular location. Using this method, 89 proteins were identified as having a similar abundance change profile to 29 known virulence associated proteins, providing additional biomarker candidates for future detection and vaccine development strategies.
2006. "Differential Label-free Quantitative Proteomic Analysis of Shewanella oneidensis Cultured under Aerobic and Suboxic Conditions by Accurate Mass and Time Tag Approach." Molecular & Cellular Proteomics. MCP 5(4):714-725. doi:10.1074/mcp.M500301-MCP200 Abstract We describe the application of liquid chromatography coupled to mass spectrometry (LC/MS) without the use of stable isotope labeling for differential quantitative proteomics analysis of whole cell lysates of Shewanella oneidensis MR-1 cultured under aerobic and sub-oxic conditions. Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) was used to initially identify peptide sequences, and LC coupled to Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR) was used to confirm these identifications, as well as measure relative peptide abundances. A total of 2343 peptides, covering 668 proteins, were identified with high confidence and quantified. Among these proteins, a subset of 56 changed significantly using statistical approaches such as SAM, while another subset of 56 that were annotated as performing housekeeping functions remained essentially unchanged in relative abundance. Numerous proteins involved in anaerobic energy metabolism exhibited up to a 10-fold increase in relative abundance when S. oneidensis is transitioned from aerobic to sub-oxic conditions.
2006. "Confirmation of the Expression of a Large Set of Conserved Hypothetical Proteins in Shewanella oneidensis MR-1." Journal of Microbiological Methods 66(2):223-233. Abstract High-throughput “omic” technologies have allowed for a relatively rapid, yet comprehensive analysis of the global expression patterns within an organism in response to perturbations. In the current study, tryptic peptides were identified with high confidence from capillary liquid chromatography-mass spectrometry analysis of 26 chemostat cultures of Shewanella oneidensis MR-1 under various conditions. Using at least one distinctive peptide and a total of two peptide identifications per protein, we detected the expression of 758 conserved hypothetical proteins. This included 359 such proteins previously described (Kolker et al., 2005) with an additional 399 reported herein for the first time. The latter 399 proteins ranged from 5.3 to 208.3 kDa, with 44 being of 100 amino acid residues or less. Using a combination of information including peptide detection in cells grown under specific culture conditions and predictive algorithms such as PSORT and PSORT-B, possible/plausible functions are proposed for some conserved hypothetical proteins. Such proteins were found not only to be expressed, but 19 were only expressed under certain culturing conditions, thereby providing insight into potential functions. These findings also impact the genomic annotation for S. oneidensis MR-1 by confirming that these genes code for expressed proteins. Our results indicate that 399 proteins can now be upgraded from “conserved hypothetical protein” to “expressed protein in Shewanella,” 19 of which appeared to be expressed under specific culture conditions.
2006. "The Proteome of Dissimilatory Metal-reducing Microorganism Geobacter Sulfurreducens under Various Growth Conditions." Biochimica et Biophysica Acta--Proteins and Proteomics 1764(7):1198-1206. doi:10.1016/j.bbapap.2006.04.017 Abstract The global protein analysis of Geobacter sulfurreducens, a model for the Geobacter species that predominate in many Fe(III)-reducing subsurface environments, was characterized with ultra high pressure liquid chromatography and mass spectrometry using accurate mass and time (AMT) tags as well as with more traditional two-dimensional polyacrylamide gel electrophoresis (2-D PAGE). Cells were grown under eight different growth conditions in order to enhance the potential that genes would be expressed. Over 3,187 gene products, representing about 92% of the total predicted gene products in the genome, were detected. The AMT approach was able to identify a much higher number of proteins than could be detected with the 2-D PAGE approach. A high proportion of predicted proteins in most protein role categories were detected with the highest number of proteins identified in the hypothetical protein role category. Furthermore, 91 c-type cytochromes of 111 predicted genes in the G. sulfurreducens genome were identified. Localization studies indicated that computational predictions of cytochrome location were limited. Differences in the abundance of cytochromes and other proteins under different growth conditions provided information for future functional analysis of these proteins. These results demonstrate that a high percentage of the predicted proteins in the G. sulfurreducens genome are produced and that the AMT approach provides a rapid method for comparing differential expression of proteins under different growth conditions in this organism.
2006. "Normalization Approaches for Removing Systematic Biases Associated with Mass Spectrometry and Label-Free Proteomics." Journal of Proteome Research 5(2):277-286. doi:10.1021/pr050300l Abstract Central tendency, linear regression, locally weighted regression, and quantile techniques were investigated for normalization of peptide abundance measurements obtained from high-throughput liquid chromatography-Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR MS). Arbitrary abundances of peptides were obtained from three sample sets, including a standard protein sample, two Deinococcus radiodurans samples taken from different growth phases, and two mouse striatum samples from control and methamphetamine-stressed mice (strain C57BL/6). The selected normalization techniques were evaluated in both the absence and presence of biological variability by estimating extraneous variability prior to and following normalization. Prior to normalization, replicate runs from each sample set were observed to be statistically different, while following normalization replicate runs were no longer statistically different. Although all techniques reduced systematic bias, assigned ranks among the techniques revealed significant trends. For most LC-FTICR MS analyses, linear regression normalization ranked either first or second among the four techniques, suggesting that this technique was more generally suitable for reducing systematic biases.
2006. "Comparison of aerobic and photosynthetic Rhodobacter sphaeroides 2.4.1 proteomes." Journal of Microbiological Methods 67(3):424-436. doi:10.1016/j.mimet.2006.04.021 Abstract Proteomes from aerobic and photosynthetic grown Rhodobacter sphaeroides 2.4.1 cell cultures were characterized using liquid chromatography-mass spectrometry in conjunction with an accurate mass and elution time (AMT) tag approach. Roughly 8000 high quality peptides were detected that represented 1,445 gene products and 34% of the predicted proteins. The identified proteins corresponded primarily to open reading frames (ORFs) contained within the two chromosomal elements of this bacterium, but a significant number were also observed from ORFs associated with 5 naturally occurring plasmids. Data mining of peptides revealed a number of proteins uniquely detected within the photosynthetic cell culture. Proteins observed in both aerobic respiratory and photosynthetic grown cultures were analyzed semi-quantitatively by comparing their estimated abundances to provide insights into bioenergetic models for aerobic respiration and photosynthesis. Additional emphasis was placed on gene products annotated as hypothetical to gain information as to their potential roles within these two growth conditions. Where possible, transcriptome data for R. sphaeroides obtained under the same culture conditions were compared with these results. This comparative study demonstrated the applicability of the AMT tag approach for high-throughput proteomic analyses of pathways associated with the photosynthetic lifestyle.
2006. "Application of the Accurate Mass and Time Tag Approach to the Proteome Analysis of Sub-cellular Fractions Obtained from Rhodobacter sphaeroides 2.4.1 Aerobic and Photosynthetic Cell Cultures." Journal of Proteome Research 5(8):1940-1947. Abstract The high-throughput accurate mass and time tag (AMT) proteomic approach was utilized to characterize the proteomes for cytoplasm, cytoplasmic membrane, periplasm, and outer membrane fractions from aerobic and photosynthetic cultures of the Gram-negative bacterium Rhodobacter sphaeroides 2.4.1. In addition, we analyzed the proteins within purified chromatophore fractions that house the photosynthetic apparatus from photosynthetically grown cells. In total, 8300 peptides were identified with high confidence from at least one sub-cellular fraction from either cell culture. These peptides were derived from 1514 genes or 35% of proteins predicted to be encoded by the genome. A significant number of these proteins were detected within a single sub-cellular fraction and their localization was compared to in-silico predictions. However, the majority of proteins were observed in multiple sub-cellular fractions, and the most likely sub-cellular localization for these proteins was investigated using a Z-score analysis of peptide abundance along with clustering techniques. Good (81%) agreement was observed between the experimental results and in-silico predictions. The AMT tag approach provides localization evidence for those proteins that have no predicted localization information, those annotated as putative proteins, and/or for those proteins annotated as hypothetical and conserved hypothetical.
2005. "Characterization of purified c-type heme-containing peptides and identification of c-type heme-attachment sites in Shewanella oneidenis cytochromes using mass spectrometry." Journal of Proteome Research 4(3):846-854. doi:10.1021/pr0497475 Abstract We describe methods for mass spectrometric identification of heme-containing peptides from digests of c-type cytochromes that contain the CXXCH (X = any amino acid) sequence motif. Analysis of purified standard heme-containing peptides showed that the charged heme group was present both before and after peptide fragmentation in the gas phase. The heme fragment ion yielded the most abundant MS/MS peak for standard heme-containing peptides with one amino acid difference (DAA=1) for both 2+ and 3+ peptide charge states, and the extent of heme loss during peptide fragmentation was affected by both sequence and charge. A modified search strategy was evaluated with tryptic digests of one known and two unknown cytochromes from Shewanella oneidensis, demonstrating that this approach can be generally applied for identification of c-type heme-containing peptides from complex samples.
2005. "Global whole-cell FTICR mass spectrometric proteomics analysis of the heat shock response in the radioresistant bacterium Deinococcus radiodurans." Journal of Proteome Research 4(3):709-718. Abstract Despite intense interest in the response to radiation in D. radiodurans, little is known about how the organism responds to other stress factors. Our previous studies indicated that D. radiodurans mounts a regulated protective response to heat shock, and that expression of the groESL and dnaKJ operons is induced in response to elevated temperature. In order to gain greater insight into the heat shock response of D. radiodurans on a more global scale, we undertook the study reported here. Using whole-cell semiquantitative mass spectrometric proteomics integrated with global transcriptome microarray analyses, we have determined a core set of highly induced heat shock genes whose expression correlates well at the transcriptional and translational levels. In addition, we observed that the higher the absolute expression of a given gene at physiological conditions, the better the quantitative correlation between RNA and protein expression levels.
2005. "Evaluation of two-dimensional electrophoresis and liquid chromatography – tandem mass spectrometry for tissue-specific protein profiling of laser-microdissected plant samples." Electrophoresis 26(14):2729-2738. Abstract Laser microdissection (LM) allows the collection of homogeneous tissue- and cell-specific plant samples. The employment of this technique with subsequent protein analysis has thus far not been reported for plant tissues, probably due to the difficulties associated with defining a reasonable cellular morphology and, in parallel, allowing efficient protein extraction from tissue samples. The relatively large sample amount needed for successful proteome analysis is an additional issue that complicates protein profiling on a tissue- or even cell-specific level. In contrast to transcript profiling that can be performed from very small sample amounts due to efficient amplification strategies, there is as yet no amplification procedure for proteins available. In the current study, we compared different tissue preparation techniques prior to LM/laser pressure catapulting (LMPC) with respect to their suitability for protein retrieval. Cryosectioning was identified as the best compromise between tissue morphology and effective protein extraction. After collection of vascular bundles from Arabidopsis thaliana stem tissue by LMPC, proteins were extracted and subjected to protein analysis, either by classical two-dimensional gel electrophoresis (2-DE), or by high-efficiency liquid chromatography (LC) in conjunction with tandem mass spectrometry (MS/MS). Our results demonstrate that both methods can be used with LMPC collected plant material. But because of the significantly lower sample amount required for LC-MS/MS than for 2-DE, the combination of LMPC and LC-MS/MS has a higher potential to promote comprehensive proteome analysis of specific plant tissues.
2005. "Targeted Comparative Proteomics by Liquid Chromatography - Tandem Fourier ion cyclotron resonance Mass Spectrometry." Analytical Chemistry 77(2):400-406. Abstract In proteomics, effective methods are needed for identifying the relatively limited subset of proteins displaying significant changes in abundance between two samples. One way to accomplish this task is to target for identification by MS/MS only the "interesting" proteins based on the abundance ratio of isotopically labeled pairs of peptides. We have developed the software and hardware tools for online LC-FTICR MS/MS studies in which a set of initially unidentified peptides from a proteome analysis can be selected for identification based on their distinctive changes in abundance following a "perturbation". We report here the validation of this method using a mixture of standard proteins combined in different ratios after isotopic labeling. We also demonstrate the application of this method to the identification of Shewanella oneidensis peptides/proteins exhibiting differential abundance in sub-oxic vs. aerobic cell cultures.
2005. "Global Profiling of Shewanella oneidensis MR-1: Expression of Hypothetical Genes and Improved functional annotations." Proceedings of the National Academy of Sciences of the United States of America 102(6):2099-2104. Abstract The γ-proteobacterium Shewanella oneidensis strain MR-1 is a respiratory versatile organism that can reduce a wide range of organics, metals, and radionuclides. Similar to most other sequenced organisms, approximately 40% of the predicted ORFs in the MR-1 genome were annotated as uncharacterized ‘hypothetical’ genes. We implemented an integrative approach using experimental and computational analyses to provide more detailed insight into their function. Global expression studies were conducted using RNA and protein expression profiling of cells cultivated under aerobic, suboxic, and fumarate-reducing conditions, phosphate limitation and UV irradiation. Transcriptomic and proteomic analyses confidently identified 538 ‘hypothetical’ genes as expressed in S. oneidensis cells both as mRNAs and proteins (33% of all ‘hypothetical’ proteins). Publicly available analysis tools and databases and our own expression data were applied to improve the annotation of these genes. The annotation results were scored using a seven-category schema that ranked both confidence and precision of the functional assignment. We identified homologs for nearly all of these ‘hypothetical’ proteins (96%), thus allowing us to minimally classify them as ‘conserved proteins’. Computational and/or experimental evidence provided more precise functional assignments for 297 genes (categories 1-4; 55%). These improved functional annotations will significantly widen our understanding of vital cellular processes including signal transduction, ion transport, secondary metabolism, and transcription, as well as structural elements, such as cellular membranes.
We propose that this integrative approach offers a viable means to undertake the enormous challenge of characterizing the rapidly growing number of ‘hypothetical’ proteins with each newly sequenced genome.
2005. "Statistical Characterization of the Charge State and Residue Dependence of Low-Energy CID Peptide Dissociation Patterns." Analytical Chemistry 77(18):5800-5813. Abstract Data mining was performed on 28 330 unique peptide tandem mass spectra for which sequences were assigned with high confidence. By dividing the spectra into different sets based on structural features and charge states of the corresponding peptides, chemical interactions involved in promoting specific cleavage patterns in gas-phase peptides were characterized. Pairwise fragmentation maps describing cleavages at all Xxx-Zzz residue combinations for b and y ions reveal that the difference in basicity between Arg and Lys results in different dissociation patterns for singly charged Arg- and Lys-ending tryptic peptides. While one dominant protonation form (proton localized) exists for Arg-ending peptides, a heterogeneous population of different protonated forms or more facile interconversion of protonated forms (proton partially mobile) exists for Lys-ending peptides. Cleavage C-terminal to acidic residues dominates spectra from peptides that have a localized proton and cleavage N-terminal to Pro dominates those that have a mobile or partially mobile proton. When Pro is absent from peptides that have a mobile or partially mobile proton, cleavage at each peptide bond becomes much more prominent. Whether the above patterns can be found in b ions, y ions, or both depends on the location of the proton holder(s). Enhanced cleavages C-terminal to branched aliphatic residues (Ile, Val, Leu) are observed in both b and y ions from peptides that have a mobile proton, as well as in y ions from peptides that have a partially mobile proton; enhanced cleavages N-terminal to these residues are observed in b ions from peptides that have a partially mobile proton. Statistical tools have been designed to visualize the fragmentation maps and measure the similarity between them. 
The pairwise cleavage patterns observed expand our knowledge of peptide gas-phase fragmentation behaviors and should be useful in algorithm development that employs improved models to predict fragment ion intensities.
2005. "Enabling Proteomics Discovery Through Visual Analysis." IEEE Engineering in Medicine and Biology Magazine 24(3):50-57. Abstract With the completion of the Human Genome Project and the sequencing of large genomes, proteomics is the new big challenge. A proteome is the collection of all the proteins present in an organism at a given moment. Unlike the genome, the proteome is dynamic, changing continuously in response to tens of thousands of intra- and extra-cellular environmental signals. Proteomics is the study of proteomes under different conditions—for example, over time, under different environments, or in different disease states. Because proteins are the key actors in cellular processes and proteomics is the study of not one or two proteins at a time but whole proteomes, proteomics has a key role in revealing the complex processes of cells at a global or systems level. There are several high-throughput proteomics techniques; all generate data faster than the data can currently be analyzed. The tremendous size and complexity of the high-throughput experimental data make it very difficult to process and interpret. The success of proteomics will rely on high-throughput experimental techniques coupled with sophisticated visual analysis and data mining methods. This article presents the motivation for developing visual analysis tools for proteomic data and demonstrates their application to proteomics research with a visualization tool named Peptide Permutation and Protein Prediction, or PQuad. PQuad is a functioning visual analytic tool in operation at the Pacific Northwest National Laboratory for the study of systems biology. PQuad supports the exploration of proteins identified by proteomic techniques in the context of supplemental biological information.
2005. "Global detection and characterization of hypothetical proteins in Shewanella oneidensis MR-1 using LC-MS based proteomics." Proteomics 5(12):3120-3130. doi:10.1002/pmic.200401140 Abstract The availability of whole genome sequences has enabled the application of powerful tools for assaying global expression patterns in environmentally relevant bacteria such as Shewanella oneidensis MR-1. A large number of genes in prokaryote genomes, including MR-1, have been annotated as hypothetical, indicating that no similar protein has yet been identified in other organisms. Using high-sensitivity mass spectrometry coupled with accurate mass and time (AMT) tag methodology, 1078 tryptic peptides were collectively detected in MR-1 cultures, 671 of which were unique to their parent protein. Using only these unique tryptic peptides and a minimum of 2 peptides per protein, we identified, with high confidence, the expression of 258 hypothetical proteins. These proteins ranged from 3.5 kDa to 139 kDa, with 47 being 100 amino acid residues or less. Using a combination of information including detection in cells grown under specific culture conditions, presence within a specific cell fraction, and predictive algorithms such as PSORT and PSORT-B, possible/plausible functions are proposed for some hypothetical proteins. Further, by applying this approach a number of proteins were found not only to be expressed, but only expressed under certain culturing conditions, thereby suggesting function while at the same time isolating several proteins to distinct locales of the cell. These results demonstrate the utility of the AMT tag methodology for comprehensive profiling of the microbial proteome while confirming the expression of a large number of hypothetical genes.
2004. "Nanoscale Proteomics." Analytical and Bioanalytical Chemistry 378(4):1037-1045. Abstract This paper describes efforts to develop a liquid chromatography (LC)/mass spectrometry (MS) technology for ultra-sensitive proteomics studies, i.e. nanoscale proteomics. The approach combines high-efficiency nano-scale LC with advanced MS, including high sensitivity and high resolution Fourier transform ion cyclotron resonance (FTICR) MS, to perform both single-stage MS and tandem MS (MS/MS) proteomic analyses. The technology developed enables large-scale protein identification from nanogram size proteomic samples and characterization of more abundant proteins from sub-picogram size complex samples. Protein identification in such studies using MS is feasible from <75 zeptomole of a protein, and the average proteome measurement throughput is >200 proteins/h and ~3 h/sample. Higher throughput (>1000 proteins/h) and more sensitive detection limits can be obtained using an “accurate mass and time” tag approach developed at our laboratory. These capabilities lay the foundation for studies from single or limited numbers of cells.
2004. "Validation of Shewanella oneidensis MR-1 Small Proteins by AMT Tag-based Proteome Analysis." OMICS. A Journal of Integrative Biology 8(3):239-254. Abstract Using stringent criteria for protein identification by accurate mass and time (AMT) tag mass spectrometric methodology, we detected 36 proteins <101 amino acids in length, including 10 that were annotated as hypothetical proteins, in 172 global tryptic digests of Shewanella oneidensis MR-1 proteins analyzed. Peptides that map to the conserved, but functionally uncharacterized proteins SO4134 and SO2787, were the most frequently detected small proteins in these samples, while hypotheticals SO2669 and SO2063, conserved hypotheticals SO0335 and SO2176, and the SlyX protein (SO1063) were observed at frequencies similar to small expected abundant ribosomal proteins and translation initiation factor IF-1 and consequently, likely to encode important cellular functions. In addition, 30 proteins including three of the small proteins that map to genes predicted to encode frameshifts, point mutations, or recoding signals were detected. Of these 30 genes, peptides that map to positions beyond internal stop codons were detected in 13 genes (SO0101, SO0419, SO0590, SO0738, SO1113, SO1211, SO3079, SO3130, SO3240, SO4231, SO4328, SO4422, and SO4657). While expression of the full-length formate dehydrogenase encoded by SO0101 can be explained by incorporation of selenocysteine at the internal stop codon, the mechanism of translating downstream sequences in the remaining genes remains unknown.
2004. "Dissociation Behavior of Doubly-Charged Tryptic Peptides: Correlation of Gas-Phase Cleavage Abundance with Ramachandran Plots." Journal of the American Chemical Society 126(10):3034-3035. Abstract Large numbers of gas-phase dissociation spectra of protonated peptides are obtained daily and used in protein identification studies. Yet fundamental knowledge of the factors that influence their unimolecular dissociation branching ratios is relatively poor. It is still not possible to predict dissociation branching ratios from peptide sequence. Clearly, several chemical factors must influence dissociation patterns, including the ψ, φ angles determined by the residues involved in an amide bond, the propensities for certain side chains to interact with each other or with the backbone, the tendency for added protons to be intramolecularly solvated, and the stability of the fragment ions once formed.
2003. "Use of artificial neural networks for the accurate prediction of peptide liquid chromatography elution times in proteome analyses." Analytical Chemistry 75(5):1039-1048. Abstract The use of artificial neural networks (ANNs) is described for predicting the reversed-phase liquid chromatography retention times of peptides enzymatically digested from proteome-wide proteins. In order to enable the comparison of the numerous LC-MS data sets, a genetic algorithm was developed to normalize the peptide retention data into a range (from 0 to 1), improving the peptide elution time reproducibility to about 1%. The network developed in this study was based on amino acid residue composition and consists of 20 input nodes, 2 hidden nodes and 1 output node. A data set of about 7000 confidently identified peptides from the microorganism Deinococcus radiodurans was used for the training of the ANN. The ANN was then used to predict the elution times for another set of 5200 peptides tentatively identified by MS/MS from a different microorganism (Shewanella oneidensis). The model was found to predict the elution times of peptides with up to 54 amino acid residues (the longest peptide identified after tryptic hydrolysis of S. oneidensis) with an average accuracy of 3%. This predictive capability was then used to distinguish with high confidence isobaric peptides otherwise indistinguishable by accurate mass measurements as well as to uncover peptide misidentifications. Thus, integration of ANN peptide elution time prediction in proteomic research will increase both the number of protein identifications and their confidence.
2002. "Advanced Mass Spectrometric Approaches for Rapid and Quantitative Proteomics." Chapter 8 in Applied Electrospray Mass Spectrometry, ed. BN Pramanik, AK Ganguly & ML Gross, pp. 307-360. Marcel Dekker, New York, NY. Abstract With the completion of several dozen genome sequences and the first draft of the human genome in 2000, biological research is moving rapidly into the "post-genomic era". Contributing to the movement towards this era are recent advances in robotics, DNA sequencing technology, and computational analysis, all of which are resulting in an increasingly large amount of DNA sequence data, along with an array of experimental and bioinformatic tools increasingly being used for its analysis. In the post-genomic era, studies will be designed to characterize complex cellular "systems", consisting of networks of molecular networks. In the new paradigm, cellular processes are increasingly subject to global study and modeled from the "top-down", leading to new understandings of the cellular functions of the individual system constituents, how they respond to environmental perturbations, and the emergent properties arising from the complex nature of their interactions. A major goal is to understand both the molecular and cellular processes; this provides a basis for understanding the robustness of cellular systems, the possible modular nature of the cellular machinery, the nature of epigenetic and multigenic diseases, and individual variability in susceptibility to disease, and for developing predictive capabilities of the effects arising from external perturbations. Additionally, the information gained potentially leads to an understanding of the molecular "nodes" that can be targeted for drug development, gene therapy, genetic manipulations, etc.
2002. "The Use of Accurate Mass Tags for High-Throughput Microbial Proteomics." OMICS: A Journal of Integrative Biology 6(1):61-90. Abstract We describe and demonstrate a global strategy that extends the sensitivity, dynamic range, comprehensiveness, and throughput of proteomic measurements based upon the use of peptide accurate mass tags (AMTs) produced by global protein enzymatic digestion. The two-stage strategy exploits Fourier transform-ion cyclotron resonance (FT-ICR) mass spectrometry to validate peptide AMTs for a specific organism, tissue or cell type from potential mass tags identified using conventional tandem mass spectrometry (MS/MS) methods, providing greater confidence in identifications as well as the basis for subsequent measurements without the need for MS/MS, and thus with greater sensitivity and increased throughput. A single high resolution capillary liquid chromatography separation combined with high sensitivity, high resolution and accurate FT-ICR measurements has been shown capable of characterizing peptide mixtures of significantly more than 10^5 components with mass accuracies of ~1 ppm, sufficient for broad protein identification using AMTs. Other attractions of the approach include the broad and relatively unbiased proteome coverage, the capability for exploiting stable isotope labeling methods to realize high precision for relative protein abundance measurements, and the projected potential for study of mammalian proteomes when combined with additional sample fractionation. Using this strategy, in our first application we have been able to identify AMTs for ~60% of the potentially expressed proteins in the organism Deinococcus radiodurans.
2002. "An Accurate Mass Tag Strategy for Quantitative and High-Throughput Proteome Measurements." Proteomics 2:513-523. Abstract We describe and demonstrate a global strategy that extends the sensitivity, dynamic range, comprehensiveness and throughput of proteomic measurements based upon the use of polypeptide "accurate mass tags" (AMTs) produced by a global protein enzymatic digestion. The two-stage strategy exploits Fourier transform ion cyclotron resonance mass spectrometry (FTICR) to first validate polypeptide AMTs for a specific organism, tissue or cell type from "potential mass tags" identified using conventional tandem mass spectrometry (MS/MS) methods, providing the basis for subsequent measurements without the need for MS/MS. A single high resolution capillary liquid chromatography separation combined with high sensitivity, high resolution and accurate FTICR measurements is shown to be capable of characterizing polypeptide mixtures of significantly more than 10^5 components with mass accuracies of <1 ppm, sufficient for broad protein identification using AMTs. Attractions of the approach include the capability for automated high confidence protein identification, broad and unbiased proteome coverage, the capability for exploiting stable-isotope labeling methods to realize high precision for relative protein abundance measurements, and the potential for study of mammalian proteomes when combined with additional sample fractionation. The strategy is demonstrated by selected examples using Saccharomyces cerevisiae, Deinococcus radiodurans, and mouse melanoma cells.
2002. "Gene expression profiling using advanced mass spectrometric approaches." Journal of Mass Spectrometry 37(12):1185-1198. Abstract In the era of systems biology, computational and high-throughput experimental biological approaches are increasingly being combined to provide global snapshots of entire genomes and proteomes under tissue- and disease-specific conditions. The aim is to identify proteins changing in concentration and/or post-translational state and/or location, and develop a better molecular level understanding of the operation of biological systems. Here we describe an approach for comparative proteomics that builds upon the combination of high-performance nano-scale separations with the high mass measurement accuracy, mass-resolving power and sensitivity of Fourier transform ion cyclotron resonance mass spectrometry to provide broad dynamic range, comprehensive and quantitative proteome measurements.
2002. "Charge Effects for Differentiation of Oligodeoxynucleotide Isomers Containing 8-oxo-dG Residues." Journal of the American Society for Mass Spectrometry 13:195-199. Abstract Dissociation reactions of a series of multiply charged oligonucleotide anions were studied using an ion trap mass spectrometer. These mixed-nucleobase 12-mers fragment first by loss of a nucleobase (A, G, C and/or 5-methyl-cytosine) followed by cleavage at the 3' C-O bond of the sugar from which the base is lost to produce the complementary sequence ions, i.e. a-B and w type ions. No detectable loss of 8-oxo-guanine and/or thymine from these 12-mers is observed for the gentle collision conditions in the ion trap. The primary loss of a nucleobase and the subsequent backbone cleavage to generate sequence ions strongly depend on the charge state of the parent molecular ion. For low charge states (-2 and -3), product ions due to the loss of a neutral guanine base and related sequence ions are dominant in the tandem mass spectra. However, preferential loss of a neutral adenine becomes the primary reaction channel from the -5 charge state of the molecular ion. Such charge state dependent fragmentation behavior was utilized to determine the sites of the 8-oxo-dG residue in a series of structural isomers. The position of the 8-oxo-dG residue can be simply determined from the fragmentation pattern of the -3 charge state, but not of the -5 charge state. The strategy illustrated here for positional mapping of damaged residues in oligonucleotides is highly sensitive due to effective dynamic range enhancement in the product ion spectra by accessing the sequence informative reaction channels.
2002. "Global Analysis of the Deinococcus radiodurans Proteome by Using Accurate Mass Tags." Proceedings of the National Academy of Sciences of the United States of America 99(17):11049-11054. Abstract The ability to understand biological systems and their constituents would be greatly facilitated by the ability to make quantitative, sensitive, and comprehensive measurements of how their proteome changes, e.g., in response to environmental perturbations. To this end we have developed new instrumentation and a high throughput methodology to characterize an organism's dynamic proteome based upon the combination of global enzymatic digestion, high-resolution liquid chromatographic separations and analysis by Fourier transform ion cyclotron resonance mass spectrometry. Using accurate mass tags, 61% of the predicted proteome of the ionizing radiation resistant bacterium Deinococcus radiodurans was characterized with high confidence. This represents the broadest proteome coverage for any organism to date, and includes 715 proteins previously annotated as either hypothetical or conserved hypothetical.
2002. "Evaluation of Enzymatic Digestion and Liquid Chromatography-Mass Spectrometry Peptide Mapping of the Integral Membrane Protein Bacteriorhodopsin." Electrophoresis 23(18):3224-3232. Abstract N/A
2002. "Enrichment of Integral Membrane Proteins for Proteomic Analysis Using Liquid Chromatography-Tandem Mass Spectrometry." Journal of Proteome Research 1(4):351-360. Abstract Currently, most proteomic studies rely on liquid chromatography-tandem mass spectrometry (LC-MS/MS) to detect and identify constituent peptides of enzymatically digested proteins obtained from various organisms and cell types. However, sample preparation methods for isolating membrane proteins typically involve the use of detergents, chaotropes, or reducing reagents that often interfere with electrospray ionization (ESI). To increase the identification of integral membrane proteins by LC-ESI-MS/MS, a sample preparation method combining carbonate extraction and surfactant-free organic solvent-assisted solubilization and proteolysis was developed and used to target the membrane subproteome of Deinococcus radiodurans. Out of 503 proteins identified, 135 were recognized as hydrophobic based on their positive grand average of hydropathicity values, covering 15% of the theoretical hydrophobic proteome. Using the PSORT algorithm, 268 identified proteins were recognized as integral membrane proteins covering 21% and 43% of the predicted integral cytoplasmic and outer membrane proteins, respectively. Of the integral cytoplasmic membrane proteins containing four or more predicted transmembrane domains (TMDs), 65% were identified by detecting at least one peptide spanning a TMD using LC-MS/MS. The extensive identification of highly hydrophobic proteins containing multiple TMDs confirms the efficacy of the described sample preparation protocol to isolate and solubilize integral membrane proteins and validates the method for large-scale analysis of bacterial membrane subproteomes using LC-ESI-MS/MS.
2001. "Rapid quantitative measurements of proteomes by Fourier transform ion cyclotron resonance mass spectrometry." Electrophoresis 22:1652-1668. Abstract N/A
2001. "Packed Capillary Reversed-Phase Liquid Chromatography with High Performance Electrospray Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry for Proteomics." Analytical Chemistry 73(8):1766-1775. Abstract In this study, high efficiency packed capillary reversed-phase liquid chromatography (RPLC) coupled on-line with high-performance Fourier transform ion cyclotron resonance (FTICR) mass spectrometry has been investigated for the characterization of complex cellular proteolytic digests. Long capillary columns (80-cm) packed with small (3-um) C18 bonded particles provided a total peak capacity of ~1,000 for cellular proteolytic polypeptides when connected to an FTICR mass spectrometer through an electrospray interface under composition gradient conditions at a pressure of 10,000 psi. Two-dimensional analyses from the combination of packed capillary RPLC with high-resolution FTICR yielded a combined capacity for separations of >1 million polypeptide components, and simultaneously provided information for the identification of the separated components based upon the accurate mass tag concept previously described. For Deinococcus radiodurans tryptic digests, ~48.6000 polypeptides from ~60,7000 isotopic distributions have been detected in a single run. Large amounts (e.g., 500 ug) of cellular proteolytic digests could be loaded on packed capillaries (e.g., 150-um inner diameter) without significant loss of separation efficiency. Pre-columns with suitable inner diameters were found useful for improving the elution reproducibility without significant loss of separation quality. Porous particle packed capillaries are more favorable than those containing non-porous particles because of their high sample capacity, even for separations of cellular proteolytic polypeptides with moderate sample loading (e.g., 50 ug protein content).
2001. "High-Throughput Peptide Identification from Protein Digests Using Data-Dependent Multiplexed Tandem FTICR Mass Spectrometry Coupled with Capillary Liquid Chromatography." Analytical Chemistry 73(14):3312-3322. Abstract Tandem mass spectrometry (MS/MS) plays an important role in the unambiguous identification and structural elucidation of biomolecules. In contrast to conventional MS/MS approaches for protein identification where an individual polypeptide is sequentially selected and dissociated, a multiplexed MS/MS approach increases throughput by selecting several peptides for simultaneous dissociation using either infrared multiphoton dissociation (IRMPD) or multiple frequency sustained off-resonance irradiation (SORI) collisionally induced dissociation (CID). The high mass measurement accuracy and resolution of FTICR combined with knowledge of peptide dissociation pathways allows the fragments arising from several different parent ions to be assigned. Herein we report the application of multiplexed MS/MS coupled with on-line separations for the identification of peptides present in complex mixtures (i.e., whole cell lysates). Using "on-the-fly" data-dependent peak selection of a subset of polypeptides from each FTICR MS acquisition, several co-eluting peptides were fragmented simultaneously in the subsequent MS/MS acquisitions using either IRMPD or SORI-CID techniques. The utility of this approach has been demonstrated using a bovine serum albumin tryptic digest separated by capillary LC where multiple peptides were readily identified in single MS/MS acquisitions. We also present initial results from multiplexed MS/MS analysis of a D. radiodurans whole cell digest to illustrate the utility of this approach for high throughput analysis of a complex bacterial proteome.
2001. "Quantitative Analysis of Bacterial and Mammalian Proteomes using a Combination of Cysteine Affinity Tags and 15N-Metabolic Labeling." Analytical Chemistry 73(9):2132-2139. Abstract N/A
2001. "Experiences and Management of Pregnant Radiation Workers at the Pacific Northwest National Laboratory." Chemical Safety and Health 8(3):6-8. Abstract Radiation workers at the Pacific Northwest National Laboratory are divided into two classes based on whether or not they can encounter radioactive contamination in the normal course of their work. Level I workers primarily handle sealed radioactive materials such as those used to calibrate detectors. Level II workers perform benchtop chemistry. The U.S. Department of Energy has strict guidelines on the management of pregnant radiation workers. Staff members may voluntarily notify their line managers of a pregnancy and be subjected to stringent radiation exposure limits for the developing fetus. The staff member and manager develop a plan to limit and monitor radiation dose for the remainder of the pregnancy. Several examples of dose management plans and case examples of the impact of pregnancy on staff members' technical work and projects will be presented.
2000. "Characterization of Microorganisms and Biomarker Development from Global ESI-MS/MS Analyses of Cell Lysates." Analytical Chemistry 72(11):2475-2481. Abstract The capability for rapid and accurate identification of microorganisms has potential applications that would include the monitoring of industrial bioprocessing operations, food safety analyses, disease diagnosis, and the detection of potential biological hazards. Efforts based upon matrix assisted laser desorption ionization mass spectrometry (MALDI-MS) to detect and identify specific microorganisms have been actively pursued for several years. We report a new method being developed to select useful biomarkers for the identification of microorganisms based upon electrospray ionization (ESI)-ion trap mass spectrometry. Crude cell lysates are processed using a recently developed dual microdialysis device and then directly infused into an ion trap MS. The low ESI flow rate and precursor ion accumulation capability of the ion trap MS enable high sensitivity MS/MS analyses. Precursor ions were automatically selected and analyzed using tandem MS (MS/MS) to produce "global" MS/MS surveys for two-dimensional data displays. Such global MS/MS surveys are demonstrated for Escherichia coli lysates. The distinctive MS/MS spectral patterns can be used to identify mass spectrometric signals useful as biomarkers, which then provide a basis for microorganism identification. The results presented here show the application of this method for the identification of microorganisms, as well as for detection of bacteriophage MS2 in the presence of a large excess of Escherichia coli.
2000. "Mass Spectrometric Detection for Capillary Isoelectric Focusing Separations of Complex Protein Mixtures." Electrophoresis 21(7):1372-1380. Abstract Capillary isoelectric focusing (CIEF) can provide high-resolution separations of complex protein mixtures, but until recently it has primarily been used with conventional UV detection. This technique would be greatly enhanced by much more information-rich detection methods that can aid in protein characterization. We describe progress in the development of the combination of CIEF with Fourier transform ion cyclotron resonance (FTICR) mass spectrometry and its application to proteome characterization. Studies have revealed 400-1000 putative proteins in the mass range 2-100 kDa from total injections of ~300 ng protein in a single CIEF-FTICR analysis of cell lysates for both Escherichia coli (E. coli) and Deinococcus radiodurans (D. radiodurans). We also demonstrate the use of isotope labeling of the cell growth media to improve mass measurement accuracy and provide a means for proteome-wide measurements of protein expression. The ability to make such comprehensive and precise measurements of differences in protein expression in response to cellular perturbations should provide new insights into complex cellular processes.
1999. "High Throughput Proteome-Wide Precision Measurements of Protein Expression using Mass Spectrometry." Journal of the American Chemical Society 121(34):7949-7950.
1999. "Probing Proteomes Using Capillary Isoelectric Focusing-Electrospray Ionization Fourier Transform Ion Cyclotron Resonance Mass Spectrometry." Analytical Chemistry 71(11):2076-2084. Abstract We describe progress in the development and initial application of the powerful combination of capillary isoelectric focusing (CIEF) and Fourier transform ion cyclotron resonance (FTICR) mass spectrometry for measurements of the proteome of the model system Escherichia coli.