Spectral library searching uses an algorithm that can take advantage of measured peak heights (empirical data). The spectral libraries are also smaller than an in silico digest of a whole proteome sequence file. These factors may make spectral library searching more sensitive and much faster than traditional sequence searching for your particular applications. Mass spectral library searching has also been around longer than proteomics and is the "gold standard" for interpreting unknown spectra in other fields of mass spectrometry (e.g., GC-MS); only recently has enough data been compiled and analyzed to generate reference libraries for peptides.
No. You will need to download a library search software package to do that. Look here for some choices. The website offers download and browsing of the libraries by protein or peptide but does not allow searching of spectra. The SpectraST site at the Institute for Systems Biology does allow you to do that.
Yes, by using the web-based library browser utility for the organism and instrument type you are interested in (e.g., human [ion trap]).
You will additionally need a spectral library search engine; a few are listed here. We recommend either MSPepSearch (NIST) or SpectraST (ISB), since these were designed for use with the NIST libraries. These programs take a large set of MS/MS spectra (e.g., in mzXML or MGF format), such as those generated by LC-MS/MS analysis of a tryptic digest, and score them against the library spectra in "batch" mode, returning zero or more matches per unknown spectrum. These tools typically run from the command line, which makes them easy to integrate into data analysis pipelines; some have also been wrapped in web or other graphical user interfaces.
We collect data files from both internal and external sources. The libraries were built from a large collection of >30 million spectra collected by analyzing a wide range of sample types. We annotate each dataset and store the information in a database and within data records to
You can view selected annotation information by clicking on the Exps./Samples link on the main page for the library of interest. For example, here is the human ion trap library annotation page. You can also download text files with this information from our ftp site.
You may use a "target-decoy" approach similar to the one used for sequence searching (Elias and Gygi, Nat. Meth., Mar. 2007). Lam et al., J. Prot. Res., Jan. 2010, demonstrated this for spectral library searches by generating decoy spectra. Additionally, any set of non-overlapping spectra (e.g., from another organism) may be used as decoys, provided you adjust for any "target-decoy" bias. To generate the figure below, a set of human spectra was searched against a combined library of human spectra and non-overlapping, non-human spectra chosen at random from other libraries. The black bars represent the fraction of matches to the human spectra and the red bars the fraction of matches to decoy spectra. At ranks >3 for any set of matches, the matches are expected to be near random. Therefore, any deviation from 50:50 can be described as a "target-decoy" bias. In the example below, the bias factor would be roughly 62/38, or 1.6, in favor of the target spectra. This value can then be used to scale the number of decoy (false positive) matches when calculating an FDR.
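The bias correction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the match counts in the usage lines are made up, and the scaling rule (each decoy match is taken to stand for `bias` false target matches) is one plausible reading of "scale the number of decoy matches", not necessarily the exact formula used to produce the figure.

```python
def bias_factor(target_random: int, decoy_random: int) -> float:
    """Ratio of target to decoy hits among near-random matches (e.g., ranks > 3)."""
    if decoy_random <= 0:
        raise ValueError("need at least one decoy match to estimate the bias")
    return target_random / decoy_random


def estimated_fdr(n_target: int, n_decoy: int, bias: float = 1.0) -> float:
    """Estimate the FDR among accepted target matches.

    Assumption (not confirmed by the source): each accepted decoy match
    stands for `bias` false matches among the accepted target matches.
    """
    if n_target <= 0:
        return 0.0
    return min(1.0, bias * n_decoy / n_target)


# The 62:38 split quoted in the text gives a bias of about 1.6.
bias = bias_factor(62, 38)

# Illustrative counts only: 2000 target and 10 decoy matches above some cutoff.
fdr = estimated_fdr(n_target=2000, n_decoy=10, bias=bias)
print(f"bias = {bias:.2f}, estimated FDR = {fdr:.4f}")
```

With a bias of 1.0 the calculation reduces to the plain ratio of decoy to target matches.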
While the absolute Score threshold is subject to change based on the size and quality of your dataset, a Score of 450 is a reasonable starting point. This value will frequently approximate a false discovery rate of ~1% for a routine shotgun analysis of tryptic peptides on an ion trap mass spectrometer.
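Because the appropriate threshold depends on your dataset, you may prefer to derive it from a target-decoy search rather than rely on a fixed value. The following is a minimal sketch, assuming each match is available as a hypothetical `(score, is_decoy)` pair and that the FDR is estimated as the (optionally bias-scaled) ratio of decoy to target matches at or above a cutoff. It does not reflect the output format of any particular search engine.

```python
from typing import Iterable, Optional, Tuple


def lowest_cutoff_for_fdr(
    matches: Iterable[Tuple[float, bool]],
    max_fdr: float = 0.01,
    bias: float = 1.0,
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    `matches` holds (score, is_decoy) pairs. Returns None when no cutoff
    satisfies the requested FDR.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    best = None
    i = 0
    while i < len(ranked):
        score = ranked[i][0]
        # Count every match tied at this score before evaluating the cutoff.
        while i < len(ranked) and ranked[i][0] == score:
            if ranked[i][1]:
                decoys += 1
            else:
                targets += 1
            i += 1
        if targets > 0 and bias * decoys / targets <= max_fdr:
            best = score
    return best


# Toy data: four target matches and one decoy match.
toy = [(900, False), (800, False), (700, False), (600, True), (500, False)]
print(lowest_cutoff_for_fdr(toy, max_fdr=0.25))
```

On real data you would typically compare the cutoff found this way against the suggested starting point of 450.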
The coverage value, calculated by exhaustively mapping all of the library peptides to the fasta file used to build the library, can be viewed by clicking 'Library statistics' on the upper navigation bar of the online Browser.
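To make the idea of mapping peptides to a fasta file concrete, here is a rough sketch. It assumes that coverage means the fraction of protein residues covered by at least one library peptide; the exact definition behind the 'Library statistics' page may differ. The function names and the toy sequences are hypothetical.

```python
from typing import Dict, Iterable


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse fasta text into {header: sequence}."""
    proteins: Dict[str, str] = {}
    name = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                proteins[name] = "".join(chunks)
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        proteins[name] = "".join(chunks)
    return proteins


def residue_coverage(proteins: Dict[str, str], peptides: Iterable[str]) -> float:
    """Fraction of all residues covered by at least one peptide occurrence."""
    peptides = list(peptides)
    covered_total = 0
    residue_total = 0
    for seq in proteins.values():
        covered = [False] * len(seq)
        for pep in peptides:
            # Mark every occurrence, including overlapping ones.
            start = seq.find(pep)
            while start != -1:
                for k in range(start, start + len(pep)):
                    covered[k] = True
                start = seq.find(pep, start + 1)
        covered_total += sum(covered)
        residue_total += len(seq)
    return covered_total / residue_total if residue_total else 0.0


# Toy example: two made-up proteins and two made-up peptides.
fasta_text = ">P1 example\nMKTAYIAKQR\nLLK\n>P2\nAAAA\n"
proteins = parse_fasta(fasta_text)
print(residue_coverage(proteins, ["TAYIAK", "AA"]))
```

The plain substring search keeps the sketch short; for a whole proteome and a full library, an indexed lookup would be far faster.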
While there is no guarantee that a particular peptide will be in the library (a peptide that is absent cannot be found by searching), the libraries have been populated from many mass spec experiments (>500). And while the coverage seems "low" relative to all protein sequences, the current bottleneck is the limited ability of mass spectrometry to sample low-abundance peptides. It may also be that your data represent a tissue for which we have little or no data. In these cases, the tissue-specific peptides may not be in the library. We are working hard to collect data from a variety of new sources to remedy the "false negative" problem.
However, for routine 1D or 2D analyses of human plasma, yeast, or E. coli (well-populated libraries), you will likely find very few peptides that are not in the library, and you may be surprised to find more peptides than with traditional sequence searching because of the differing search methods. To be safe, we recommend combining library searching with sequence searching until you can evaluate the performance of library searching in your experimental workflow.
Peptide tandem mass spectra can be donated in RAW or peak list format for almost any instrument type and for many organisms. All you have to do is contact us, and we'll send you simple instructions and a few brief questions. Your answers to these questions will help us search your data. We will be glad to reference your contribution in the next edition of the library, or, if you prefer, you may keep your name and sample information anonymous.