NIST Libraries of Peptide Tandem Mass Spectra

What they are

NIST peptide libraries are comprehensive, annotated mass spectral reference collections from various organisms and proteins, useful for the rapid matching and identification of acquired MS/MS spectra. Spectra were produced by tandem mass spectrometers using liquid chromatographic separations followed by electrospray ionization. The NIST small-molecule electron ionization library contains one spectrum per molecular structure. Peptide spectra, by contrast, can be acquired with several different modes of fragmentation (ion trap and ‘beam-type’ collision cells are currently the most commonly used fragmentation devices), each producing different, energy-dependent patterns. The result is multiple spectral libraries, distinguished by fragmentation mode, each of which may contain several spectra per peptide. Separate libraries have also been assembled for iTRAQ-4 derivatized peptides and for phosphorylated peptides. Separating libraries by animal species reduces search time, although investigators may elect to include several species in their searches.

Currently, there are more than 4.3 million spectra in the libraries, representing 1.26 million different entities (unique combinations of derivative, peptide sequence, and fragmentation mode). Several of the largest libraries resulted from data collected by laboratories collaborating in the National Cancer Institute's Clinical Proteomic Tumor Analysis Consortium (CPTAC) [see http://proteomics.cancer.gov/]. CPTAC laboratories conducted more than 6,000 2D-LC/MS/MS runs of human tumor and human-mouse xenograft samples, producing more than 91 million MS/MS spectra. To reduce the variability introduced by disparate data analysis platforms, NIST created a Common Data Analysis Platform (CDAP). In addition to the results reported and publicly accessible at https://cptac-data-portal.georgetown.edu/cptacPublic/, NIST assembled libraries of well-identified and annotated spectra for human and mouse peptides in underivatized, iTRAQ, and phospho-iTRAQ forms.

Why they are useful

A reference spectrum library provides a sensitive, reliable, fast, and comprehensive resource for peptide identification, taking advantage of previously encountered, identified, and annotated data.
A peptide mass spectrum library can be used for:
- Direct peptide identification
- Validating peptides identified by sequence search programs
- Organizing and identifying recurring, unidentified spectra
- Sensitive, high reliability detection of internal standards, biomarkers, and target proteins
- Subtracting a component from a mixture spectrum
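
To illustrate what direct identification by library matching involves, here is a minimal Python sketch that scores an acquired spectrum against reference spectra using a cosine (normalized dot product) similarity on binned peaks. It is an illustration only and is not the scoring used by NIST MS Search or MS PepSearch. The function names, the 1.0 m/z bin width, and the example peak lists are all assumptions made for this sketch.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins: {bin_index: intensity}."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_score(query, reference, bin_width=1.0):
    """Cosine similarity (0.0 to 1.0) between two lists of (m/z, intensity)."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    # Only bins present in both spectra contribute to the dot product.
    dot = sum(q[b] * r[b] for b in q if b in r)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_r = math.sqrt(sum(v * v for v in r.values()))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


def best_match(query, library, bin_width=1.0):
    """Return (name, score) of the best-scoring entry in a non-empty library.

    `library` is a hypothetical in-memory dict mapping an entry name to its
    list of (m/z, intensity) peaks.
    """
    scored = ((name, cosine_score(query, peaks, bin_width))
              for name, peaks in library.items())
    return max(scored, key=lambda item: item[1])


# Invented peak lists, for illustration only.
library = {
    "PEPTIDEK/2": [(175.1, 100.0), (304.2, 50.0)],
    "OTHERK/2": [(500.3, 80.0)],
}
acquired = [(175.12, 90.0), (304.18, 60.0)]
name, score = best_match(acquired, library)
print(name, round(score, 3))
```

A practical search would also restrict candidates by precursor m/z and charge before scoring; that step is omitted here to keep the sketch short.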

How to use libraries

The libraries are provided in two formats: NIST MS Search binary format and ASCII text format (MSP file). MSP files can be read by many software programs. The NIST MS Search binary libraries are directly usable with the NIST MS Search and MS PepSearch programs.
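
Because MSP files are plain text, they can be read with a short script. The following Python sketch assumes the common MSP layout: each record starts with a `Name:` line, continues with `Key: value` header lines, and ends with a `Num peaks:` line followed by that many peak lines, one peak per line, each holding m/z, intensity, and an optional quoted annotation. The function name `parse_msp` and the sample record are invented for illustration; real library files may carry more fields or differ in detail.

```python
def parse_msp(lines):
    """Yield one dict per record from an iterable of MSP text lines.

    Assumes one peak per line. A record cut off before all of its peaks
    have been read is dropped.
    """
    record = None
    peaks_left = 0
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if peaks_left > 0:
            # Peak line: m/z, intensity, optional annotation.
            parts = line.split(None, 2)
            annotation = parts[2].strip('"') if len(parts) > 2 else ""
            record["peaks"].append((float(parts[0]), float(parts[1]), annotation))
            peaks_left -= 1
            if peaks_left == 0:
                yield record
                record = None
            continue
        key, _, value = line.partition(":")
        key = key.strip().lower()
        value = value.strip()
        if key == "name":
            record = {"name": value, "fields": {}, "peaks": []}
        elif record is not None:
            if key == "num peaks":
                peaks_left = int(value)
                if peaks_left == 0:
                    yield record
                    record = None
            else:
                record["fields"][key] = value


# Invented sample records (values are illustrative, not from a NIST library).
SAMPLE = '''Name: AAAK/2
MW: 374.24
Comment: Parent=187.12 Mods=0
Num peaks: 2
147.1\t1000\t"y1/0.01"
218.2\t500\t"y2/-0.02"

Name: GGR/1
Num peaks: 1
175.1\t200\t"y1"
'''

for rec in parse_msp(SAMPLE.splitlines()):
    print(rec["name"], len(rec["peaks"]), "peaks")
```

To read a downloaded library, pass an open file object instead of the sample lines, for example `parse_msp(open(path))`, where `path` points to the MSP file.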

Available Windows Software for Peptide Mass Spectral Libraries

- MS PepSearch - a fast library search algorithm for batch identification of peptides, with a graphical user interface. Accepts user mgf input files and outputs text lists of matches.
- NIST MS Search - a full spectral library search and viewing utility with a graphical user interface. Accepts input files in multiple formats (including mgf); output is msp text.
- Lib2NIST - a utility for converting MSP-format files into a peptide library.
- MS_Piano - a software tool for annotating peaks in mass spectra of peptides and glycopeptides.
- NIST MSQC Pipeline - a fully integrated software pipeline for identification and quantitation of peptides (Note: support is discontinued).
- ReAdW4Mascot2 - Thermo RAW data converter (direct download).

Other compatible software (external links)

1. SpectraST: a fast library search algorithm integrated into the TPP (developed at ISB and HKUST).
2. Skyline: a software package for design and analysis of SRM experiments (developed at UW).

Peptide Mass Spectral Libraries

You can download different libraries here: NIST Libraries of Peptide Tandem Mass Spectra
