Protein identification

Identification of proteins from total or fractionated extracts is a standard application of LC-MS/MS. Two approaches can be used, depending on the genomic data (protein or EST sequences) available for the species under study:

1- The “mass matching” approach (also called fragmentation fingerprint) identifies peptides based on the similarity between experimental and theoretical MS2 spectra: the masses measured for fragmented peptides are compared with theoretical masses obtained by in silico digestion and fragmentation of a protein sequence database. Proteins are then inferred from the list of identified peptides.

  • Advantage: Proteins can be identified even in very complex samples.
  • Limitation: A peptide can be identified only if its exact sequence is present in the protein sequence database.
  • Software used: X!Tandem for peptide identification and X!TandemPipeline for protein inference.
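
As an illustration of the mass matching principle, the following minimal Python sketch digests a protein in silico with a simplified trypsin rule, computes singly charged b and y fragment masses for a peptide, and counts how many experimental peaks fall within a tolerance of a theoretical one. It does not reproduce how X!Tandem scores spectra; the function names, the 0.02 Da tolerance and the example peak list are assumptions made for illustration.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def fragment_masses(peptide):
    """Return the sorted m/z values of singly charged b and y ions."""
    residues = [MONO[aa] for aa in peptide]
    ions = []
    for i in range(1, len(residues)):
        ions.append(sum(residues[:i]) + PROTON)          # b ion (N-terminal part)
        ions.append(sum(residues[i:]) + WATER + PROTON)  # y ion (C-terminal part)
    return sorted(ions)


def count_matches(experimental, theoretical, tol=0.02):
    """Count experimental peaks lying within tol (Da) of any theoretical ion."""
    return sum(any(abs(e - t) <= tol for t in theoretical) for e in experimental)


peptides = tryptic_digest("MKWVTFISLLR")          # hypothetical protein sequence
theoretical = fragment_masses(peptides[0])        # fragments of the first peptide
matched = count_matches([147.113, 500.0], theoretical)  # hypothetical peak list
```

A real search engine repeats this comparison for every candidate peptide of the database and turns the matches into a statistical score, which is why a peptide absent from the database cannot be identified.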

2- The “De Novo” approach interprets MS2 spectra directly and automatically as amino acid sequences. Although incomplete, these sequences carry enough information to identify proteins by sequence homology.

  • Advantage: The functions of the proteins present in a sample can be inferred even if no sequence from the species under study is present in the databases.
  • Limitations: Only high-quality MS2 spectra can be interpreted as amino acid sequences. Moreover, the analysis of complex samples is difficult and time-consuming, so this approach is restricted to the analysis of 2D gel spots. Isobaric amino acids, or amino acids of very close masses, are indistinguishable (e.g. I=L, F=Mox, Q=K).
  • Software used: Peaks Studio or PepNovo for automatic interpretation of spectra; Fasts or MSBLAST for sequence alignment (both can align a set of peptides to a protein sequence present in a database).
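
The ambiguity between residues of identical or very close masses can be checked with simple arithmetic. The sketch below compares monoisotopic residue masses for the three pairs cited above, taking Mox as methionine plus one oxygen atom; the two tolerance values (0.05 Da and 0.01 Da) are assumptions chosen to contrast a lower and a higher mass accuracy, not instrument specifications.

```python
OXIDATION = 15.994915  # mass of one oxygen atom (Da)

# Monoisotopic residue masses (Da) of the residues discussed above.
RESIDUE_MASS = {
    "I": 113.08406,
    "L": 113.08406,
    "F": 147.06841,
    "Mox": 131.04049 + OXIDATION,  # oxidized methionine
    "Q": 128.05858,
    "K": 128.09496,
}


def indistinguishable(a, b, tol):
    """True if the two residues differ by no more than tol (Da)."""
    return abs(RESIDUE_MASS[a] - RESIDUE_MASS[b]) <= tol


PAIRS = [("I", "L"), ("F", "Mox"), ("Q", "K")]

# Assumed tolerances: 0.05 Da (lower accuracy) and 0.01 Da (higher accuracy).
low_accuracy = [indistinguishable(a, b, 0.05) for a, b in PAIRS]
high_accuracy = [indistinguishable(a, b, 0.01) for a, b in PAIRS]
```

With these assumed tolerances, all three pairs are confused at 0.05 Da, whereas at 0.01 Da only I and L remain indistinguishable, since they have exactly the same mass.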

Protein quantification

PAPPSO generally performs label-free protein quantification. To obtain protein abundances from MS data acquired at the peptide level, three complementary approaches are used:

1- XIC-based quantification summarizes the peptide intensities measured by the mass spectrometer into protein abundances. Several methods can be used to do so (see Blein-Nicolas & Zivy (2016) for more information).

  • Advantage: High precision of quantification.
  • Limitation: Requires extensive data processing to filter outliers, handle missing data and compute protein abundances.
  • Software used: MassChroQ for peptide quantification and the R package ‘MCQR’ for data processing.
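
One of the simplest ways to summarize peptide intensities into a protein abundance is to sum them per protein and per sample. The sketch below shows only that idea: it is not the method implemented in MCQR, the record layout (protein, sample, intensity) and the values are hypothetical, and skipping missing intensities is a deliberate simplification of the missing-data handling mentioned above.

```python
from collections import defaultdict


def protein_abundance(peptide_intensities):
    """Sum peptide intensities per (protein, sample); None marks a missing value."""
    totals = defaultdict(float)
    for protein, sample, intensity in peptide_intensities:
        if intensity is None:
            continue  # simplification: missing intensities are ignored
        totals[(protein, sample)] += intensity
    return dict(totals)


# Hypothetical peptide-level measurements: (protein, sample, XIC intensity).
measurements = [
    ("P1", "s1", 100.0),
    ("P1", "s1", 50.0),
    ("P1", "s2", None),
    ("P2", "s1", 10.0),
]
abundances = protein_abundance(measurements)
```

In this toy data set, protein P1 has no abundance at all in sample s2, which illustrates why missing data must be handled before proteins are compared across samples.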

2- Spectral counting is a semi-quantitative approach where proteins are quantified based on the number of MS2 spectra attributed to them after the identification step.

  • Advantage: No specific software is required since the spectral counts are directly available in the X!TandemPipeline outputs. No missing values.
  • Limitations: The precision of quantification, and thus the discrimination power, is lower than with the XIC-based approach. Problems related to data sparsity (i.e. many values at zero) can occur in some cases.
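
Spectral counting amounts to a simple tally, as the following sketch suggests. The input format (one (protein, sample) pair per identified MS2 spectrum) and the names are hypothetical and do not reflect the actual X!TandemPipeline output format.

```python
from collections import Counter


def spectral_counts(identifications, proteins, samples):
    """Build a protein x sample table of the number of assigned MS2 spectra."""
    counts = Counter(identifications)
    # A Counter returns 0 for unseen keys, so the table has zeros, not gaps.
    return {p: {s: counts[(p, s)] for s in samples} for p in proteins}


# Hypothetical identifications: one (protein, sample) pair per MS2 spectrum.
identified = [("P1", "s1"), ("P1", "s1"), ("P1", "s2"), ("P2", "s1")]
table = spectral_counts(identified, ["P1", "P2"], ["s1", "s2"])
```

A protein with no assigned spectrum in a sample gets a count of zero rather than a missing value, which is the source of both the "no missing values" advantage and the sparsity issue.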

3- Peak counting is another semi-quantitative approach where proteins are quantified based on the number of chromatographic peaks (or peptide ions) attributed to them after peptide identification and quantification.

  • Advantage: Data are more complete than with the spectral counting approach, and their processing is simpler than with the XIC-based approach.
  • Limitations: The precision of quantification, and thus the discrimination power, is lower than with the XIC-based method. Quantification software is required.
  • Software used: MassChroQ for peptide quantification and the R package ‘MCQR’ for data processing.

Characterization of post-translational modifications

PAPPSO has extensive experience in the analysis of the phosphoproteome by dimethyl labeling, peptide fractionation and phosphopeptide enrichment. The platform also performs glycoproteome analyses, disulfide bridge analyses and characterization of the N-terminus of proteins.