Sanitize PSM

Description

Analyzes PSM (peptide/spectral match) lists and removes hits to scans for which another hit with a better score exists. In addition, hits to a scan that have the same score but conflicting peptides are discarded.

Input files

  • at least 1 OMSSA results file (.csv)

Output files

  • Sanitized PSM list (sanitized.csv)
  • Discarded PSM list (discarded.csv)

Synopsis

This script addresses the problem of ambiguous MS/MS scan identifications, where multiple different peptides are assigned to one MS/MS scan. Individual peptide/spectral matches can be discarded for the following reasons.

1. A better-scoring peptide has been identified

In the following example, two different peptides have been identified in the same MS/MS scan:

Row  Scan                 Peptide          E-value
1    example.1000.1000.2  GDDLGGNAAMSVYTK  2.6e-9
2    example.1000.1000.2  GDDLGGNAVCSVYTK  7.2e-3

Here, peptide 2 gets discarded because peptide 1 has a better score and is therefore more likely to be the correct explanation for the MS/MS scan.
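The rule above can be sketched in a few lines of Ruby. This is only an illustration under stated assumptions, not the code from filter-psm-sanitize.rb: the `Psm` struct, its field names, and the `keep_best_per_scan` helper are hypothetical, and the sketch assumes that a lower E-value means a better score.

```ruby
# Hypothetical PSM record; the field names are assumptions for illustration.
Psm = Struct.new(:scan, :peptide, :evalue)

# Split hits into [kept, discarded]: for each scan, keep only the hits
# that share the lowest (best) E-value and discard all others.
def keep_best_per_scan(psms)
  kept = []
  discarded = []
  psms.group_by(&:scan).each_value do |hits|
    best = hits.map(&:evalue).min
    hits.each { |hit| (hit.evalue == best ? kept : discarded) << hit }
  end
  [kept, discarded]
end

psms = [
  Psm.new("example.1000.1000.2", "GDDLGGNAAMSVYTK", 2.6e-9),
  Psm.new("example.1000.1000.2", "GDDLGGNAVCSVYTK", 7.2e-3)
]
kept, discarded = keep_best_per_scan(psms)
puts kept.map(&:peptide).inspect       # ["GDDLGGNAAMSVYTK"]
puts discarded.map(&:peptide).inspect  # ["GDDLGGNAVCSVYTK"]
```

Hits that tie on the best E-value are all kept by this sketch; how the real script resolves such ties is not shown here.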

2. Low hit distinctiveness

In the following example, two different peptides have been identified in the same MS/MS scan. The score of peptide 2 is worse but almost as good as the score of peptide 1:

Row  Scan                 Peptide          E-value
1    example.1000.1000.2  GDDLGGNAAMSVYTK  1.30e-9
2    example.1000.1000.2  GDDLGGNAVCSVYTK  1.31e-9

The hit distinctiveness threshold specifies what is considered "almost as good", expressed in orders of magnitude. A value of 2 corresponds to a factor of 10^2 and thus requires the score ratio P2/P1 (the E-value of peptide 2 divided by the E-value of peptide 1) to be ≥ 100. In this example, the score ratio is only 1.0077, so peptide 1 would be discarded because its identification is not distinctive enough. In addition, peptide 2 would be discarded according to rule #1. Consequently, no identifications would remain for this scan.
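The arithmetic behind this check can be written out as a short Ruby sketch. The `distinctive?` helper and its parameter names are hypothetical and only mirror the description above; the actual script may implement the comparison differently.

```ruby
# Returns true when the best hit is distinctive enough: the runner-up's
# E-value must be at least 10**threshold times the best hit's E-value.
# (Hypothetical helper, written from the description only.)
def distinctive?(best_evalue, second_evalue, threshold = 2.0)
  second_evalue / best_evalue >= 10.0**threshold
end

# Example from rule #2: the ratio is far below 100, so the best hit fails.
ratio = 1.31e-9 / 1.30e-9
puts ratio.round(4)                   # 1.0077
puts distinctive?(1.30e-9, 1.31e-9)   # false

# Example from rule #1: the ratio is about 2.8e6, so the best hit passes.
puts distinctive?(2.6e-9, 7.2e-3)     # true
```

With the default threshold of 2.0, the best hit in the rule #1 example passes the check, while the best hit in the rule #2 example does not.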

Parameters

Hit distinctiveness threshold

Default: 2.0

Treat modified residues unmodified
If this flag is activated, peptides will be converted to upper case, thereby removing PTM information while filtering.

Default: true
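As a rough illustration of this flag, the sketch below compares two hits that differ only in a lowercase residue. It assumes that modified residues are written as lowercase letters in the peptide sequence, which is what the upper-case conversion implies; the `peptide_key` helper and the lowercase example peptide are hypothetical.

```ruby
# Build the string used to compare peptides while filtering.
# With the flag enabled, upper-casing removes the PTM marking.
# (Hypothetical helper; assumes lowercase letters mark modified residues.)
def peptide_key(peptide, treat_modified_as_unmodified = true)
  treat_modified_as_unmodified ? peptide.upcase : peptide
end

modified   = "GDDLGGNAAmSVYTK"  # made-up hit with a modified residue
unmodified = "GDDLGGNAAMSVYTK"

# Flag enabled: both hits count as the same peptide.
puts peptide_key(modified) == peptide_key(unmodified)                # true
# Flag disabled: the two hits count as different peptides.
puts peptide_key(modified, false) == peptide_key(unmodified, false)  # false
```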

Source code

filter-psm-sanitize.rb, filter-psm-sanitize.yaml (GitHub)