Now that you know what MassSorter is (and is not), how do you start using
MassSorter? This tutorial explains the basic features
and procedures, and should help you understand the essential ideas behind
MassSorter. Before starting to use MassSorter you need the amino
acid sequence of the protein you want to study. In this tutorial we
will use an example sequence.
Comparing experimental and theoretical data is the main idea in MassSorter.
This information is represented in a Project. To create a new Project, select
"New Project" from the "File" menu in the main window. A new Project wizard
then appears which will guide you through the process of creating a Project.
The wizard consists of four steps.
The first step is inserting the Project details. Provide a Project name
and description. Only the name is mandatory, but inserting a description is
highly recommended. Click on "Next >" to continue.
The second step is providing the theoretical data. Here you have two
choices: you can either create a new theoretical data file from scratch using
MassSorter's own tool ProteinDigester, or you can select one from the list
of already created data files. In this tutorial we will choose
the first option. Click on the "ProteinDigester" button. This
will open a protein digester window. Select "Import sequence" from the
"File" menu, select the text file named "BSA" and click on "Open". Now
you can select the parameters for the digestion. In the modifications
list select the modifications named C-am, gK, oM and pyrE. To see
more information about a modification, right click on its
abbreviation in the first column. Leave the other parameters on the default
settings and click on "Digest protein".
The ProteinDigester window then disappears, and the newly created data
file is inserted in the list of available data files and selected. To see the
contents of the file, right click on the given row and select "Preview Theoretical
Data File" from the popup menu. A window appears containing the data from the theoretical
digestion. Each line contains information about one peptide: the
m/z value, the start and end positions
in the protein sequence, the modifications applied, the number of missed
cleavages and the amino acid sequence of the peptide.
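The kind of peptide list produced by a theoretical digestion can be sketched as follows. This is a minimal illustration and not ProteinDigester's actual code: the function name `digest`, the `Peptide` record and the `max_missed` parameter are hypothetical, and the rule used is the simple "cleave after K and R" rule, ignoring modifications and m/z calculation.

```python
from typing import List, NamedTuple


class Peptide(NamedTuple):
    start: int             # 1-based position of the first residue
    end: int               # 1-based position of the last residue
    missed_cleavages: int  # number of cleavage sites left uncut inside the peptide
    sequence: str


def digest(sequence: str, cleave_after: str = "KR", max_missed: int = 1) -> List[Peptide]:
    """Cut after every residue in `cleave_after`, allowing up to `max_missed` uncut sites."""
    if not sequence:
        return []
    # 0-based, exclusive end indices at which a fragment may end.
    cuts = [i + 1 for i, aa in enumerate(sequence) if aa in cleave_after]
    if not cuts or cuts[-1] != len(sequence):
        cuts.append(len(sequence))  # the C-terminus always ends a fragment
    bounds = [0] + cuts

    peptides = []
    for i in range(len(bounds) - 1):
        for missed in range(max_missed + 1):
            j = i + 1 + missed
            if j >= len(bounds):
                break
            begin, stop = bounds[i], bounds[j]
            peptides.append(Peptide(begin + 1, stop, missed, sequence[begin:stop]))
    return peptides


if __name__ == "__main__":
    for peptide in digest("AKCRDE"):
        print(peptide)
```

Each record carries the same kind of information as a line in the preview window, except for the m/z value and the modifications.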
In MS experiments the
samples may contain proteins other than the one you are
studying, for example keratin or fragments of the enzyme
used for digestion. To avoid interference from these irrelevant peptides you can
add a filter that filters out m/z values that may come
from such contaminants. To do this, select "Filter(s)" on the
"Edit" menu. This will open a window
where you can select already created filters or create new ones.
Select the filter named "TrypsinNoise" and click on "Update".
The selected filter will then be added to the list of theoretical
data (for example row 16). Close the theoretical data window (and
be sure to select "Yes" when asked about saving). Then click on "Next >".
The third step is importing the experimental data. Again
you have two choices: either import new experimental data files or select
one or more from the list of already available data files. In this tutorial
we will start by importing new experimental data. Click on the "Import Experimental
Data" button. A new window then presents three choices: Delimited text file, XML file and
Cut and Paste. Delimited text file is selected by default. Keep this choice
and click on "Next >".
A new window then appears where you can select the file details. Click
on the folder icon next to the "File" text field and select the file
"bsa-10pmol _0001-Spec.pkt" from the list. A preview of the selected file
is then shown, both as the raw text file and as the imported version. Make
sure that the correct column numbers are selected for the mass and the
intensity (in this case 2 and 7), and click on "Next >" to continue.
The last import window then appears. Insert the correct protein name
(in this case "BSA"), make sure that the correct enzyme is selected
(in this case "Trypsin") and add comments if you wish. Click on "Edit List"
next to the "Considered modifications" frame.
A window with the available modifications appears. Select
C-am, gK, oM and pyrE, click "Add >" and "OK". The
modifications selected are those expected in
the given experiment. This list may vary from
experiment to experiment. Finally click on "Import". (For this tutorial
the remaining experiments are already imported, so when asked "Do you
want to import more data?", click on "No".)
The experimental data is then imported into MassSorter, inserted into
the list of available experimental data files and selected. To see the
contents of the file, right click on the given row and select "Preview Experimental
Data File" from the popup menu.
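The column selection made during the import can be illustrated with a small sketch. It assumes a tab-delimited peak list and 1-based column numbers, as in the import window. The function name `read_peaks` and the sample rows are hypothetical, and the real layout of the .pkt file may differ.

```python
import csv
from typing import Iterable, List, Tuple


def read_peaks(lines: Iterable[str], mass_col: int, intensity_col: int,
               delimiter: str = "\t") -> List[Tuple[float, float]]:
    """Return (m/z, intensity) pairs from delimited text, using 1-based column numbers."""
    peaks = []
    for row in csv.reader(lines, delimiter=delimiter):
        try:
            mz = float(row[mass_col - 1])
            intensity = float(row[intensity_col - 1])
        except (IndexError, ValueError):
            continue  # skip headers, blank lines and malformed rows
        peaks.append((mz, intensity))
    return peaks


if __name__ == "__main__":
    sample = [
        "id\tm/z\ta\tb\tc\td\tintensity",   # hypothetical header row
        "1\t927.49\t0\t0\t0\t0\t1500",
        "2\t1479.80\t0\t0\t0\t0\t820",
    ]
    print(read_peaks(sample, mass_col=2, intensity_col=7))
```

With a real file, the same function can be called on an open file object, for example `read_peaks(open(path, newline=""), 2, 7)`.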
During the import procedure described above, it is possible to edit the data
manually, e.g., to remove peaks that you have recognized as noise,
or to add a peak that the spectrum analysis program has not recognized. To
delete a row: mark the row (it turns blue), go to the "Edit" menu and select "Delete row".
Alternatively, press "ctrl+D". To add a row: mark the row above the position where you want
to insert the new row, go to the "Edit" menu and select "Insert row after", or
press "ctrl+R". An empty row is inserted, and you can enter the relevant m/z value
(remember to use "." as the decimal point). None of the other cells need to be filled in.
Close the experimental data window to get back to the Project wizard.
From the list of available experimental data files, select the three remaining
experiments on protein "BSA" and click on "Next >".
The final step has two purposes. Here you get an overview of
the data you have selected to be included in the Project, and you also
have to choose
the accuracy to be used for the comparison of theoretical and experimental
data. For these experiments select "ppm" as the accuracy type and an
accuracy value of 100. Then click "Finish" to complete the creation of the
Project.
The main view of a Project is a spreadsheet containing all the comparisons of the
experimental and theoretical peptides. The logic behind the spreadsheet is as
follows: Each experimental peptide's m/z value is first compared to
the theoretical m/z values. If a match is found within the given
accuracy limit, the program checks whether the given theoretical
peptide is modified. If it is, the modification(s) must also be
among those considered possible for the given MS experiment. If the modification(s)
are possible in the given MS experiment, or the theoretical peptide
is not modified, the two peptides are considered "equal" and put on
the same line in the table. Row 3 (unmodified) and row 2 (modified)
are examples of this. The first 7 columns contain
data from the theoretical peptide. Then follow three columns per
MS experiment. The second column per MS experiment contains the
accuracy between the theoretical and experimental m/z values.
As you can see, the two rows mentioned are colored green. An experimental
m/z value may also match a "filter-mass". These are colored grey;
see row 15 for an example.
If an experimental m/z value does not match any of the theoretical
m/z values, it is compared to the m/z values from the other
MS experiments, if any, and placed on the same row if they are within the
selected accuracy limit.
These are colored yellow; see row 8.
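The comparison logic described above can be sketched as follows. This is an illustration under stated assumptions, not MassSorter's implementation: the ppm deviation is taken as (experimental - theoretical) / theoretical * 10^6, and the function names, the labels and the order in which theoretical values and filter masses are checked are hypothetical.

```python
from typing import Iterable, Sequence, Tuple


def ppm_error(experimental: float, theoretical: float) -> float:
    """Signed deviation in parts per million (assumed definition)."""
    return (experimental - theoretical) / theoretical * 1e6


def classify(exp_mz: float,
             theoretical: Sequence[Tuple[float, Tuple[str, ...]]],
             filter_masses: Iterable[float],
             allowed_mods: Iterable[str],
             tolerance_ppm: float = 100.0) -> str:
    """Label one experimental m/z value.

    `theoretical` holds (m/z, modifications) pairs; an unmodified peptide has ().
    """
    allowed = set(allowed_mods)
    for theo_mz, mods in theoretical:
        within = abs(ppm_error(exp_mz, theo_mz)) <= tolerance_ppm
        # A modified peptide only counts if all its modifications are allowed.
        if within and set(mods) <= allowed:
            return "theoretical"   # shown green in the table
    for filter_mz in filter_masses:
        if abs(ppm_error(exp_mz, filter_mz)) <= tolerance_ppm:
            return "filter"        # shown grey in the table
    return "unmatched"             # candidate for comparison with other experiments


if __name__ == "__main__":
    theo = [(1000.0, ()), (1200.0, ("oM",))]
    print(classify(1000.05, theo, [842.51], ["oM"]))  # 50 ppm from 1000.0
```

Values labeled "unmatched" here correspond to those that are compared across experiments and shown in yellow.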
The spreadsheet can also color-code the experimental values according to the detected
intensities. Select "Intensity grading" on the "View" menu. The experimental
values are then divided into three groups and each group is given its own
color. The default colors are shades of green, with the darkest shade for the
most intense group. The colors used can be altered by selecting
"Edit color" on the "View" menu. Before continuing, turn the color-coding
back to normal by deselecting "Intensity grading" on the "View" menu.
When comparing the m/z values, it is possible to get more than one
match (within the accuracy limit) for a given experimental m/z value.
The best match (smallest difference) is automatically selected
as a "primary match" and the others are labeled "secondary
matches". If the match automatically selected as primary is for
some reason wrong, you can select one of the others. First make
the secondary matches visible by deselecting "Hide secondary matches"
on the "View" menu. The secondary matches are colored dark green.
Choose one of the secondary matches, i.e. one of the dark green cells,
and right click on the corresponding
third column of the secondary match. A window appears where you
can choose the match you want as the primary match or remove the
matches altogether. Note: removing all the matches is irreversible.
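The automatic choice of a primary match can be pictured with a short sketch. The function name `rank_matches` is hypothetical, and the sketch assumes that "smallest difference" means the smallest absolute m/z difference among the candidates already within the accuracy limit.

```python
from typing import List, Optional, Sequence, Tuple


def rank_matches(exp_mz: float,
                 candidates: Sequence[float]) -> Tuple[Optional[float], List[float]]:
    """Split candidate theoretical m/z values into a primary match and secondary matches."""
    ranked = sorted(candidates, key=lambda theo_mz: abs(exp_mz - theo_mz))
    if not ranked:
        return None, []
    return ranked[0], ranked[1:]


if __name__ == "__main__":
    primary, secondary = rank_matches(1000.00, [1000.08, 999.99, 1000.03])
    print("primary:", primary)
    print("secondary:", secondary)
```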
Now that you understand the logic of the spreadsheet, it is time to start the
real analysis. The goal is to minimize
the number of yellow cells. This is the same as maximizing the number of
m/z values from the experiments that we can identify with good confidence.
First we will look at the DST in a different way. On the "Tools" menu select
"Report". The information in the spreadsheet is condensed
into an HTML file where (for each experiment) the matches are divided into
different categories: matches with unmodified theoretical peptides,
matches with modified theoretical peptides, matches with filter(s) and
so on. Other statistics are also shown, such as % match (of all the m/z values
in the given experiment, how many match theoretical values within the given accuracy limit)
and sequence coverage. The sequence coverage is also shown on a model
of the sequence. The red parts are the covered parts. Underlined residues
are residues that may be modified. Right clicking on a covered residue
shows information about the peptides containing the selected residue.
(Modification details can be accessed in the same way.)
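The two statistics mentioned above can be computed along these lines. This is a sketch under assumptions, not the Report's own code: sequence coverage is taken as the percentage of residues that lie in at least one matched peptide, and the function names are hypothetical.

```python
from typing import Iterable, Tuple


def sequence_coverage(protein_length: int,
                      matched_ranges: Iterable[Tuple[int, int]]) -> float:
    """Percentage of residues covered by matched peptides (1-based, inclusive ranges)."""
    if protein_length <= 0:
        return 0.0
    covered = [False] * protein_length
    for start, end in matched_ranges:
        for pos in range(max(start - 1, 0), min(end, protein_length)):
            covered[pos] = True  # overlapping peptides count each residue once
    return 100.0 * sum(covered) / protein_length


def percent_match(n_matched: int, n_total: int) -> float:
    """Percentage of the m/z values in an experiment that match theoretical values."""
    return 100.0 * n_matched / n_total if n_total else 0.0


if __name__ == "__main__":
    print(sequence_coverage(10, [(1, 4), (3, 6)]))
    print(percent_match(3, 12))
```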
The Report only contains a 2D model of the amino acid sequence of
the protein. A 3D model is available by clicking on the "View as
3D model" link. A file chooser appears where you must select the
PDB file from which the model is created. The structural information from
the PDB file is then coupled with the coverage data from the Report
and a 3D model is created. The 3D model uses the same color-coding
scheme as the Report, but can also be extended to color modifications,
residues and amino acids.
There are many ways to increase the number of matched m/z values. One way is
to include many modifications in the theoretical
digestion and allow all of them in all the MS experiments.
This will probably make the digestion and comparison significantly slower
and also create many incorrect matches, simply by chance, so that much
work is needed to find the correct ones.
A better approach is therefore to include only the modifications
that are expected and test for others later. MassSorter includes
a database called UniMod (www.unimod.org) that contains data on,
at the time of writing, 192 different modifications. To
search this database for modifications, right click on one of the
yellow masses, select "Modification search", and click on
"Search". A list of possible modifications that may explain the
unmatched m/z value is shown. The list is created as follows:
All the theoretical m/z values between "Search mass + lower limit"
and "Search mass - upper limit" are compared to the search mass and
the difference is calculated. This difference is compared to the
list of mass changes from all the modifications in the UniMod
database. If the difference between the 'theoretical m/z value'
+ 'the mass change of a modification' and the experimental m/z value is
within the accuracy limit, we have a possible match. If you click on
"Insert into DST" on one of the modifications, it is inserted into the
DST, and the row is colored blue. A match inserted in this way can
be removed by right clicking on the given mass and selecting "Remove match".
When you have finished testing this feature, close the search window.
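The search described above can be sketched roughly as follows. Several things here are assumptions made for illustration: the peptides are treated as singly charged, so a modification shifts the m/z value by its full mass change; the search window is expressed with the hypothetical parameters `window_below` and `window_above`; and the small table of mass changes stands in for the UniMod database.

```python
from typing import Dict, Iterable, List, Tuple

# A few monoisotopic mass changes (Da), standing in for the UniMod database.
EXAMPLE_MODS: Dict[str, float] = {
    "Oxidation": 15.994915,
    "Acetyl": 42.010565,
    "Carbamidomethyl": 57.021464,
    "Phospho": 79.966331,
}


def modification_search(search_mz: float,
                        theoretical_mzs: Iterable[float],
                        mods: Dict[str, float],
                        window_below: float,
                        window_above: float,
                        tolerance_ppm: float) -> List[Tuple[float, str]]:
    """Find (theoretical m/z, modification) pairs that could explain `search_mz`."""
    hits = []
    for theo_mz in theoretical_mzs:
        # Only consider theoretical values inside the search window.
        if not (search_mz - window_below <= theo_mz <= search_mz + window_above):
            continue
        for name, mass_change in mods.items():
            predicted = theo_mz + mass_change
            if abs(search_mz - predicted) / predicted * 1e6 <= tolerance_ppm:
                hits.append((theo_mz, name))
    return hits


if __name__ == "__main__":
    print(modification_search(1015.995, [1000.0, 1100.0], EXAMPLE_MODS,
                              window_below=100.0, window_above=10.0,
                              tolerance_ppm=100.0))
```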
Another way of increasing the number of identified m/z values is to check for
"non-theoretical" cleavage sites. When MassSorter digests an amino
acid sequence it only cleaves at the theoretically correct sites of the enzyme selected. For
example, trypsin cleaves after R and K. In real digestion experiments the
enzyme sometimes cleaves at other sites as well, or a peptide may be sensitive to
chemical cleavage. These two cases, combined or alone,
may result in peptides with one or two termini that don't
match any theoretically digested peptides.
To search for these kinds of peptides:
right click on one of the yellow masses and select "Suggest
sequence(s)". A window similar to the ProteinDigester appears.
Leave the parameters as they are and click on "Suggest sequences".
A list of the possible peptides from the given protein sequence,
with non-theoretical cleavage sites,
appears. If you click on a row in the table, the selected
peptide will be marked in blue in the frame in the upper right.
The red parts of this sequence are the already covered parts.
After selecting a row, the match can be inserted into the DST
by selecting "Insert selected mass into DST" from the "File" menu.
These matches can be removed by right clicking on the given mass
and selecting "Remove match". When you are done testing this feature,
close both the SequenceSuggester and the results windows.
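The idea behind suggesting sequences with non-theoretical cleavage sites can be sketched as a search over all contiguous stretches of the protein. This is an illustration only, not the SequenceSuggester's algorithm: it assumes unmodified, singly protonated peptides with monoisotopic masses, and the function names are hypothetical.

```python
from typing import List, Tuple

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def mh_plus(peptide: str) -> float:
    """m/z of the singly protonated, unmodified peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER + PROTON


def suggest_sequences(protein: str, target_mz: float,
                      tolerance_ppm: float = 100.0) -> List[Tuple[int, int, str]]:
    """Return (start, end, sequence) for every stretch matching `target_mz` (1-based)."""
    upper = target_mz * (1 + tolerance_ppm * 1e-6)
    hits = []
    for start in range(len(protein)):
        mass = WATER + PROTON
        for end in range(start, len(protein)):
            mass += MONO[protein[end]]
            if mass > upper:
                break  # extending the stretch can only make it heavier
            if abs(mass - target_mz) / target_mz * 1e6 <= tolerance_ppm:
                hits.append((start + 1, end + 1, protein[start:end + 1]))
    return hits


if __name__ == "__main__":
    protein = "MKWVTFISLLLLFSSAYSR"
    print(suggest_sequences(protein, mh_plus("WVTFISL")))
```

Because every start and end position is tried, the hits include peptides whose termini do not coincide with enzyme cleavage sites.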
If you want to look for a modification in an
experiment and that modification was not included in the
theoretical digestion or in the list of possible modifications
for the experiment, you have to update both the theoretical
and experimental data files. The theoretical data file can
be changed by right clicking on the header of the column
in the spreadsheet labeled "Theoretical" and selecting "View
Theoretical Data" from the popup menu. If you want to completely
change the data, select "Re-digest" from the "Tools" menu. The
experimental data can be altered in the same way by right clicking on
the header of the column in the spreadsheet labeled with the experiment name.
Experimental data can be added or removed by selecting "Experimental Data"
from the "Edit" menu. This window can also be used to change the order of
the experimental data files in the spreadsheet.
In addition to the tools for finding the origin of the unmatched
peptide masses, MassSorter also includes some statistical tools.
MassSorter can calculate three types of statistics: peptide statistics
and two kinds of accuracy statistics. The peptide statistics are calculated
for all the peptides in the DST and are created by selecting "Peptide
statistics" from the "Statistics" submenu of the "Tools" menu. They cover
the following peptide properties: hydropathy, peptide length,
cleavage site frequencies and amino acid frequencies.
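Some of these peptide properties can be computed as in the sketch below. The tutorial does not say which hydropathy scale MassSorter uses, so the Kyte-Doolittle scale is an assumption here, and the function names are hypothetical. Cleavage site frequencies are left out.

```python
from collections import Counter
from typing import Dict, Sequence

# Kyte-Doolittle hydropathy values (assumed scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average hydropathy per residue of one peptide."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def peptide_statistics(peptides: Sequence[str]) -> Dict[str, object]:
    """Summarize length, hydropathy and amino acid frequencies for non-empty peptides."""
    counts = Counter("".join(peptides))
    total = sum(counts.values())
    return {
        "mean_length": sum(len(p) for p in peptides) / len(peptides),
        "mean_hydropathy": sum(mean_hydropathy(p) for p in peptides) / len(peptides),
        "aa_frequency": {aa: n / total for aa, n in counts.items()},
    }


if __name__ == "__main__":
    print(peptide_statistics(["AK", "LLLL"]))
```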
The first kind of accuracy statistics is calculated by selecting
"Accuracy statistics" from the "Statistics" submenu of the "Tools" menu.
It contains accuracy statistics for all the peptides in the DST. The last
kind of statistics available is individual accuracy statistics for each
experiment. These are accessed by right clicking on one of the accuracy
values for the experiment of interest and selecting "Accuracy statistics".
The accuracy statistics can also be visualized as a plot of the m/z values
against the accuracy values. This option is available by either right
clicking on an accuracy value in the DST table (single experiment plot) or by
selecting "Accuracy plot" on the "Tools -> Statistics" menu (multiple experiments
in one plot).
That is the end of the tutorial. If you want more help, click on the
help icons in the sub windows.