Utilities for plotting 3D mass spec data, XIC etc from mzML data files
Project description
MS Plotting utilities
These scripts provide some simple utilities for mining and plotting MS data. The data is expected to be available in mzML or the openMS featureXML format.
Script files can be run as standalone apps or as a library.
MSPlot.msplot3d
Command Line Usage: msplot3d.py [options]
- Options:
- -h, --help
show this help message and exit
- -m MZML, --mzml=MZML
Input mzML file
- -f FEATURE, --feature=FEATURE
comma separated list of featureXML files
- -o OUTFILE, --output=OUTFILE
output filename (default=’plot.pdf’)
- -s, --showX11
Show X11 interactive plot.
- -x XROT, --xrot=XROT
Xaxis rotation for plot (default (30)
- -y YROT, --yrot=YROT
Yaxis rotation for plot (default: -45)
- -l CLIP, --limit=CLIP
Clip threshold for plot
- -c COLOURS, --colours=COLOURS
list of colours with which to plot features. Colours will be recycled when needed (default=’r,g,m,c,y,k’)
- -r RTMIN, --rtmin=RTMIN
minimum retention time
- -R RTMAX, --rtmax=RTMAX
maximum retention time
- -t MZMIN, --mzmin=MZMIN
minimum m/z value
- -T MZMAX, --mzmax=MZMAX
maximum m/z value
- -w RTWINDOW, --rtwindow=RTWINDOW
Window around plot area in which to identify feature peaks
- -p, --plotms2
Plot MS2 events
- -d MS2WIN, --ms2win=MS2WIN
MS2 ion capture window size
Module usage: mzml is the only non-named argument. All arguments as per the command line version except that lists should be provided as arrays rather than comma-delimited text.
>>> import MSPlot.msplot3d >>> MSPlot.plot3d(mzml, featfiles=[], outfile='plot.pdf', show=False, xrot=30, yrot=-45, featcols=['r','g','b','m','c','k'], thresh=100, minrt=None,maxrt=None,minmz=None, maxmz=None, ms2win=2.0, rtwindow=20.0, plotms2=True)
MSPlot.ms1pep
Module Usage:
>>> import Unimod.unimod >>> import MSPlot.ms1pep >>> peptide='ACDEFGHIKLMNPQRSTVWYKKACDPRFGHI' # protein amino acid sequence
>>> frags=MSPlot.ms1pep.digestprotein(peptide, enzyme=1, overlap=True, unfavoured=False) # overlap - include 1 missed cleavage # unfavoured - include unfavourable sites (see the EMBOSS documentation for more details) # enzyme - numerical enzyme selection according to the EMBOSS documentation Enzymes and Reagents 1 : Trypsin 2 : Lys-C 3 : Arg-C 4 : Asp-N 5 : V8-bicarb 6 : V8-phosph 7 : Chymotrypsin 8 : CNBr >>> frags [{'start': '1', 'end': '15', 'sequence': 'ACDEFGHIKLMNPQR'}, {'start': '16', 'end': '23', 'sequence': 'STVWYKK'}, {'start': '24', 'end': '32', 'sequence': 'ACDPRFGHI'}, {'start': '1', 'end': '23', 'sequence': 'ACDEFGHIKLMNPQRSTVWYKK'}] # list of digested fragment peptides.
>>> camc=Unimod.unimod.database.get_label('Carbamidomethyl')['delta_mono_mass'] # get delta mass for fixed cysteine modification.
>>> MSPlot.ms1pep.listmz(frags[0]['sequence'], charges=[2,3,4], modifications=[], fixedmods={'C': camc }) [908.43562931579208, 605.95969455552802, 454.72172717539604] # returns a list of mz values
>>> mzlist=MSPlot.ms1pep.listmz(frags[1]['sequence'], charges=[2,3,4], modifications=['2 Phospho (T)',], fixedmods={'C': camc }) >>> mzlist [536.7538243454826, 358.17182457532175, 268.8808246902413]
# listmz does not permute, so for variable modifications, call it with each permutation.
>>> xics=MSPlot.ms1pep.getXIC("LCMSrun.mzML", mzlist, tolerance=0.01, minrt=0, maxrt=None) >>> xics.keys() ['rt', '358.171824575', '536.753824345', '268.88082469'] # extracts the XIC for each mz ratio. The keys are a truncated string version of the full precision calculated mass.
MSPlot.ms1pep.plotXIC(xics, colours=[‘r’,’g’,’b’,’m’,’c’,’k’], outfile=”plot.pdf”, title=”XIC plot”): # plot the XICs. Any number can be plotted, but they are separated by 7% max(XIC)
Dependencies
pyMS - OpenMS data parsing and analysis by Niek de Klein pymzml - mzML parsing library pyteomics - package for calculating mz values and more numpy - large scale data handling matplotlib - plotting library EMBOSS (for the digest application) Unimod - module for handling a Unimod database.