Comparison of Peptide and Protein Fractionation Methods

Created: 2013-03-27 10:04:23

This workflow was used to analyze the data in a manuscript by Mostovenko et al. (2013, submitted) comparing peptide and protein fractionation methods. The workflow identifies proteins by X!Tandem database search and validates the results using PeptideProphet. For the IEF and SCX data, additional information such as pI and fraction number is extracted and plotted. For each protein identified in the SDS-PAGE-derived data, the sequence is downloaded from UniProt, and the molecular weight calculated from that sequence is plotted against the fraction number. The Rshell scripts produce the figures essentially as they appear in the manuscript.

Executing the workflow requires a running Rserve instance and the Trans-Proteomic Pipeline (http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP) installed with default settings in the default location. The Rshell scripts contain the location where the figures will be generated. All other output files are stored in the input data folders by default. The version of X!Tandem called from this workflow is separate from the one installed with the Trans-Proteomic Pipeline; its location is specified in the Tandem components.
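
Because the Rshell processors listed further down point at an R server on localhost:6311, it can be useful to confirm that something is listening on that port before starting a run. The following is a minimal sketch of such a check and is not part of the workflow; the class name RserveCheck and the one-second timeout are illustrative assumptions. It only tests that the TCP port accepts connections and does not verify that the listener is actually Rserve.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of the workflow.
final class RserveCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Connection refused, timeout, or unresolved host: treat as "not listening".
            return false;
        }
    }

    public static void main(String[] args) {
        // Host and port as given for the Rshell processors in this workflow.
        boolean up = isListening("localhost", 6311, 1000);
        System.out.println(up
                ? "A server is accepting connections on localhost:6311"
                : "Nothing is listening on localhost:6311; start Rserve first");
    }
}
```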

Run

Run this Workflow in the Taverna Workbench...

Option 1:

Copy and paste this link into File > 'Open workflow location...'
http://myexperiment.org/workflows/3486/download?version=1

Workflow Components

Authors (1)

Titles (1)

Descriptions (1)

Dependencies (0)

Inputs (9)

Name	Description
IEF_Dir
IEF_Results_Dir
SCX_Dir
SCX_Results_Dir
PAGE_Dir
PAGE_Results_Dir
Tandem_Param_File
FASTA_File
Prophet_Params

Processors (21)

Name	Type	Description
Read_Text_File_PAGE	localworker	Script BufferedReader getReader (String fileUrl, String encoding) throws IOException { InputStreamReader reader; try { if (encoding == null) { reader = new FileReader(fileUrl); } else { reader = new InputStreamReader(new FileInputStream(fileUrl),encoding); } } catch (FileNotFoundException e) { // try a real URL instead URL url = new URL(fileUrl); if (encoding == null) { reader = new InputStreamReader (url.openStream()); } else { reader = new InputStreamReader (url.openStream(), encoding); } } return new BufferedReader(reader); } StringBuffer sb = new StringBuffer(4000); if (encoding == void) { encoding = null; } BufferedReader in = getReader(fileurl, encoding); String str; String lineEnding = System.getProperty("line.separator"); while ((str = in.readLine()) != null) { sb.append(str); sb.append(lineEnding); } in.close(); filecontents = sb.toString();
Extract_Spectrum_Number_PAGE	xpath	Xpath Expression /default:msms_pipeline_analysis/default:msms_run_summary/default:spectrum_query/@spectrum
Extract_Protein_Names	xpath	Xpath Expression /default:msms_pipeline_analysis/default:msms_run_summary/default:spectrum_query/default:search_result/default:search_hit/@protein
Make_Well_Number_List_PAGE	beanshell	The JavaBean to convert the spectrum information to the fraction number Script int s = well.size(); String[] temp = new String[s]; String wellnum = new String(); int num1 = 0; int num; String num2 = new String(); for (i=0; i
Get_Fasta	beanshell	The JavaBean to extract protein sequences from the UniProt Database. Script int s = protein.size(); String [] out1 = new String[s]; String [] out2 = new String[s]; String [] id = new String[3]; String [] seqsh = new String [2]; int MW = 0; for (i=0; i
Join_and_Insert_Tabs_PAGE	beanshell	Script int s = in1.size(); String out = new String(); for (i=0; i
Get_MW	beanshell	This JavaBean component calculates protein molecular weights based on their sequences. Script int s = seq.size(); String[] seqsh = new String[2]; String seqsh2 = new String(); double[] outMW = new double[s]; for (i=0; i[^\n]+\n"); seqsh2 = seqsh[1].replace("\n", ""); seqsh2 = seqsh2.replace("[^A-Z]", ""); String rep = new String(); rep = seqsh2.replace("A", "71.04 "); rep = rep.replace("C", "103.01 "); rep = rep.replace("D", "115.03 "); rep = rep.replace("E", "129.04 "); rep = rep.replace("F", "147.07 "); rep = rep.replace("G", "57.02 "); rep = rep.replace("H", "137.06 "); rep = rep.replace("I", "113.08 "); rep = rep.replace("K", "128.09 "); rep = rep.replace("L", "113.08 "); rep = rep.replace("M", "131.04 "); rep = rep.replace("N", "114.04 "); rep = rep.replace("P", "97.05 "); rep = rep.replace("Q", "128.06 "); rep = rep.replace("R", "156.10 "); rep = rep.replace("S", "87.03 "); rep = rep.replace("T", "101.05 "); rep = rep.replace("V", "99.07 "); rep = rep.replace("W", "186.08 "); rep = rep.replace("Y", "163.06 "); rep = rep.replace("X", "0.0 "); rep = rep.replace("U", "168.96 "); seqnum = rep.split(" ", 0); for (int k=0; k 0 && !seqnum[k].equals("") ){ MW = MW + Double.parseDouble(seqnum[k]); } else MW = MW; } } else MW = 0; outMW[i] = MW/1000; } out = outMW;
MW_Plot	rshell	The Rshell script to plot the calculated protein molecular weight against its position on the gel (fraction number). Script MW <- read.table(MassList, sep="\t"); l <- nrow(MW); MW.m <- as.matrix(MW, nrow=l, ncol=2); y <- max(MW.m[,2]); ax <- c(1:24); png("MW_plot.png"); plot (MW.m[,1],MW.m[,2], xlim = c(1,24),ylim=NULL, col=1,xlab="Fraction Number", ylab="Calculated Mass (kDa)",pch=10, cex=1, cex.lab=2, mar=c(5,20,5,5)); axis (1, at=ax); dev.off() R Server localhost:6311
Database_Search	workflow	For each dataset this nested workflow converts raw data to generic .mzXML format and passes it on to X!Tandem for search. Resulting files are converted to .pepXML and the probabilities are assigned to peptide-spectrum matches with PeptideProphet.
Read_Text_File_IEF	localworker	Script BufferedReader getReader (String fileUrl, String encoding) throws IOException { InputStreamReader reader; try { if (encoding == null) { reader = new FileReader(fileUrl); } else { reader = new InputStreamReader(new FileInputStream(fileUrl),encoding); } } catch (FileNotFoundException e) { // try a real URL instead URL url = new URL(fileUrl); if (encoding == null) { reader = new InputStreamReader (url.openStream()); } else { reader = new InputStreamReader (url.openStream(), encoding); } } return new BufferedReader(reader); } StringBuffer sb = new StringBuffer(4000); if (encoding == void) { encoding = null; } BufferedReader in = getReader(fileurl, encoding); String str; String lineEnding = System.getProperty("line.separator"); while ((str = in.readLine()) != null) { sb.append(str); sb.append(lineEnding); } in.close(); filecontents = sb.toString();
Extract_Parameters_IEF	xpath	Xpath Expression /default:msms_pipeline_analysis/default:msms_run_summary/default:spectrum_query/default:search_result/default:search_hit/default:analysis_result/default:peptideprophet_result/default:search_score_summary/default:parameter[5]/@value
Join_and_Insert_Tabs_IEF	beanshell	Script int s = well.size(); String out = new String(); for (i=0; i
Extract_Spectrum_Number_IEF	xpath	Xpath Expression /default:msms_pipeline_analysis/default:msms_run_summary/default:spectrum_query/@spectrum
Make_Well_Number_List_IEF	beanshell	The JavaBean to convert the spectrum information to the fraction number Script int s = well.size(); String[] temp = new String[s]; String wellnum = new String(); int num1 = 0; int num; String num2 = new String(); for (i=0; i
pI_Plot_IEF	rshell	This Rshell script uses the pI and spectrum information extracted from pepXML files to generate the pI distribution plot for the data derived from IEF fractionation. Script pI <- read.table(pIList, sep="\t"); l <- nrow(pI); pI.m <- as.matrix(pI, nrow=l, ncol=2); y <- max(pI.m[,2]); ax <- c(1:24); png("pI_Plot_IEF.png"); plot (pI.m[,1],pI.m[,2], xlim = c(1,24),ylim=NULL, col=1,xlab="Fraction Number", ylab="pI",pch=10, cex=1, cex.lab=2, mar=c(5,20,5,5)); axis (1, at=ax); dev.off() R Server localhost:6311
pI_Plot_SCX	rshell	This Rshell script uses the pI and spectrum information extracted from pepXML files to generate the pI distribution plot for the data derived from SCX fractionation. Script pI <- read.table(pIList, sep="\t"); l <- nrow(pI); pI.m <- as.matrix(pI, nrow=l, ncol=2); y <- max(pI.m[,2]); ax <- c(1:24); png("pI_plot.png"); plot (pI.m[,1],pI.m[,2], xlim = c(1,24),ylim=NULL, col=1,xlab="Fraction Number", ylab="pI",pch=10, cex=1, cex.lab=2, mar=c(5,20,5,5)); axis (1, at=ax); dev.off() R Server localhost:6311
Read_Text_File_SCX	localworker	Script BufferedReader getReader (String fileUrl, String encoding) throws IOException { InputStreamReader reader; try { if (encoding == null) { reader = new FileReader(fileUrl); } else { reader = new InputStreamReader(new FileInputStream(fileUrl),encoding); } } catch (FileNotFoundException e) { // try a real URL instead URL url = new URL(fileUrl); if (encoding == null) { reader = new InputStreamReader (url.openStream()); } else { reader = new InputStreamReader (url.openStream(), encoding); } } return new BufferedReader(reader); } StringBuffer sb = new StringBuffer(4000); if (encoding == void) { encoding = null; } BufferedReader in = getReader(fileurl, encoding); String str; String lineEnding = System.getProperty("line.separator"); while ((str = in.readLine()) != null) { sb.append(str); sb.append(lineEnding); } in.close(); filecontents = sb.toString();
Extract_Parameters_SCX	xpath	Xpath Expression /default:msms_pipeline_analysis/default:msms_run_summary/default:spectrum_query/default:search_result/default:search_hit/default:analysis_result/default:peptideprophet_result/default:search_score_summary/default:parameter[5]/@value
Extract_Spectrum_Number_SCX	xpath	Xpath Expression /default:msms_pipeline_analysis/default:msms_run_summary/default:spectrum_query/@spectrum
Make_Well_Number_List_SCX	beanshell	The JavaBean to convert the spectrum information to the fraction number Script int s = well.size(); String[] temp = new String[s]; String wellnum = new String(); int num1 = 0; int num; String num2 = new String(); for (i=0; i
Join_and_Insert_Tabs_SCX	beanshell	Script int s = well.size(); String out = new String(); for (i=0; i
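
The Get_MW listing above is partly garbled, so the calculation it describes may be easier to follow in a standalone form. The sketch below is an approximation under stated assumptions, not the workflow's actual script: it reuses the residue masses that appear in the Get_MW listing, sums them over the sequence, and divides by 1000 to report kDa. Like the visible part of the listing, it adds no water mass for the termini. The class name ResidueMassSum, the handling of the FASTA header line, and the example record are illustrative assumptions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical standalone version of the residue-mass sum described for Get_MW.
final class ResidueMassSum {

    private static final Map<Character, Double> MASS = new HashMap<>();

    static {
        // Residue masses (Da) copied from the Get_MW listing; X counts as 0.
        String letters = "ACDEFGHIKLMNPQRSTVWYXU";
        double[] masses = {
            71.04, 103.01, 115.03, 129.04, 147.07, 57.02, 137.06, 113.08,
            128.09, 113.08, 131.04, 114.04, 97.05, 128.06, 156.10, 87.03,
            101.05, 99.07, 186.08, 163.06, 0.0, 168.96
        };
        for (int i = 0; i < letters.length(); i++) {
            MASS.put(letters.charAt(i), masses[i]);
        }
    }

    /**
     * Sums residue masses over a FASTA record and returns the total in kDa.
     * Lines starting with '>' are treated as headers and skipped (assumption);
     * characters without a listed mass are ignored.
     */
    static double massKDa(String fastaRecord) {
        double sum = 0.0;
        for (String line : fastaRecord.split("\n")) {
            if (line.startsWith(">")) {
                continue;
            }
            for (char c : line.trim().toCharArray()) {
                Double m = MASS.get(c);
                if (m != null) {
                    sum += m;
                }
            }
        }
        return sum / 1000.0;
    }

    public static void main(String[] args) {
        // Made-up record for illustration only.
        String record = ">sp|TEST|EXAMPLE\nGA\nG\n";
        System.out.println(massKDa(record) + " kDa");
    }
}
```

Looking each residue up in a map avoids the chained string replacements of the original listing while using the same numbers.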

Beanshells (17)

Name	Description	Inputs	Outputs
Make_Well_Number_List_PAGE	The JavaBean to convert the spectrum information to the fraction number	well	out
Get_Fasta	The JavaBean to extract protein sequences from the UniProt Database.	protein	seq, ID
Join_and_Insert_Tabs_PAGE		in1, in2	out1
Get_MW	This JavaBean component calculates protein molecular weights based on their sequences.	seq	out
Join_and_Insert_Tabs_IEF		well, pI	out1
Make_Well_Number_List_IEF	The JavaBean to convert the spectrum information to the fraction number	well	out
Make_Well_Number_List_SCX	The JavaBean to convert the spectrum information to the fraction number	well	out
Join_and_Insert_Tabs_SCX		well, pI	out1
Tandem_SCX		mzxml_file, fasta_file, parameter_file	tandem_file
Tandem_PAGE		mzxml_file, fasta_file, parameter_file	tandem_file
Tandem_IEF		mzxml_file, fasta_file, parameter_file	tandem_file
SCX_CompassXport		raw_data_dir, raw_flag, result_dir	output_file
PAGE_CompassXport		raw_data_dir, raw_flag, result_dir	output_file
IEF_CompassXport		raw_data_dir, raw_flag, result_dir	output_file
PeptideProphet_PAGE		input_files, arguments	output_file
PeptideProphet_IEF		input_files, arguments	output_file
PeptideProphet_SCX		input_files, arguments	output_file
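
The Join_and_Insert_Tabs scripts are truncated in the listing, so their exact behavior cannot be recovered from this page. Judging by the port names (two input lists, one output) and by the Rshell scripts, which read their input with read.table(..., sep="\t"), a plausible reading is that the two lists are paired element by element into tab-separated lines. The sketch below illustrates that reading only; the class name TabJoin, the length check, and the sample values are assumptions.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of pairing two lists into tab-separated lines.
final class TabJoin {

    /** Pairs wells.get(i) with values.get(i), one "well<TAB>value" line per pair. */
    static String join(List<String> wells, List<String> values) {
        if (wells.size() != values.size()) {
            throw new IllegalArgumentException("lists must have the same length");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < wells.size(); i++) {
            out.append(wells.get(i)).append('\t').append(values.get(i)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up fraction numbers and pI values for illustration only.
        List<String> wells = Arrays.asList("1", "2", "3");
        List<String> pI = Arrays.asList("4.5", "5.2", "6.1");
        System.out.print(join(wells, pI));
    }
}
```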

Outputs (3)

Name	Description
MW_Plot
pI_Plot_IEF
pI_Plot_SCX

Datalinks (35)

Source	Sink
Database_Search:PAGE_pepXML_Output	Read_Text_File_PAGE:fileurl
Read_Text_File_PAGE:filecontents	Extract_Spectrum_Number_PAGE:xml_text
Read_Text_File_PAGE:filecontents	Extract_Protein_Names:xml_text
Extract_Spectrum_Number_PAGE:nodelist	Make_Well_Number_List_PAGE:well
Extract_Protein_Names:nodelist	Get_Fasta:protein
Make_Well_Number_List_PAGE:out	Join_and_Insert_Tabs_PAGE:in1
Get_MW:out	Join_and_Insert_Tabs_PAGE:in2
Get_Fasta:seq	Get_MW:seq
Join_and_Insert_Tabs_PAGE:out1	MW_Plot:MassList
Prophet_Params	Database_Search:Prophet_Params
FASTA_File	Database_Search:FASTA_File
Tandem_Param_File	Database_Search:Tandem_Param_File
PAGE_Results_Dir	Database_Search:PAGE_Results_Dir
PAGE_Dir	Database_Search:PAGE_Dir
SCX_Results_Dir	Database_Search:SCX_Results_Dir
SCX_Dir	Database_Search:SCX_Dir
IEF_Dir	Database_Search:IEF_Dir
IEF_Results_Dir	Database_Search:IEF_Results_Dir
Database_Search:IEF_pepXML_Output	Read_Text_File_IEF:fileurl
Read_Text_File_IEF:filecontents	Extract_Parameters_IEF:xml_text
Make_Well_Number_List_IEF:out	Join_and_Insert_Tabs_IEF:well
Extract_Parameters_IEF:nodelist	Join_and_Insert_Tabs_IEF:pI
Read_Text_File_IEF:filecontents	Extract_Spectrum_Number_IEF:xml_text
Extract_Spectrum_Number_IEF:nodelist	Make_Well_Number_List_IEF:well
Join_and_Insert_Tabs_IEF:out1	pI_Plot_IEF:pIList
Join_and_Insert_Tabs_SCX:out1	pI_Plot_SCX:pIList
Database_Search:SCX_pepXML_Output	Read_Text_File_SCX:fileurl
Read_Text_File_SCX:filecontents	Extract_Parameters_SCX:xml_text
Read_Text_File_SCX:filecontents	Extract_Spectrum_Number_SCX:xml_text
Extract_Spectrum_Number_SCX:nodelist	Make_Well_Number_List_SCX:well
Make_Well_Number_List_SCX:out	Join_and_Insert_Tabs_SCX:well
Extract_Parameters_SCX:nodelist	Join_and_Insert_Tabs_SCX:pI
MW_Plot:MW_plot	MW_Plot
pI_Plot_IEF:pI_Plot_IEF	pI_Plot_IEF
pI_Plot_SCX:pI_plot	pI_Plot_SCX
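
The datalinks above route the text of each pepXML file into XPath processors whose expressions use a default: prefix. The sketch below shows how such an expression can be evaluated outside Taverna with the standard javax.xml.xpath API, using the Extract_Spectrum_Number expression from the processor table. It is an illustration under assumptions, not the workflow's own code: the namespace URI bound to the prefix is assumed to be the usual pepXML one, and the class name PepXmlXPath and the sample document are made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical standalone evaluation of the spectrum-name XPath expression.
final class PepXmlXPath {

    // Assumption: the namespace URI that the "default" prefix stands for.
    static final String PEPXML_NS = "http://regis-web.systemsbiology.net/pepXML";

    static final String SPECTRUM_XPATH =
        "/default:msms_pipeline_analysis/default:msms_run_summary"
        + "/default:spectrum_query/@spectrum";

    /** Returns the spectrum attribute of every spectrum_query, in document order. */
    static List<String> spectrumNames(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for prefixed XPath steps to match
            Document doc = dbf.newDocumentBuilder()
                              .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "default".equals(prefix) ? PEPXML_NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(SPECTRUM_XPATH, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeValue());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        // Made-up, minimal document; real pepXML files carry many more elements.
        String xml = "<msms_pipeline_analysis xmlns=\"" + PEPXML_NS + "\">"
                   + "<msms_run_summary>"
                   + "<spectrum_query spectrum=\"example.0001\"/>"
                   + "<spectrum_query spectrum=\"example.0002\"/>"
                   + "</msms_run_summary></msms_pipeline_analysis>";
        System.out.println(spectrumNames(xml));
    }
}
```

The same pattern applies to the other expressions in the processor table; only the XPath string changes.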