Kegg:Reactions Scheme
The purpose of this workflow is to determine all the enzymes/genes that participate within a radius of two reaction steps around a given metabolite. Broadly, the scheme involves the following steps:
1. Determine all the reactions that the given metabolite participates in.
2. Determine all the compounds that participate in these reactions.
3. Filter out ubiquitous compounds such as H2O and ATP to avoid non-specific connections.
4. Determine all the reactions that the compounds passing step 3 participate in.
5. Determine the enzymes that drive the reactions from step 4.
6. Determine the genes corresponding to the enzymes from step 5.
7. Store the Entrez Gene IDs as a text file.
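The scheme above can be sketched in plain Java as a two-step expansion over lookup functions. This is only an illustrative sketch, not the workflow's own code: the class `ReactionRadius`, the helper `lookup`, and every ID in the toy data are hypothetical, and the four lookups stand in for the KEGG queries that the workflow performs through its REST processors.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

final class ReactionRadius {

    /** Wraps a lookup table as a function that returns an empty set for unknown keys. */
    static Function<String, Set<String>> lookup(Map<String, Set<String>> table) {
        return key -> table.getOrDefault(key, Set.of());
    }

    /** Collects the genes reachable within two reaction steps of the given compound. */
    static Set<String> genesAround(
            String compound,
            Function<String, Set<String>> reactionsOf,  // compound -> reactions
            Function<String, Set<String>> compoundsOf,  // reaction -> compounds
            Function<String, Set<String>> enzymesOf,    // reaction -> EC numbers
            Function<String, Set<String>> genesOf,      // EC number -> gene IDs
            Set<String> excluded) {                     // ubiquitous compounds to skip
        // Steps 1 to 3: neighbouring compounds, minus the excluded ones.
        Set<String> neighbours = new LinkedHashSet<>();
        for (String reaction : reactionsOf.apply(compound)) {
            for (String c : compoundsOf.apply(reaction)) {
                if (!excluded.contains(c)) {
                    neighbours.add(c);
                }
            }
        }
        // Step 4: all reactions of the surviving compounds.
        Set<String> reactions = new LinkedHashSet<>();
        for (String c : neighbours) {
            reactions.addAll(reactionsOf.apply(c));
        }
        // Steps 5 and 6: enzymes of those reactions, then their genes (duplicates collapse).
        Set<String> genes = new LinkedHashSet<>();
        for (String reaction : reactions) {
            for (String ec : enzymesOf.apply(reaction)) {
                genes.addAll(genesOf.apply(ec));
            }
        }
        return genes;
    }

    public static void main(String[] args) {
        // Toy data only; none of these IDs are real KEGG identifiers.
        Function<String, Set<String>> reactionsOf = lookup(Map.of(
                "M", Set.of("R1"), "A", Set.of("R1", "R2"), "W", Set.of("R1", "R9")));
        Function<String, Set<String>> compoundsOf = lookup(Map.of(
                "R1", Set.of("M", "A", "W"), "R2", Set.of("A")));
        Function<String, Set<String>> enzymesOf = lookup(Map.of(
                "R1", Set.of("ec1"), "R2", Set.of("ec2"), "R9", Set.of("ec9")));
        Function<String, Set<String>> genesOf = lookup(Map.of(
                "ec1", Set.of("g1"), "ec2", Set.of("g2", "g3"), "ec9", Set.of("g9")));
        System.out.println(genesAround("M", reactionsOf, compoundsOf, enzymesOf, genesOf, Set.of("W")));
    }
}
```

In the toy data, compound `W` plays the role of a filtered compound such as H2O: because it is excluded, reaction `R9` and its gene `g9` never enter the result.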
Run this Workflow in the Taverna Workbench...
Option 1:
Copy and paste this link into File > 'Open workflow location...'
http://www.myexperiment.org/workflows/3107/download?version=2
Taverna is available from http://taverna.sourceforge.net/
If you are having problems downloading it in Taverna, you may need to provide your username and password in the URL so that Taverna can access the Workflow:
Replace http:// in the link above with http://yourusername:yourpassword@
Workflow Components
Author: Harish Dharuri
Name | Description |
---|---|
Compound_ID | The KEGG compound ID of the metabolite of interest, for example cpd:C00062. |
path_to_output_file | The path on the local file system of the file in which the workflow's result is stored. |
Name | Type | Description |
---|---|---|
REST_Service | rest | |
extract_reactions_from_compound_file | beanshell | Script: import java.util.ArrayList; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; String Reaction_Matcher = null; List out1 = new ArrayList(); /* Grab the REACTION block: everything after the REACTION keyword up to the next word made up only of letters (normally the next section keyword) */ String p = "REACTION\\s+(.+?)\\b[A-Za-z]+?\\b"; Pattern pattern = Pattern.compile(p, Pattern.DOTALL); Matcher matcher = pattern.matcher(in1); while (matcher.find()) { Reaction_Matcher = matcher.group(1); } if (Reaction_Matcher != null) { /* Collect every reaction ID (R followed by five digits) in the block */ String p1 = "(R[0-9]{5})"; Pattern pattern1 = Pattern.compile(p1, Pattern.MULTILINE); Matcher matcher1 = pattern1.matcher(Reaction_Matcher); while (matcher1.find()) { out1.add(matcher1.group(1)); } } else { /* No REACTION block found: emit the placeholder 0 */ out1.add(0); } |
Flatten_List | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
get_compounds_by_reaction | rest | |
extract_compounds_from_reaction_file | beanshell | Script: import java.util.ArrayList; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; String EQUATION_Line = null; List out1 = new ArrayList(); /* First grab the line that contains the word "EQUATION" */ String p = "EQUATION\\s+(.+)"; Pattern pattern = Pattern.compile(p, Pattern.MULTILINE); Matcher matcher = pattern.matcher(in1); while (matcher.find()) { EQUATION_Line = matcher.group(1); } /* Now get all the compound IDs from the "EQUATION" line; an entry without one yields an empty list instead of a NullPointerException */ if (EQUATION_Line != null) { String p1 = "(C[0-9]{5})"; Pattern pattern1 = Pattern.compile(p1, Pattern.MULTILINE); Matcher matcher1 = pattern1.matcher(EQUATION_Line); while (matcher1.find()) { out1.add(matcher1.group(1)); } } |
query_for_reactions | beanshell | Script: out1 = "rn:" + in1; |
Flatten_List_2 | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
Remove_String_Duplicates | localworker | Script: List strippedlist = new ArrayList(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); if (strippedlist.contains(item) == false) { strippedlist.add(item); } } |
FILTER_COMPOUNDS | beanshell | Ubiquitous compounds such as H2O, ATP and ADP are removed from consideration for the next step to prevent non-specific or overly general connections. The beanshell lists the excluded compounds by their KEGG IDs; their names are provided in the supplementary section of the publication. Script: import java.util.ArrayList; import java.util.Arrays; import java.util.HashSet; import java.util.List; import java.util.Set; /* KEGG IDs of the compounds that are excluded from the next step */ Set excluded = new HashSet(Arrays.asList(new String[] { "C00001", "C00002", "C00003", "C00004", "C00005", "C00006", "C00007", "C00008", "C00009", "C00010", "C00011", "C00013", "C00014", "C00015", "C00016", "C00017", "C00019", "C00020", "C00024", "C00027", "C00028", "C00030", "C00033", "C00035", "C00040", "C00044", "C00046", "C00055", "C00063", "C00075", "C00080", "C00086", "C00105", "C00106", "C00112", "C00113", "C00125", "C00126", "C00131", "C00138", "C00139", "C00144", "C00147", "C00161", "C00162", "C00177", "C00212", "C00178", "C00206", "C00214", "C00239", "C00240", "C00242", "C00286", "C00288", "C00299", "C00330", "C00360", "C00361", "C00362", "C00363", "C00364", "C00365", "C00380", "C00387", "C00458", "C00459", "C00460", "C00475", "C00526", "C00533", "C00559", "C00575", "C00705", "C00725", "C00821", "C00856", "C00881", "C00941", "C00942", "C00943", "C00968", "C01346", "C01352", "C01764", "C01977", "C02353", "C02354", "C02355", "C02507", "C03110", "C03391", "C03446", "C03395", "C04152", "C04153", "C04154", "C04156", "C04157", "C04158", "C04159", "C04160", "C04268", "C04545", "C04728", "C04779", "C05167", "C05777", "C05924", "C06194", "C11378", "C15670", "C15672", "C15817", "C11478", "C17023", "C17324", "C19637" })); /* Keep only the compounds that are not on the exclusion list */ List tmp = new ArrayList(); for (int i = 0; i < in1.size(); i++) { String id = in1.get(i).toString(); if (!excluded.contains(id)) { tmp.add(id); } } result = tmp; |
query_for_compounds | beanshell | Script: out1 = "cpd:" + in1; |
get_reactions_by_compound | rest | |
extract_reactions_from_compound_file_2 | beanshell | Script: import java.util.ArrayList; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; String Reaction_Matcher = null; List out1 = new ArrayList(); /* Grab the REACTION block: everything after the REACTION keyword up to the next word made up only of letters (normally the next section keyword) */ String p = "REACTION\\s+(.+?)\\b[A-Za-z]+?\\b"; Pattern pattern = Pattern.compile(p, Pattern.DOTALL); Matcher matcher = pattern.matcher(in1); while (matcher.find()) { Reaction_Matcher = matcher.group(1); } if (Reaction_Matcher != null) { /* Collect every reaction ID (R followed by five digits) in the block */ String p1 = "(R[0-9]{5})"; Pattern pattern1 = Pattern.compile(p1, Pattern.MULTILINE); Matcher matcher1 = pattern1.matcher(Reaction_Matcher); while (matcher1.find()) { out1.add(matcher1.group(1)); } } else { /* No REACTION block found: emit the placeholder 0 */ out1.add(0); } |
Flatten_List_3 | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
Flatten_List_4 | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
Remove_String_Duplicates_2 | localworker | Script: List strippedlist = new ArrayList(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); if (strippedlist.contains(item) == false) { strippedlist.add(item); } } |
query_for_reactions_2 | beanshell | Script: out1 = "rn:" + in1; |
get_enzymes_by_reaction | rest | |
extract_enzyme_from_reaction_file | beanshell | Script: import java.util.ArrayList; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; String ENZYME_Line = null; List out1 = new ArrayList(); /* First grab the line that contains the word "ENZYME" */ String p = "ENZYME\\s+([0-9.].+)"; Pattern pattern = Pattern.compile(p, Pattern.MULTILINE); Matcher matcher = pattern.matcher(in1); while (matcher.find()) { ENZYME_Line = matcher.group(1); } if (ENZYME_Line != null) { /* Now get the EC number from the "ENZYME" line; because of the $ anchor only the number that ends the line is captured */ String p1 = "([0-9.]+$)"; Pattern pattern1 = Pattern.compile(p1, Pattern.MULTILINE); Matcher matcher1 = pattern1.matcher(ENZYME_Line); while (matcher1.find()) { out1.add(matcher1.group(1)); } } else { /* No ENZYME line found: emit the placeholder 0 */ out1.add(0); } |
Flatten_List_5 | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
Concatenate_Files_2 | localworker | Script: BufferedReader getReader (String fileUrl) throws IOException { InputStreamReader reader; try { reader = new FileReader(fileUrl); } catch (FileNotFoundException e) { /* try a real URL instead */ URL url = new URL(fileUrl); reader = new InputStreamReader (url.openStream()); } return new BufferedReader(reader); } String NEWLINE = System.getProperty("line.separator"); boolean displayResults = false; if (displayresults != void) { displayResults = Boolean.valueOf(displayresults).booleanValue(); } StringBuffer sb = new StringBuffer(2000); if (outputfile == void) { throw new RuntimeException("The 'outputfile' parameter cannot be null"); } if (filelist == null) { throw new RuntimeException("The 'filelist' parameter cannot be null"); } String str = null; Writer writer = new FileWriter(outputfile); for (int i = 0; i < filelist.size(); i++) { BufferedReader reader = getReader(filelist.get(i)); while ((str = reader.readLine()) != null) { writer.write(str); writer.write(NEWLINE); if (displayResults) { sb.append(str); sb.append(NEWLINE); } } reader.close(); } writer.flush(); writer.close(); if (displayResults) { results= sb.toString(); } |
query_enzyme | rest | |
extract_gene_from_enzyme_file | beanshell | Script: import java.util.ArrayList; import java.util.List; import java.util.regex.Matcher; import java.util.regex.Pattern; String HSA_Line = null; List out1 = new ArrayList(); /* First grab the line that contains "HSA:" (the human genes) */ String p = "\\s+HSA:\\s+(.+)"; Pattern pattern = Pattern.compile(p, Pattern.MULTILINE); Matcher matcher = pattern.matcher(in1); while (matcher.find()) { HSA_Line = matcher.group(1); } /* Split the genes; each element then holds a gene ID followed by the gene name in brackets */ if (HSA_Line != null) { String[] line = HSA_Line.split(" "); /* Keep the gene ID and drop the bracketed gene name */ for (int i = 0; i < line.length; i++) { String[] line1 = line[i].split("\\("); out1.add(line1[0]); } } else { /* No HSA line found: emit an empty string */ out1.add(""); } |
Remove_String_Duplicates_3 | localworker | Script: List strippedlist = new ArrayList(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); if (strippedlist.contains(item) == false) { strippedlist.add(item); } } |
Flatten_List_6 | localworker | Script: flatten(inputs, outputs, depth) { for (i = inputs.iterator(); i.hasNext();) { element = i.next(); if (element instanceof Collection && depth > 0) { flatten(element, outputs, depth - 1); } else { outputs.add(element); } } } outputlist = new ArrayList(); flatten(inputlist, outputlist, 1); |
Remove_String_Duplicates_3_2 | localworker | Script: List strippedlist = new ArrayList(); for (Iterator i = stringlist.iterator(); i.hasNext();) { String item = (String) i.next(); if (strippedlist.contains(item) == false) { strippedlist.add(item); } } |
Create_and_populate_temporary_file_2 | beanshell | Script: File f = File.createTempFile("taverna", ".tmp"); BufferedWriter writer = new BufferedWriter(new FileWriter(f)); writer.write(content); writer.close(); filepath = f.getCanonicalPath(); |
query_for_enzyme | beanshell | Script: out1 = "ec:" + in1; |
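The extraction scripts in the table above all follow the same pattern: isolate one section of a KEGG flat-file entry with a regular expression, then pull the IDs out of it. The following is a minimal standalone Java sketch of that idea, reusing the regular expressions from the scripts. The class `KeggEntryParser`, its method names, and the sample entries are hypothetical and only illustrate the layout the scripts expect. Unlike the workflow scripts, the sketch uses the first match and returns an empty list rather than a placeholder when a section is missing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class KeggEntryParser {
    // Same expressions as in the workflow's beanshell scripts.
    private static final Pattern REACTION_BLOCK =
            Pattern.compile("REACTION\\s+(.+?)\\b[A-Za-z]+?\\b", Pattern.DOTALL);
    private static final Pattern REACTION_ID = Pattern.compile("R[0-9]{5}");
    private static final Pattern EQUATION_LINE = Pattern.compile("EQUATION\\s+(.+)");
    private static final Pattern COMPOUND_ID = Pattern.compile("C[0-9]{5}");
    private static final Pattern HSA_LINE = Pattern.compile("\\s+HSA:\\s+(.+)");

    /** Reaction IDs listed in the REACTION block of a compound entry. */
    static List<String> reactionIds(String compoundEntry) {
        Matcher block = REACTION_BLOCK.matcher(compoundEntry);
        if (block.find()) {
            return findAll(REACTION_ID, block.group(1));
        }
        return new ArrayList<>();
    }

    /** Compound IDs that appear on the EQUATION line of a reaction entry. */
    static List<String> compoundIds(String reactionEntry) {
        Matcher line = EQUATION_LINE.matcher(reactionEntry);
        if (line.find()) {
            return findAll(COMPOUND_ID, line.group(1));
        }
        return new ArrayList<>();
    }

    /** Gene IDs on the HSA line of an enzyme entry, without the bracketed gene names. */
    static List<String> humanGeneIds(String enzymeEntry) {
        List<String> ids = new ArrayList<>();
        Matcher line = HSA_LINE.matcher(enzymeEntry);
        if (line.find()) {
            for (String token : line.group(1).trim().split("\\s+")) {
                ids.add(token.split("\\(")[0]);
            }
        }
        return ids;
    }

    /** Every match of the pattern in the text, in order of appearance. */
    private static List<String> findAll(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up, abbreviated entries; not actual KEGG data.
        String compound = "ENTRY       C99999                      Compound\n"
                + "REACTION    R00001 R00002\n"
                + "            R00003\n"
                + "ENZYME      1.1.1.1\n"
                + "///\n";
        System.out.println(reactionIds(compound));
        System.out.println(compoundIds("EQUATION    C00001 + C99998 <=> C99997\n"));
        System.out.println(humanGeneIds("GENES       HSA: 124(ADH1A) 125(ADH1B)\n"));
    }
}
```

Returning an empty list for a missing section means nothing is passed downstream for that entry, whereas the workflow scripts emit a placeholder (`0` or an empty string) that the later query steps still receive.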
Name | Description | Inputs | Outputs |
---|---|---|---|
extract_reactions_from_compound_file | | in1 | out1 |
extract_compounds_from_reaction_file | | in1 | out1 |
query_for_reactions | | in1 | out1 |
FILTER_COMPOUNDS | Ubiquitous compounds such as H2O, ATP and ADP are removed from consideration for the next step to prevent non-specific or overly general connections. The beanshell lists the excluded compounds by their KEGG IDs; their names are provided in the supplementary section of the publication. | in1 | result |
query_for_compounds | | in1 | out1 |
extract_reactions_from_compound_file_2 | | in1 | out1 |
query_for_reactions_2 | | in1 | out1 |
extract_enzyme_from_reaction_file | | in1 | out1 |
extract_gene_from_enzyme_file | | in1 | out1 |
Create_and_populate_temporary_file_2 | | content | filepath |
query_for_enzyme | | in1 | out1 |
Source | Sink |
---|---|
Compound_ID | REST_Service:query |
REST_Service:responseBody | extract_reactions_from_compound_file:in1 |
extract_reactions_from_compound_file:out1 | Flatten_List:inputlist |
query_for_reactions:out1 | get_compounds_by_reaction:query |
get_compounds_by_reaction:responseBody | extract_compounds_from_reaction_file:in1 |
Flatten_List:outputlist | query_for_reactions:in1 |
extract_compounds_from_reaction_file:out1 | Flatten_List_2:inputlist |
Flatten_List_2:outputlist | Remove_String_Duplicates:stringlist |
Remove_String_Duplicates:strippedlist | FILTER_COMPOUNDS:in1 |
FILTER_COMPOUNDS:result | query_for_compounds:in1 |
query_for_compounds:out1 | get_reactions_by_compound:query |
get_reactions_by_compound:responseBody | extract_reactions_from_compound_file_2:in1 |
extract_reactions_from_compound_file_2:out1 | Flatten_List_3:inputlist |
Flatten_List_3:outputlist | Flatten_List_4:inputlist |
Flatten_List_4:outputlist | Remove_String_Duplicates_2:stringlist |
Remove_String_Duplicates_2:strippedlist | query_for_reactions_2:in1 |
query_for_reactions_2:out1 | get_enzymes_by_reaction:query |
get_enzymes_by_reaction:responseBody | extract_enzyme_from_reaction_file:in1 |
extract_enzyme_from_reaction_file:out1 | Flatten_List_5:inputlist |
Create_and_populate_temporary_file_2:filepath | Concatenate_Files_2:filelist |
path_to_output_file | Concatenate_Files_2:outputfile |
query_for_enzyme:out1 | query_enzyme:query |
query_enzyme:responseBody | extract_gene_from_enzyme_file:in1 |
Flatten_List_5:outputlist | Remove_String_Duplicates_3:stringlist |
extract_gene_from_enzyme_file:out1 | Flatten_List_6:inputlist |
Flatten_List_6:outputlist | Remove_String_Duplicates_3_2:stringlist |
Remove_String_Duplicates_3_2:strippedlist | Create_and_populate_temporary_file_2:content |
Remove_String_Duplicates_3:strippedlist | query_for_enzyme:in1 |
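Each row of the table above links an output port to an input port, so the rows together describe a directed graph of processors. One way to inspect the data flow, sketched here under the assumption that the graph has no cycles, is to drop the port names and sort the processors topologically with Kahn's algorithm. The class `DatalinkOrder` and its methods are hypothetical helpers for reading the table, not part of Taverna.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DatalinkOrder {

    /** Strips the ":port" suffix, leaving the processor (or workflow input) name. */
    static String processorOf(String endpoint) {
        int colon = endpoint.indexOf(':');
        return colon < 0 ? endpoint : endpoint.substring(0, colon);
    }

    /** Orders processors so that every source comes before its sinks (Kahn's algorithm). */
    static List<String> topologicalOrder(String[][] links) {
        Map<String, Set<String>> successors = new LinkedHashMap<>();
        Map<String, Integer> inDegree = new LinkedHashMap<>();
        for (String[] link : links) {
            String from = processorOf(link[0]);
            String to = processorOf(link[1]);
            successors.computeIfAbsent(from, k -> new LinkedHashSet<>());
            successors.computeIfAbsent(to, k -> new LinkedHashSet<>());
            inDegree.putIfAbsent(from, 0);
            inDegree.putIfAbsent(to, 0);
            // Count each processor-level edge once, even if several ports are linked.
            if (successors.get(from).add(to)) {
                inDegree.merge(to, 1, Integer::sum);
            }
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> entry : inDegree.entrySet()) {
            if (entry.getValue() == 0) {
                ready.add(entry.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String node = ready.remove();
            order.add(node);
            for (String next : successors.get(node)) {
                if (inDegree.merge(next, -1, Integer::sum) == 0) {
                    ready.add(next);
                }
            }
        }
        if (order.size() != inDegree.size()) {
            throw new IllegalStateException("datalinks contain a cycle");
        }
        return order;
    }

    public static void main(String[] args) {
        // The first few rows of the datalink table, as source/sink pairs.
        String[][] links = {
            {"Compound_ID", "REST_Service:query"},
            {"REST_Service:responseBody", "extract_reactions_from_compound_file:in1"},
            {"extract_reactions_from_compound_file:out1", "Flatten_List:inputlist"},
            {"query_for_reactions:out1", "get_compounds_by_reaction:query"},
            {"Flatten_List:outputlist", "query_for_reactions:in1"},
        };
        for (String processor : topologicalOrder(links)) {
            System.out.println(processor);
        }
    }
}
```

Feeding in all rows of the table in the same way gives one linear reading of the workflow, from the Compound_ID input through to Concatenate_Files_2.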
Version 2 (latest) (of 2)
In chronological order:
- Created by Harish Dharuri on Monday 20 August 2012 22:09:15 (UTC)
- Created by Harish Dharuri on Tuesday 27 August 2013 08:06:54 (UTC)