This workflow uses one or more services that are deprecated as of 31 December 2012 and may no longer function.
Affected service WSDL:
- http://soap.genome.jp/KEGG.wsdl
Details:
KEGG will be moving from a WSDL/SOAP interface to REST.
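As a rough illustration of what a REST replacement for the deprecated `btit` SOAP call might look like, the sketch below builds a KEGG REST `list` request for one gene ID and splits a tab-delimited response line into ID and description. The base URL, the class name `KeggRestSketch`, and its helper methods are assumptions made for this example, not part of the workflow; the exact columns KEGG returns should be checked against the current KEGG documentation.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class KeggRestSketch {
    // Assumed base URL of the KEGG REST service.
    static final String BASE = "https://rest.kegg.jp";

    /** Builds the "list" URI for one KEGG gene ID such as "hsa:1234". */
    static URI listUri(String keggId) {
        return URI.create(BASE + "/list/" + keggId);
    }

    /** Splits one tab-delimited line into {id, everything after the first tab}. */
    static String[] parseListLine(String line) {
        int tab = line.indexOf('\t');
        if (tab < 0) {
            return new String[] { line.trim(), "" };
        }
        return new String[] { line.substring(0, tab).trim(), line.substring(tab + 1).trim() };
    }

    /** Fetches the raw response body; requires network access (Java 11+). */
    static String fetch(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    public static void main(String[] args) throws Exception {
        String body = fetch(listUri("hsa:1234"));
        for (String line : body.split("\n")) {
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = parseListLine(line);
            System.out.println(fields[0] + " -> " + fields[1]);
        }
    }
}
```

Only `main` touches the network; `listUri` and `parseListLine` are pure helpers and can be checked offline.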
Create_SNP_Set
Created: 2012-08-21 01:50:23
This workflow finds the SNPs in the vicinity of each gene in a given gene set and builds a SNP set from them. The user chooses the flanking width around each gene within which SNPs are collected. The input is a list of Entrez gene IDs. BioMart services are used to look up each gene's chromosome and position and to retrieve the Affy gene chip 6k IDs of the SNPs in the region. The final report is stored as a tab-delimited text file that pairs each SNP's Affy 6 gene chip ID with the KEGG information for the gene it is associated with.
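The flanking-width idea can be sketched in plain Java as follows. This is a minimal illustration, not the workflow's own script: the class name `FlankingWindow` and the method `flank` are hypothetical, and two choices here are assumptions of the sketch (coordinates are normalised so that start is not greater than end, and the lower bound is clamped at 1).

```java
class FlankingWindow {
    /**
     * Returns {start, end} of the gene region widened by `width` bases on
     * both sides. Coordinates are treated as 1-based positions.
     */
    static int[] flank(int start, int end, int width) {
        // Normalise so that lo <= hi regardless of the order given.
        int lo = Math.min(start, end);
        int hi = Math.max(start, end);
        // Clamp at 1 so the window never starts before the chromosome
        // (an addition in this sketch).
        return new int[] { Math.max(1, lo - width), hi + width };
    }

    public static void main(String[] args) {
        int[] window = flank(1000, 2000, 500);
        System.out.println(window[0] + "-" + window[1]); // prints 500-2500
    }
}
```

A width of 0 returns the gene's own coordinates, so the same query can be used with or without flanking.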
Workflow Components
Dependencies (0)
Inputs (3)
Name | Description
Entrez_ID | A list of genes given as Entrez gene IDs.
set_width | This allows the user to set the flanking width around each gene used for determining SNPs.
path_to_output_file | The local directory where the user wants to store the output of the workflow.
Processors (12)
Name | Type | Description
Split_string_into_string_list_by_regular_expression | localworker | This service splits the string value into a string list at every occurrence of the specified regular expression. The regular expression provided in this case is a newline: "\n".
Script:
    List split = new ArrayList();
    if (!string.equals("")) {
        // Default separator, used only when no regex input is connected.
        String regexString = ",";
        if (regex != void) {
            regexString = regex;
        }
        String[] result = string.split(regexString);
        for (int i = 0; i < result.length; i++) {
            split.add(result[i]);
        }
    }
regex_value | stringconstant | Value: \n
hsapiens_gene_ensembl | biomart | This is a BioMart service that takes the Entrez gene ID as input and returns the chromosome name and the start and end positions of the gene.
Set_width | beanshell | The flanking region around the gene is calculated in this beanshell. The flanking width is provided by the user.
Script:
    import java.util.*;
    List tmp_end = new ArrayList();
    List tmp_start = new ArrayList();
    // Flanking width supplied by the user.
    width = Integer.parseInt(in1.get(0));
    int value_end = 0;
    int value_start = 0;
    int out_end = 0;
    int out_start = 0;
    for (int i = 0; i < end_in.size(); i++) {
        value_end = Integer.parseInt(end_in.get(i));
        value_start = Integer.parseInt(start_in.get(i));
        if (value_start < value_end) {
            // Widen the region outwards on both sides.
            out_end = value_end + width;
            out_start = value_start - width;
        }
        else {
            // Start lies after end: widen in the opposite direction.
            out_start = value_start + width;
            out_end = value_end - width;
        }
        tmp_end.add(out_end);
        tmp_start.add(out_start);
    }
    end_out = tmp_end;
    start_out = tmp_start;
    chr_out = chr_in;
Convert_to_Kegg_id | beanshell | The input to btit (which provides gene information from the KEGG database) is a KEGG gene ID. To convert Entrez gene IDs to KEGG IDs, "hsa:" is prepended to the Entrez gene ID.
btit | wsdl | The input for this service is a KEGG gene ID in the form of (for example) hsa:1234. It returns the gene name and definitions of the given entry ID.
WSDL: http://soap.genome.jp/KEGG.wsdl
WSDL operation: btit
hsapiens_snp | biomart | This is a BioMart service that takes the chromosome name and the start and end positions and finds all the Affy genechip 6k SNP IDs present in the region. The Affy6 IDs can easily be changed to any other desired gene chip ID, such as Illumina chip IDs or another version of the Affymetrix gene chip.
Flatten_List_2 | localworker | This service flattens the inputlist by one level. It returns the result of the flattening.
Script:
    flatten(inputs, outputs, depth) {
        for (i = inputs.iterator(); i.hasNext();) {
            element = i.next();
            if (element instanceof Collection && depth > 0) {
                flatten(element, outputs, depth - 1);
            } else {
                outputs.add(element);
            }
        }
    }
    outputlist = new ArrayList();
    flatten(inputlist, outputlist, 1);
Create_Final_Report | beanshell | This aligns each SNP ID with the associated gene ID and gene information in a tab-delimited format.
Script:
    import java.util.List;
    import java.util.ArrayList;
    List tmp = new ArrayList();
    for (int i = 0; i < in2.size(); i++) {
        // Skip empty SNP IDs; pair every other one with the gene information.
        if (!in2.get(i).toString().equals("")) {
            tmp.add(in2.get(i).toString() + "\t" + in1);
        }
    }
    result = tmp;
Flatten_List | localworker | This service flattens the inputlist by one level. It returns the result of the flattening.
Script:
    flatten(inputs, outputs, depth) {
        for (i = inputs.iterator(); i.hasNext();) {
            element = i.next();
            if (element instanceof Collection && depth > 0) {
                flatten(element, outputs, depth - 1);
            } else {
                outputs.add(element);
            }
        }
    }
    outputlist = new ArrayList();
    flatten(inputlist, outputlist, 1);
Create_and_populate_temporary_file_2 | beanshell | This service creates a temporary file in a local tmp directory.
Script:
    File f = File.createTempFile("taverna", ".tmp");
    BufferedWriter writer = new BufferedWriter(new FileWriter(f));
    writer.write(content);
    writer.close();
    filepath = f.getCanonicalPath();
Concatenate_Files_2 | localworker | This service reads the files whose paths or URLs are specified in the filelist and concatenates their content.
Script:
    BufferedReader getReader (String fileUrl) throws IOException {
        InputStreamReader reader;
        try {
            reader = new FileReader(fileUrl);
        }
        catch (FileNotFoundException e) {
            // try a real URL instead
            URL url = new URL(fileUrl);
            reader = new InputStreamReader (url.openStream());
        }
        return new BufferedReader(reader);
    }
    String NEWLINE = System.getProperty("line.separator");
    boolean displayResults = false;
    if (displayresults != void) {
        displayResults = Boolean.valueOf(displayresults).booleanValue();
    }
    StringBuffer sb = new StringBuffer(2000);
    if (outputfile == void) {
        throw new RuntimeException("The 'outputfile' parameter cannot be null");
    }
    if (filelist == null) {
        throw new RuntimeException("The 'filelist' parameter cannot be null");
    }
    String str = null;
    Writer writer = new FileWriter(outputfile);
    // Copy every input file, line by line, into the output file.
    for (int i = 0; i < filelist.size(); i++) {
        BufferedReader reader = getReader(filelist.get(i));
        while ((str = reader.readLine()) != null) {
            writer.write(str);
            writer.write(NEWLINE);
            if (displayResults) {
                sb.append(str);
                sb.append(NEWLINE);
            }
        }
        reader.close();
    }
    writer.flush();
    writer.close();
    if (displayResults) {
        results = sb.toString();
    }
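The report-building step described for Create_Final_Report can also be written as typed Java. The following is a minimal sketch under stated assumptions: the class name `FinalReportSketch` and the method `buildReport` are hypothetical, the gene information is treated as a single string, and the SNP IDs in `main` are made-up placeholders rather than real Affy IDs.

```java
import java.util.ArrayList;
import java.util.List;

class FinalReportSketch {
    /**
     * Pairs every non-empty SNP ID with the gene information, producing
     * one tab-delimited row per SNP.
     */
    static List<String> buildReport(List<String> snpIds, String geneInfo) {
        List<String> rows = new ArrayList<>();
        for (String snpId : snpIds) {
            // Regions without SNPs yield empty entries; skip them.
            if (snpId == null || snpId.isEmpty()) {
                continue;
            }
            rows.add(snpId + "\t" + geneInfo);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        List<String> rows = buildReport(List.of("SNP_1", "", "SNP_2"), "hsa:1234 example gene");
        for (String row : rows) {
            System.out.println(row);
        }
    }
}
```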
Beanshells (4)
Name | Description | Inputs | Outputs
Set_width | The flanking region around the gene is calculated in this beanshell. The flanking width is provided by the user. | chr_in, end_in, start_in, in1 | chr_out, end_out, start_out
Convert_to_Kegg_id | The input to btit (which provides gene information from the KEGG database) is a KEGG gene ID. To convert Entrez gene IDs to KEGG IDs, "hsa:" is prepended to the Entrez gene ID. | in1 | out1
Create_Final_Report | This aligns each SNP ID with the associated gene ID and gene information in a tab-delimited format. | in1, in2 | result
Create_and_populate_temporary_file_2 | This service creates a temporary file in a local tmp directory. | content | filepath
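The page does not show the script for Convert_to_Kegg_id. A minimal sketch of the conversion it describes, combined with the newline split performed earlier in the workflow, might look like this; the class name `KeggIdSketch` and the method `toKeggIds` are hypothetical, and trimming whitespace and skipping blank lines are assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

class KeggIdSketch {
    /**
     * Splits a newline-separated list of Entrez gene IDs and prefixes each
     * one with "hsa:" to form a KEGG gene ID for human.
     */
    static List<String> toKeggIds(String entrezIds) {
        List<String> keggIds = new ArrayList<>();
        for (String id : entrezIds.split("\n")) {
            String trimmed = id.trim();
            // Ignore blank lines, for example a trailing newline.
            if (!trimmed.isEmpty()) {
                keggIds.add("hsa:" + trimmed);
            }
        }
        return keggIds;
    }

    public static void main(String[] args) {
        System.out.println(toKeggIds("1234\n5678\n")); // prints [hsa:1234, hsa:5678]
    }
}
```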
Datalinks (19)
Source | Sink
Entrez_ID | Split_string_into_string_list_by_regular_expression:string
regex_value:value | Split_string_into_string_list_by_regular_expression:regex
Split_string_into_string_list_by_regular_expression:split | hsapiens_gene_ensembl:hsapiens_gene_ensembl.entrezgene_filter
hsapiens_gene_ensembl:hsapiens_gene_ensembl.chromosome_name | Set_width:chr_in
hsapiens_gene_ensembl:hsapiens_gene_ensembl.end_position | Set_width:end_in
hsapiens_gene_ensembl:hsapiens_gene_ensembl.start_position | Set_width:start_in
set_width | Set_width:in1
Split_string_into_string_list_by_regular_expression:split | Convert_to_Kegg_id:in1
Convert_to_Kegg_id:out1 | btit:string
Set_width:chr_out | hsapiens_snp:hsapiens_snp.chr_name_filter
Set_width:end_out | hsapiens_snp:hsapiens_snp.chrom_end_filter
Set_width:start_out | hsapiens_snp:hsapiens_snp.chrom_start_filter
btit:return | Flatten_List_2:inputlist
Flatten_List_2:outputlist | Create_Final_Report:in1
hsapiens_snp:hsapiens_snp.affy6 | Create_Final_Report:in2
Create_Final_Report:result | Flatten_List:inputlist
Flatten_List:outputlist | Create_and_populate_temporary_file_2:content
Create_and_populate_temporary_file_2:filepath | Concatenate_Files_2:filelist
path_to_output_file | Concatenate_Files_2:outputfile
Version 1 (of 1)