Blast_Align_and_Tree
Created: 2013-01-28 08:50:21
Last updated: 2013-01-30 12:18:08
This workflow accepts a protein sequence as input. The sequence is compared to others in the UniProt database using the NCBI BLAST Web Service from the EBI (WSDL), and the top 10 hits are returned (nested workflow: EBI_NCBI_BLAST).
For each extracted hit, the UniProt REST service returns the protein sequence in FASTA format. The workflow concatenates the 10 protein sequences and submits them as input to the EBI ClustalW service (nested workflow: EMBL_EBI_ClustalW2_SOAP), which aligns them and returns the alignment.
Finally, the alignment is submitted to the EBI Clustalw_phylogeny service (nested workflow: clustalw_phylogeny), and a phylogenetic tree in PHYLIP format is returned.
The workflow returns a list of protein sequences in FASTA format, a ClustalW alignment, and a phylogenetic tree.
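The stages described above can be outlined in plain Java. This is only a sketch of the data flow, not the Taverna workflow itself: the class name, the `Result` holder and the four function parameters are hypothetical, and each service call (BLAST, UniProt, ClustalW, phylogeny) is injected as a function so that nothing here depends on a real EBI client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical outline of the workflow stages; every service call is injected. */
final class BlastAlignTreeSketch {

    /** Holds the three workflow outputs listed on this page. */
    static final class Result {
        final String fastaSeqs;
        final String clustalwAlignment;
        final String treePhylip;

        Result(String fastaSeqs, String clustalwAlignment, String treePhylip) {
            this.fastaSeqs = fastaSeqs;
            this.clustalwAlignment = clustalwAlignment;
            this.treePhylip = treePhylip;
        }
    }

    static Result run(String sequence,
                      Function<String, List<String>> blastTopHits, // sequence -> hit IDs
                      Function<String, String> fetchFasta,         // hit ID -> FASTA record
                      Function<String, String> align,              // FASTA -> alignment
                      Function<String, String> buildTree) {        // alignment -> tree
        // Fetch one FASTA record per BLAST hit.
        List<String> records = new ArrayList<>();
        for (String id : blastTopHits.apply(sequence)) {
            records.add(fetchFasta.apply(id));
        }
        // Concatenate the records, align them, then build the tree from the alignment.
        String fasta = String.join("\n", records);
        String alignment = align.apply(fasta);
        return new Result(fasta, alignment, buildTree.apply(alignment));
    }
}
```

Passing the services in as functions keeps the sketch runnable with stubs; a real client would replace each lambda with a call to the corresponding service.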
Workflow Components
Authors (1)
Titles (1)
Descriptions (1)
Dependencies (0)
Inputs (2)
Name |
Description |
emailAddress |
Requires a valid email address in order to execute services hosted at the EBI.
The EBI asks for an email address so that they can contact you about:
Problems with the service which affect your jobs.
Scheduled maintenance which affects services you are using.
Deprecation and retirement of a service you are using.
If you use a fake email address, the workflow may be cancelled before execution.
|
Sequence |
Accepts a protein sequence in FASTA format.
|
Processors (10)
Name |
Type |
Description |
Filter_IDs |
localworker |
Script:
import java.util.regex.*;
// Keep the requested regex group from every list item that matches the pattern.
filteredlist = new ArrayList();
Pattern thePat = Pattern.compile(regex);
int theGroup = Integer.parseInt(group);
for (Iterator i = stringlist.iterator(); i.hasNext();) {
  String item = (String) i.next();
  Matcher matcher = thePat.matcher(item);
  if (matcher.find()) {
    filteredlist.add(matcher.group(theGroup));
  }
}
|
matchGroup |
stringconstant |
Value: 0 |
MatchPattern |
stringconstant |
Value: [A-Z_0-9]{4,}_*. |
Split_IDs |
localworker |
Script:
// Split the input string on the regex port value, defaulting to a comma.
List split = new ArrayList();
if (!string.equals("")) {
  String regexString = ",";
  if (regex != void) {
    regexString = regex;
  }
  String[] result = string.split(regexString);
  for (int i = 0; i < result.length; i++) {
    split.add(result[i]);
  }
}
|
regex |
stringconstant |
Value: \n |
REST_UniProt |
rest |
|
Merge_FastaSeqs |
localworker |
Script:
// Join the list items with the seperator port value, defaulting to a newline.
String seperatorString = "\n";
if (seperator != void) {
  seperatorString = seperator;
}
StringBuffer sb = new StringBuffer();
for (Iterator i = stringlist.iterator(); i.hasNext();) {
  String item = (String) i.next();
  sb.append(item);
  if (i.hasNext()) {
    sb.append(seperatorString);
  }
}
concatenated = sb.toString();
|
EMBL_EBI_ClustalW2_SOAP |
workflow |
Perform a ClustalW2 alignment of protein sequences using the EMBL-EBI’s ClustalW2 (SOAP) service (see http://www.ebi.ac.uk/Tools/webservices/services/msa/clustalw2_soap).
This workflow uses the new EBI services, which are asynchronous and require looping over the nested workflow (Status) until the workflow has finished. Many of the EBI services now work in this way, so you can use this workflow as an example of the invocation pattern and looping configuration. |
EBI_NCBI_BLAST |
workflow |
This workflow performs an NCBI BLAST search at the EBI. It accepts a protein sequence as input. Default values have been set for the search database (UniProt), the number of hits to return (10), and all scoring and matrix options. These can be changed in the workflow by altering the string constant values if required.
This workflow uses the new EBI services. They are asynchronous and so require looping over the nested workflow (Status) until the workflow has finished. Many of the EBI services now work in this way, so you can use this workflow as an example of the invocation pattern and looping configuration. |
clustalw_phylogeny |
workflow |
|
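The asynchronous invocation pattern mentioned for the nested EBI workflows (submit a job, loop on its status, then fetch the result) can be sketched as follows. The `JobService` interface, its method names and the status strings `RUNNING` and `FINISHED` are assumptions made for illustration; they stand in for the real service client and are not taken from the EBI API.

```java
/** Hypothetical sketch of the submit / poll-status / fetch-result pattern. */
final class AsyncJobSketch {

    /** Stand-in for an asynchronous job service; not the real EBI client API. */
    interface JobService {
        String run(String email, String input); // submits a job, returns its ID
        String getStatus(String jobId);         // assumed values: "RUNNING", "FINISHED"
        String getResult(String jobId);         // only valid once the job has finished
    }

    static String runAndWait(JobService service, String email, String input,
                             long pollMillis, int maxPolls) {
        String jobId = service.run(email, input);
        for (int i = 0; i < maxPolls; i++) {
            String status = service.getStatus(jobId);
            if ("FINISHED".equals(status)) {
                return service.getResult(jobId);
            }
            if (!"RUNNING".equals(status)) {
                // Any other status is treated as a failure in this sketch.
                throw new IllegalStateException("Job " + jobId + " ended with status " + status);
            }
            try {
                Thread.sleep(pollMillis); // wait before asking again
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for job " + jobId, e);
            }
        }
        throw new IllegalStateException("Job " + jobId + " did not finish after " + maxPolls + " polls");
    }
}
```

Bounding the loop with `maxPolls` is a design choice of this sketch so that a stuck job cannot block the caller forever; the Taverna looping configuration may behave differently.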
Outputs (3)
Name |
Description |
fastaSeqs |
|
ClustalW_alignment |
|
treePhylip |
|
Datalinks (16)
Source |
Sink |
matchGroup:value |
Filter_IDs:group |
MatchPattern:value |
Filter_IDs:regex |
Split_IDs:split |
Filter_IDs:stringlist |
regex:value |
Split_IDs:regex |
EBI_NCBI_BLAST:getResult_2_output_output |
Split_IDs:string |
Filter_IDs:filteredlist |
REST_UniProt:id |
REST_UniProt:responseBody |
Merge_FastaSeqs:stringlist |
Merge_FastaSeqs:concatenated |
EMBL_EBI_ClustalW2_SOAP:Sequences |
emailAddress |
EMBL_EBI_ClustalW2_SOAP:Email_address |
emailAddress |
EBI_NCBI_BLAST:email |
Sequence |
EBI_NCBI_BLAST:sequence |
emailAddress |
clustalw_phylogeny:email |
EMBL_EBI_ClustalW2_SOAP:ClustalW_alignment |
clustalw_phylogeny:alignment |
Merge_FastaSeqs:concatenated |
fastaSeqs |
EMBL_EBI_ClustalW2_SOAP:ClustalW_alignment |
ClustalW_alignment |
clustalw_phylogeny:getResult_output_output |
treePhylip |
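The links from EBI_NCBI_BLAST through Split_IDs and Filter_IDs, and from REST_UniProt into Merge_FastaSeqs, amount to a split, filter and join over strings. A plain Java rendering of those three local workers might look like the sketch below. The class and method names are hypothetical, and the `SP:CYC_HUMAN` style of the sample input in the usage note is an assumption about what the BLAST ID list looks like.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical plain-Java version of the Split_IDs, Filter_IDs and Merge_FastaSeqs workers. */
final class IdPipelineSketch {

    /** Split_IDs: break the text into parts on the given regex; empty input gives an empty list. */
    static List<String> splitIds(String text, String regex) {
        List<String> out = new ArrayList<>();
        if (!text.isEmpty()) {
            for (String part : text.split(regex)) {
                out.add(part);
            }
        }
        return out;
    }

    /** Filter_IDs: keep the chosen regex group from each item in which the pattern is found. */
    static List<String> filterIds(List<String> items, String regex, int group) {
        Pattern pattern = Pattern.compile(regex);
        List<String> out = new ArrayList<>();
        for (String item : items) {
            Matcher matcher = pattern.matcher(item);
            if (matcher.find()) {
                out.add(matcher.group(group));
            }
        }
        return out;
    }

    /** Merge_FastaSeqs: join the items with the separator between them. */
    static String merge(List<String> items, String separator) {
        return String.join(separator, items);
    }
}
```

With the MatchPattern and matchGroup values from the processor table, `filterIds(splitIds("SP:CYC_HUMAN\nSP:CYC_MOUSE", "\n"), "[A-Z_0-9]{4,}_*.", 0)` yields the two entry names without the `SP:` prefix, because the prefix is too short to satisfy the `{4,}` quantifier.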
Version 1 (earliest) of 2