Transmembrane domain prediction using EMBOSS tmap with an input sequence alignment of homologues:
1. Sequence similarity search (SSS) to find homologues
2. Fetch sequences of hits
3. Multiple sequence alignment (MSA) of hit sequences
4. EMBOSS tmap with alignment from 3.
Uses the EBI web services:
1. WSFasta (see http://www.ebi.ac.uk/Tools/webservices/services/fasta)
2. WSDbfetch (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch)
3. WSClustalW2 (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2)
4. Soaplab EMBOSS tmap
Note: this workflow does not currently add the query sequence to the set of sequences passed to the multiple alignment. It is therefore most suitable for searches using entries which are present in the searched database (i.e. the query will be included via the self hit).
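The four steps listed above can be sketched as a simple chain. This is a minimal illustration, not the workflow's actual implementation: the function name predict_tm_regions and its four callable parameters are hypothetical stand-ins for the WSFasta, WSDbfetch, WSClustalW2 and tmap calls, which the caller would have to supply.

```python
from typing import Callable, List


def predict_tm_regions(
    query: str,
    search: Callable[[str], List[str]],
    fetch: Callable[[List[str]], str],
    align: Callable[[str], str],
    tmap: Callable[[str], str],
) -> str:
    """Chain the four workflow steps; every step is a caller-supplied stand-in."""
    hit_ids = search(query)        # 1. similarity search -> hit identifiers
    if not hit_ids:
        raise ValueError("similarity search returned no hits")
    sequences = fetch(hit_ids)     # 2. fetch the hit sequences (fasta text)
    alignment = align(sequences)   # 3. multiple sequence alignment of the hits
    return tmap(alignment)         # 4. tmap report for the alignment
```

Passing the steps in as callables keeps the sketch independent of any particular service client, so each step can be replaced or stubbed on its own.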
Find homologues for the input sequence via a sequence similarity search (SSS). In this case FASTA is used.
Run a FASTA or SSEARCH sequence similarity search using the EBI's WSFasta service (see http://www.ebi.ac.uk/Tools/webservices/services/fasta).
Unpack plain text FASTA report from byte[] into string.
org.embl.ebi.escience.scuflworkers.java.ByteArrayToString
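For readers working outside the workflow engine, the byte[] to string step amounts to a decode. The sketch below assumes the report is UTF-8 (or plain ASCII) text, which the source does not state; the function name is hypothetical.

```python
def byte_array_to_string(data: bytes, encoding: str = "utf-8") -> str:
    """Decode a raw service result into text.

    The encoding is an assumption; undecodable bytes are replaced
    rather than raising, so a report is always returned.
    """
    return data.decode(encoding, errors="replace")
```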
Input data structure; adds a type to the input sequence.
sequence
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
Unpack XML FASTA report from byte[] into string.
org.embl.ebi.escience.scuflworkers.java.ByteArrayToString
Wrap the input sequence in a list.
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
Parameters for the FASTA/SSEARCH job.
Protein
10
10
0.00001
0.0
1
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
Submit the FASTA/SSEARCH job.
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSFasta.wsdl
runFasta
Check for job completion.
org.embl.ebi.escience.scuflworkers.java.FailIfFalse
if (job_status.equals("DONE")) {
    is_done = "true";
} else {
    is_done = "false";
}
job_status
is_done
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSFasta.wsdl
checkStatus
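The completion check above maps the job status to an is_done flag and fails while the job is unfinished. Outside a workflow engine, the same wait could be written as an explicit polling loop. This is a sketch under assumptions: check_status is a hypothetical callable wrapping the service's checkStatus operation, any status other than "DONE" is treated as not finished, and the interval and attempt limit are illustrative values.

```python
import time
from typing import Callable


def wait_for_job(
    check_status: Callable[[str], str],
    job_id: str,
    interval: float = 5.0,
    max_checks: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll until the job reports "DONE", or give up after max_checks checks."""
    for attempt in range(max_checks):
        if check_status(job_id) == "DONE":
            return
        if attempt < max_checks - 1:
            sleep(interval)  # wait before asking again
    raise TimeoutError(f"job {job_id} not done after {max_checks} checks")
```

The sleep function is a parameter only so the loop can be exercised without real delays.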
Get the FASTA report as text.
tooloutput
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSFasta.wsdl
poll
Get the FASTA report as XML.
toolxml
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSFasta.wsdl
poll
Get the hit identifiers.
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSFasta.wsdl
getIds
Query sequence (fasta format recommended) or sequence identifier in database:identifier format (e.g. uniprot:wap_rat).
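Since the query may be either sequence data or a database:identifier reference, a caller may want to tell the two apart before submitting. The heuristic below is only an illustration: the pattern and the function name classify_query are assumptions, not rules taken from the service.

```python
import re

# Assumed shape of a database:identifier reference, e.g. uniprot:wap_rat.
_ID_PATTERN = re.compile(r"^[A-Za-z0-9_\-]+:[A-Za-z0-9_.\-]+$")


def classify_query(query: str) -> str:
    """Return "fasta", "identifier" or "raw_sequence" for a query string."""
    text = query.strip()
    if not text:
        raise ValueError("empty query")
    if text.startswith(">"):
        return "fasta"  # a fasta header line leads the record
    if "\n" not in text and _ID_PATTERN.match(text):
        return "identifier"
    return "raw_sequence"
```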
The database to search (e.g. uniprot).
Your e-mail address.
The FASTA program to run (e.g. fasta3, fastf3, fasts3, fastx3, fasty3, tfastx, tfasty).
FASTA program output as plain text. Note the exact format of the output depends on the chosen FASTA program.
FASTA output in an XML format.
List of the identifiers of the hits found.
The identifier of the job at EBI.
Completed
Poll_FASTA_Job
getIds
Scheduled
Running
Completed
Poll_FASTA_Job
Get_Text_Result
Scheduled
Running
Completed
Poll_FASTA_Job
Get_XML_Result
Scheduled
Running
Get the sequences for the hits from the sequence similarity search (SSS) to be used for the multiple sequence alignment (MSA).
From a list of sequence entry identifiers and a database name, fetch the sequences in fasta format using EBI's WSDbfetch service (see http://www.ebi.ac.uk/Tools/webservices/wsdl/WSDbfetch.wsdl).
Reformat the list of identifiers into a comma-delimited string for use with fetchBatch.
,
org.embl.ebi.escience.scuflworkers.java.StringListMerge
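The list merge with a "," separator corresponds to a plain string join. A minimal sketch follows; dropping blank entries and trimming whitespace are additions made here for robustness, not behaviour described in the source, and the function name is hypothetical.

```python
from typing import Iterable


def merge_ids(ids: Iterable[str], separator: str = ",") -> str:
    """Join identifiers into one delimited string, skipping blank entries."""
    cleaned = [item.strip() for item in ids if item and item.strip()]
    return separator.join(cleaned)
```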
Get a set of database entries (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch#fetchbatch_db_ids_format_style)
fasta
raw
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSDbfetch.wsdl
fetchBatch
List of entry identifiers from a specific database.
Name of the database to which the identifiers belong. For example "uniprot".
Set of sequences in fasta format.
Database to search in the sequence similarity search (SSS) step.
uniprot
Program to use for the sequence similarity search (SSS).
fasta3
Perform a multiple sequence alignment (MSA) using the sequences found by the sequence similarity search (SSS).
Perform a ClustalW multiple sequence alignment using the EBI’s WSClustalW2 service (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2). The input is the set of sequences to align; the other parameters for the alignment (see Job_params) are left at their defaults.
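Because the workflow does not add the query sequence to the set passed to the alignment, a pre-processing step could prepend it when it is absent. The sketch below is one possible approach, not part of the workflow: both function names are hypothetical, and matching on the first token of the fasta header is an assumption that may not hold when the query and the fetched entries use different identifier styles.

```python
from typing import List


def fasta_ids(fasta_text: str) -> List[str]:
    """Return the first whitespace-delimited token of each fasta header."""
    ids = []
    for line in fasta_text.splitlines():
        tokens = line[1:].split() if line.startswith(">") else []
        if tokens:
            ids.append(tokens[0])
    return ids


def add_query_if_missing(query_fasta: str, hits_fasta: str) -> str:
    """Prepend the query record unless its identifier is already among the hits."""
    query_ids = fasta_ids(query_fasta)
    if not query_ids:
        raise ValueError("query must be a fasta record with a header line")
    if query_ids[0] in fasta_ids(hits_fasta):
        return hits_fasta  # self hit already present
    return query_fasta.rstrip("\n") + "\n" + hits_fasta
```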
sequence
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
org.embl.ebi.escience.scuflworkers.java.ByteArrayToString
1
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
org.embl.ebi.escience.scuflworkers.java.ByteArrayToString
org.embl.ebi.escience.scuflworkers.java.ByteArrayToString
Get the results of a job (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2#poll_jobid_type)
tooloutput
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSClustalW2.wsdl
poll
Submit a ClustalW analysis job (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2#runclustalw2_params_content)
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSClustalW2.wsdl
runClustalW2
Get the results of a job (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2#poll_jobid_type)
tooldnd
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSClustalW2.wsdl
poll
Check the job status, and wait if the job has not finished.
Check the status of the job.
Fail if the job has not finished.
org.embl.ebi.escience.scuflworkers.java.FailIfFalse
Map the job status to a true/false is_done flag.
if (job_status.equals("DONE")) {
    is_done = "true";
} else {
    is_done = "false";
}
job_status
is_done
Get the status of a submitted job (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2#checkstatus_jobid)
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSClustalW2.wsdl
checkStatus
EBI job identifier for the job to check.
Status of the job.
Get the results of a job (see http://www.ebi.ac.uk/Tools/webservices/services/clustalw2#poll_jobid_type)
toolaln
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSClustalW2.wsdl
poll
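The three result retrievals (tooloutput, toolaln and tooldnd) differ only in the result type passed to poll. A compact sketch of collecting all three is shown below; poll here is a hypothetical callable taking a job identifier and a result type and returning raw bytes, and the UTF-8 decoding is an assumption.

```python
from typing import Callable, Dict, Iterable

# Result types named in the workflow: tool output, alignment, guide tree.
RESULT_TYPES = ("tooloutput", "toolaln", "tooldnd")


def collect_results(
    poll: Callable[[str, str], bytes],
    job_id: str,
    result_types: Iterable[str] = RESULT_TYPES,
    encoding: str = "utf-8",
) -> Dict[str, str]:
    """Fetch each result type for a finished job and decode it to text."""
    return {
        result_type: poll(job_id, result_type).decode(encoding, errors="replace")
        for result_type in result_types
    }
```

This should only be called once the status check reports the job as done.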
Sequences to align (fasta format recommended).
User e-mail address.
The alignment in ClustalW format.
Guide tree used to produce the final alignment.
text/xml
EBI job identifier
Completed
EBI_ClustalW2_poll_job
Get_alignment_result
Scheduled
Running
Completed
EBI_ClustalW2_poll_job
Get_guide_tree_result
Scheduled
Running
Completed
EBI_ClustalW2_poll_job
Get_output_result
Scheduled
Running
From the multiple sequence alignment (MSA) predict the transmembrane regions.
png
http://www.ebi.ac.uk/soaplab/emboss4/services/protein_2d_structure.tmap
User e-mail address.
Sequence to analyse for transmembrane regions. Either the actual sequence (fasta format recommended) or an entry identifier in database:identifier format (e.g. uniprot:LPHN2_RAT).
EBI job identifier for the sequence similarity search.
EBI job identifier for the multiple sequence alignment (MSA).
Report from tmap describing the predicted transmembrane regions.
Identifiers of the entries found by the sequence similarity search (SSS).
The multiple sequence alignment (MSA) produced for input to tmap.
image/png
Plot of the score used by tmap and the predicted regions.