Get a sequence in fasta format given one of:
1. An NCBI GI number (e.g. 75251068).
2. An entry identifier in database:identifier format (e.g. uniprot:Q96247).
3. A sequence entry in a format supported by EMBOSS seqret.
Get a fasta-formatted sequence for an entry identifier or a sequence entry.
Given a sequence or a sequence entry identifier (e.g. uniprot:wap_rat), return the sequence in fasta format.
If the input is a sequence identifier in database:identifier format, EBI's WSDbfetch web service (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch) is used to retrieve the sequence in fasta format. Otherwise the input is assumed to be a sequence and is passed through the Soaplab EMBOSS seqret service to force it into fasta format.
Return true if the input is a sequence or false if the input is a sequence identifier (e.g. uniprot:wap_rat).
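// Length of the first line (the whole input if there is no line break).
lineLen = sequence.indexOf("\n");
if(lineLen < 1) {
  lineLen = sequence.length();
}
// An identifier has no fasta ">" header and a ":" within its first line.
if(!sequence.startsWith(">") &&
   sequence.indexOf(":") > 0 &&
   sequence.indexOf(":") < lineLen) {
  is_sequence = "false";
} else {
  is_sequence = "true";
}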
sequence
is_sequence
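The test performed by the script above can also be sketched as a plain Java method, which may be easier to try outside the workflow. The class name InputClassifier and the method name isSequence are hypothetical, and the sample inputs are made up for illustration; the rule itself follows the script: an input counts as an identifier only when it has no ">" header and a ":" inside its first line.

```java
// Sketch only: InputClassifier and isSequence are hypothetical names,
// not part of the workflow itself.
class InputClassifier {

    /**
     * Returns true when the input looks like sequence data, false when it
     * looks like a database:identifier pair such as "uniprot:wap_rat".
     */
    static boolean isSequence(String input) {
        // Only the first line is inspected; without a line break the
        // whole input is treated as the first line.
        int lineLen = input.indexOf('\n');
        if (lineLen < 1) {
            lineLen = input.length();
        }
        int colon = input.indexOf(':');
        // Identifier: no fasta ">" header, and a colon on the first line
        // that is not the very first character.
        boolean isIdentifier = !input.startsWith(">") && colon > 0 && colon < lineLen;
        return !isIdentifier;
    }

    public static void main(String[] args) {
        System.out.println(isSequence("uniprot:wap_rat"));        // identifier
        System.out.println(isSequence(">example_entry\nACGTACGT")); // fasta entry (made up)
        System.out.println(isSequence("ACGTACGT"));               // bare residues
    }
}
```

Note that a bare accession without a database prefix contains no ":" and is therefore classified as a sequence by this rule.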
Fetch the sequence in fasta format from the identifier using EBI's WSDbfetch service (see http://www.ebi.ac.uk/Tools/webservices/services/dbfetch).
fasta
raw
http://www.ebi.ac.uk/Tools/webservices/wsdl/WSDbfetch.wsdl
fetchData
Fails if the workflow input is a sequence (i.e. is not an identifier).
org.embl.ebi.escience.scuflworkers.java.FailIfTrue
Fails if the workflow input is an identifier (i.e. is not an actual sequence).
org.embl.ebi.escience.scuflworkers.java.FailIfFalse
Format sequence into fasta format.
fasta
http://www.ebi.ac.uk/soaplab/emboss4/services/edit.seqret
Either an actual sequence or an entry identifier in database:identifier format (e.g. uniprot:wap_rat).
Sequence in fasta format.
Completed
Fail_if_sequence
fetchData
Scheduled
Running
Completed
Fail_if_identifer
seqret
Scheduled
Running
Get the sequence in fasta format for a GI number.
Given an NCBI GI number, get the sequence from the entry in fasta format. Uses the NCBI eUtils (see http://eutils.ncbi.nlm.nih.gov/entrez/query/static/esoap_help.html).
Note: XPath is used instead of XML splitters to avoid a problem with cyclic references in the XML.
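As a rough illustration of the XPath approach, the sketch below evaluates a local-name() expression with the standard javax.xml.xpath API. The XML fragment is hand-written and hypothetical (placeholder namespace URI and field values, not real eFetch output), and the class name TSeqXPathDemo and method name extract are made up for this example. Matching on local-name() ignores the element namespace, so the expression does not need to know which namespace the service uses.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Sketch only: TSeqXPathDemo, extract and SAMPLE are hypothetical.
class TSeqXPathDemo {

    // Hand-written fragment shaped like a TSeq element. The namespace URI
    // and all values are placeholders, not real service output.
    static final String SAMPLE =
        "<TSeqSet xmlns=\"http://example.org/placeholder\">"
      + "<TSeq>"
      + "<TSeq_accver>ABC12345.1</TSeq_accver>"
      + "<TSeq_defline>example description</TSeq_defline>"
      + "<TSeq_sequence>ACGTACGT</TSeq_sequence>"
      + "</TSeq>"
      + "</TSeqSet>";

    /** Returns the text of the first node matched by the expression, or "" if none. */
    static String extract(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep namespace information in the DOM
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (Exception ex) {
            throw new IllegalStateException("Could not evaluate " + expression, ex);
        }
    }

    public static void main(String[] args) {
        // Namespace-agnostic match, in the style used by the workflow.
        System.out.println(extract(SAMPLE,
            "//*[local-name(.)='TSeq']/*[local-name(.)='TSeq_accver']"));
        // A plain name test only matches elements in no namespace,
        // so it finds nothing in a document with a default namespace.
        System.out.println("[" + extract(SAMPLE, "//TSeq/TSeq_accver") + "]");
    }
}
```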
//*[local-name(.)='TSeq']/*[local-name(.)='TSeq_accver']
net.sourceforge.taverna.scuflworkers.xml.XPathTextWorker
//*[local-name(.)='TSeq']/*[local-name(.)='TSeq_sequence']
net.sourceforge.taverna.scuflworkers.xml.XPathTextWorker
fasta_seq = ">" + accver + " " + des + "\n";
fasta_seq += seq;
accver
des
seq
fasta_seq
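The assembly step above can be written as a small Java method for testing in isolation. This is a minimal sketch: the class name FastaRecord, the method name toFasta and the sample values are hypothetical. Like the script, it writes the header as ">accver description" and emits the sequence on a single line without wrapping.

```java
// Sketch only: FastaRecord and toFasta are hypothetical names.
class FastaRecord {

    /** Builds a fasta entry: a ">" header line followed by the sequence. */
    static String toFasta(String accver, String des, String seq) {
        // Header is the accession.version, a space, then the description.
        return ">" + accver + " " + des + "\n" + seq;
    }

    public static void main(String[] args) {
        // Made-up values for illustration.
        System.out.println(toFasta("ABC12345.1", "example description", "ACGTACGT"));
    }
}
```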
//*[local-name(.)='eFetchResultMS']/*[local-name(.)='eFetchResult']/*[local-name(.)='TSeqSet']/*[local-name(.)='TSeq']
net.sourceforge.taverna.scuflworkers.xml.XPathTextWorker
nucleotide
75251068
fasta
org.embl.ebi.escience.scuflworkers.java.XMLInputSplitter
//*[local-name(.)='TSeq']/*[local-name(.)='TSeq_defline']
net.sourceforge.taverna.scuflworkers.xml.XPathTextWorker
org.embl.ebi.escience.scuflworkers.java.FlattenList
org.embl.ebi.escience.scuflworkers.java.FlattenList
org.embl.ebi.escience.scuflworkers.java.FlattenList
org.embl.ebi.escience.scuflworkers.java.FlattenList
org.embl.ebi.escience.scuflworkers.java.FlattenList
org.embl.ebi.escience.scuflworkers.java.StringListMerge
http://eutils.ncbi.nlm.nih.gov/entrez/eutils/soap/eutils.wsdl
run_eFetch_MS
NCBI GI number to get sequence from.
text/xml
Sequence in fasta format.
Sequence in XML format from eFetch.
Is the input a GI number?
//
// Test if input is a GI number.
//
is_gi = "false";
try {
  if(Integer.valueOf(gi_id_seq) > 0) {
    is_gi = "true";
  }
}
catch(NumberFormatException ex) {
  is_gi = "false";
}
gi_id_seq
is_gi
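A similar check can be sketched in plain Java. The class name GiCheck and the method name isGi are hypothetical. This version also makes two assumptions that go beyond a plain integer parse: it trims surrounding whitespace (for example a trailing line break on pasted input) and parses into a long, so that large positive numbers are not rejected because of the int range.

```java
// Sketch only: GiCheck and isGi are hypothetical names.
class GiCheck {

    /** Returns true when the input is a positive whole number, i.e. could be a GI number. */
    static boolean isGi(String input) {
        if (input == null) {
            return false;
        }
        try {
            // Assumption: surrounding whitespace is ignored and values
            // beyond the int range are accepted.
            return Long.parseLong(input.trim()) > 0;
        } catch (NumberFormatException ex) {
            // Anything that is not a plain number (identifiers, sequences) ends up here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isGi("75251068"));        // number from the workflow example
        System.out.println(isGi("uniprot:Q96247"));  // database:identifier form
        System.out.println(isGi("ACGTACGT"));        // bare sequence
    }
}
```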
Fail if the sequence is a GI number.
org.embl.ebi.escience.scuflworkers.java.FailIfTrue
Fail if the sequence is not a GI number.
org.embl.ebi.escience.scuflworkers.java.FailIfFalse
Input sequence, GI number or entry identifier.
Sequence in fasta format.
Completed
Fail_if_GI
Sequence_or_ID
Scheduled
Running
Completed
Fail_if_sequence_or_id
Get_fasta_from_GI
Scheduled
Running