Find Orthologs for proteins in Ensembl
Created: 2011-10-03 14:33:04
Last updated: 2011-10-03 14:33:07
Find Orthologs for proteins in Ensembl using biomart.
Workflow Components
Authors (1)
Titles (1)
Find Orthologs for proteins in Ensembl |
Descriptions (1)
Find Orthologs for proteins in Ensembl using biomart. |
Dependencies (0)
Inputs (4)
Name |
Description |
ensemblDataset |
Ensembl dataset name corresponding to the species of the protein ID. Ensembl dataset names are listed at http://www.biomart.org/biomart/martservice?type=datasets&mart=ensembl
|
idReference |
The reference database for the protein ID. Three options: ensembl, swissprot, or trembl.
|
proteinId |
Protein identifier
|
orthologSpecie |
Choose the species in which to find the orthologs, e.g. human, mouse, rabbit, yeast
|
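The findOrthologs script maps each of the three accepted `idReference` values to a BioMart filter name. A minimal Java sketch of that mapping follows. The class name `IdReferenceFilter` and the method `filterFor` are hypothetical names introduced for illustration, and trimming whitespace is an added assumption that the original script does not make.

```java
import java.util.Locale;

// Hypothetical helper: mirrors the idReference branch of the findOrthologs script.
final class IdReferenceFilter {

    // Returns the BioMart filter name for an idReference value (case-insensitive).
    static String filterFor(String idReference) {
        if (idReference == null || idReference.trim().isEmpty()) {
            throw new IllegalArgumentException("idReference must have a non-empty value");
        }
        // Trimming is an assumption of this sketch, not part of the original script.
        switch (idReference.trim().toLowerCase(Locale.ROOT)) {
            case "swissprot":
                return "uniprot_swissprot_accession";
            case "trembl":
                return "uniprot_sptrembl";
            case "ensembl":
                return "ensembl_peptide_id";
            default:
                throw new IllegalArgumentException(
                        "idReference must be one of: ensembl, swissprot, trembl");
        }
    }
}
```

Any other value is rejected up front, so a typo in the input port surfaces as a clear error before a query is sent.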
Processors (1)
Name |
Type |
Description |
findOrthologs |
beanshell |
Script
if ((proteinId == void) || (proteinId == null) || proteinId.equals("")) {
throw new RuntimeException("port proteinId must have a non-empty value");
}
if ((ensemblDataset == void) || (ensemblDataset == null) || ensemblDataset.equals("")) {
throw new RuntimeException("port ensemblDataset must have a non-empty value");
}
if ((idReference == void) || (idReference == null) || idReference.equals("")) {
throw new RuntimeException("port idReference must have a non-empty value");
}
if ((orthologSpecie == void) || (orthologSpecie == null) || orthologSpecie.equals("")) {
throw new RuntimeException("port orthologSpecie must have a non-empty value");
}
String queryFormat = "CSV";
String queryFilter = "";
if(idReference.equalsIgnoreCase( "swissprot" )){
queryFilter = "uniprot_swissprot_accession";
} else if (idReference.equalsIgnoreCase( "trembl" )){
queryFilter = "uniprot_sptrembl";
} else if (idReference.equalsIgnoreCase( "ensembl" )) {
queryFilter = "ensembl_peptide_id";
} else {
throw new RuntimeException("port idReference must be one of: ensembl, swissprot, trembl");
}
// Biomart does not like encoded XML :-(
String queryXml = "http://www.ensembl.org/biomart/martservice?query=" +
"%3C?xml%20version=%271.0%27%20encoding=%27UTF-8%27?%3E" +
"%3C!DOCTYPE%20Query%3E" +
"%3CQuery%20%20virtualSchemaName%20=%20%27default%27%20formatter%20=%20%27"+ queryFormat + "%27%20header%20=%20%270%27%20uniqueRows%20=%20%271%27%20count%20=%20%27%27%20datasetConfigVersion%20=%20%270.6%27%20%3E" +
"%3CDataset%20name%20=%20%27" + ensemblDataset.toLowerCase() + "%27%20interface%20=%20%27default%27%20%3E" +
"%3CFilter%20name%20=%20%27" + queryFilter + "%27%20value%20=%20%27" + proteinId + "%27%20/%3E" +
"%3CAttribute%20name%20=%20%27ensembl_peptide_id%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_homolog_ensembl_peptide%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_orthology_type%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_homolog_subtype%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_homolog_perc_id%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_homolog_perc_id_r1%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_homolog_ds%27%20/%3E" +
"%3CAttribute%20name%20=%20%27" + orthologSpecie.toLowerCase() + "_homolog_dn%27%20/%3E" +
"%3C/Dataset%3E" +
"%3C/Query%3E";
List esemblId = new ArrayList();
List orthologId = new ArrayList();
List orthologyType = new ArrayList();
List homologSubtype = new ArrayList();
List homologPercId = new ArrayList();
List homologPercIdR1 = new ArrayList();
List homologDs = new ArrayList();
List homologDn = new ArrayList();
List summary = new ArrayList();
URL url = new URL(queryXml);
BufferedReader reader = new BufferedReader (new InputStreamReader(url.openStream()));
// read the CSV response line by line; each non-empty line is one result row
String line = null;
while((line = reader.readLine()) != null){
if(line.length() > 1){
summary.add(proteinId + "," + line);
String[] st = line.split( "," );
for (int i = 0; i < st.length; i++) {
if(i==0){
esemblId.add( st[i] );
} else if(i==1){
orthologId.add( st[i] );
} else if(i==2){
orthologyType.add( st[i] );
} else if(i==3){
homologSubtype.add( st[i] );
} else if(i==4){
homologPercId.add( st[i] );
} else if(i==5){
homologPercIdR1.add( st[i] );
} else if(i==6){
homologDs.add( st[i] );
} else if(i==7){
homologDn.add( st[i] );
}
}
}
}
reader.close(); |
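The percent-encoded literal in the script above is hard to read. The sketch below builds the same BioMart query as plain XML and then encodes only the four characters the script encodes (`<`, `>`, space and `'`). The class `OrthologQuerySketch` and its methods are hypothetical names, and the sketch performs no network access. The sample values in the usage note (`hsapiens_gene_ensembl`, `P04637`, `mouse`) are illustrative inputs, not taken from the workflow.

```java
// Hypothetical helper: rebuilds the query URL of the findOrthologs script in readable form.
final class OrthologQuerySketch {

    // Attribute suffixes appended to the ortholog species name, in script order.
    private static final String[] HOMOLOG_SUFFIXES = {
        "_homolog_ensembl_peptide", "_orthology_type", "_homolog_subtype",
        "_homolog_perc_id", "_homolog_perc_id_r1", "_homolog_ds", "_homolog_dn"
    };

    // Builds the unencoded query XML; spacing matches the original literal.
    static String buildXml(String dataset, String filter, String proteinId, String species) {
        String sp = species.toLowerCase();
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version='1.0' encoding='UTF-8'?>");
        xml.append("<!DOCTYPE Query>");
        xml.append("<Query  virtualSchemaName = 'default' formatter = 'CSV' header = '0'")
           .append(" uniqueRows = '1' count = '' datasetConfigVersion = '0.6' >");
        xml.append("<Dataset name = '").append(dataset.toLowerCase())
           .append("' interface = 'default' >");
        xml.append("<Filter name = '").append(filter)
           .append("' value = '").append(proteinId).append("' />");
        xml.append("<Attribute name = 'ensembl_peptide_id' />");
        for (String suffix : HOMOLOG_SUFFIXES) {
            xml.append("<Attribute name = '").append(sp).append(suffix).append("' />");
        }
        xml.append("</Dataset></Query>");
        return xml.toString();
    }

    // Encodes only <, >, space and ' as the original script does, then prefixes the endpoint.
    static String toUrl(String xml) {
        String encoded = xml.replace("<", "%3C").replace(">", "%3E")
                            .replace(" ", "%20").replace("'", "%27");
        return "http://www.ensembl.org/biomart/martservice?query=" + encoded;
    }
}
```

For example, `OrthologQuerySketch.toUrl(OrthologQuerySketch.buildXml("hsapiens_gene_ensembl", "uniprot_swissprot_accession", "P04637", "mouse"))` yields a URL of the same shape as `queryXml` in the script. Keeping the XML readable makes it easier to add or remove attributes; whether the service accepts other encodings is not established here.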
Beanshells (1)
Name |
Description |
Inputs |
Outputs |
findOrthologs |
|
proteinId
ensemblDataset
idReference
orthologSpecie
|
esemblId
orthologId
summary
orthologyType
homologSubtype
homologPercId
homologDs
homologDn
homologPercIdR1
|
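Each CSV row returned by the query feeds the eight per-column outputs listed above. One caveat is that Java's `String.split(",")` drops trailing empty strings, so a row whose last columns are empty yields fewer fields and the output lists can fall out of step with each other. The sketch below, using a hypothetical class `OrthologRowSketch`, splits with a limit of `-1` and pads missing columns with empty strings. It assumes field values contain no commas or quoting; the identifiers in the usage note are made-up placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps one CSV result row onto the workflow's output port names.
final class OrthologRowSketch {

    // Output port names in the column order requested by the query.
    static final String[] COLUMNS = {
        "esemblId", "orthologId", "orthologyType", "homologSubtype",
        "homologPercId", "homologPercIdR1", "homologDs", "homologDn"
    };

    // Parses a row so that every port gets exactly one value, empty if absent.
    static Map<String, String> parseRow(String line) {
        // Limit -1 keeps trailing empty fields that split(",") would discard.
        String[] fields = line.split(",", -1);
        Map<String, String> row = new LinkedHashMap<>();
        for (int i = 0; i < COLUMNS.length; i++) {
            row.put(COLUMNS[i], i < fields.length ? fields[i] : "");
        }
        return row;
    }
}
```

Calling `OrthologRowSketch.parseRow("ENSP_A,ENSMUSP_B,ortholog_one2one,Euarchontoglires,77,78,,")` returns eight entries, with `homologDs` and `homologDn` set to empty strings, so all eight lists would grow by one element per row.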
Outputs (9)
Name |
Description |
esemblId |
|
summary |
|
orthologId |
|
homologDn |
|
homologDs |
|
homologPercId |
|
homologSubtype |
|
orthologyType |
|
homologPercIdR1 |
|
Datalinks (13)
Source |
Sink |
ensemblDataset |
findOrthologs:ensemblDataset |
idReference |
findOrthologs:idReference |
proteinId |
findOrthologs:proteinId |
orthologSpecie |
findOrthologs:orthologSpecie |
findOrthologs:esemblId |
esemblId |
findOrthologs:summary |
summary |
findOrthologs:orthologId |
orthologId |
findOrthologs:homologDn |
homologDn |
findOrthologs:homologDs |
homologDs |
findOrthologs:homologPercId |
homologPercId |
findOrthologs:homologSubtype |
homologSubtype |
findOrthologs:orthologyType |
orthologyType |
findOrthologs:homologPercIdR1 |
homologPercIdR1 |
Version 1
(of 1)