This workflow starts by fetching from Ensembl the IDs of all human genes on chromosome 22 that are implicated in known diseases and have homologous genes in rat and mouse. For each of these gene IDs, it fetches the 200 bp after the 5' end of the genomic sequence in each organism and performs a multiple alignment of the three sequences using the EMBOSS tool 'emma' (a wrapper around ClustalW). It returns PNG images of the multiple alignments, along with three columns containing the human, rat and mouse gene IDs used in each case.
As an enhancement to show the Jmol rendering in action, the workflow also fetches the PDB identifiers (where present) and the corresponding coordinate flat files from the RCSB, and presents the structures to the user interactively via the Jmol plugin.
// Output lists: one entry per human gene that has both a rat and a mouse ID
List HSOut = new ArrayList();
List RatOut = new ArrayList();
List MouseOut = new ArrayList();
// Pair each human gene ID with the mouse ID at the same list position.
// Using a map keyed on the human ID collapses repeated human IDs.
Map hsToMouse = new HashMap();
Iterator j = MouseGeneIDs.iterator();
for (Iterator i = HSGeneIDs.iterator(); i.hasNext();) {
    String id = (String) i.next();
    hsToMouse.put(id, j.next());
}
// Pair each human gene ID with the rat ID at the same list position
Map hsToRat = new HashMap();
j = RatGeneIDs.iterator();
for (Iterator i = HSGeneIDs.iterator(); i.hasNext();) {
    String id = (String) i.next();
    hsToRat.put(id, j.next());
}
// Build the unique outputs
for (Iterator i = hsToRat.keySet().iterator(); i.hasNext();) {
    String hsID = (String) i.next();
    String ratID = (String) hsToRat.get(hsID);
    String mouseID = (String) hsToMouse.get(hsID);
    if (ratID != null && mouseID != null) {
        HSOut.add(hsID);
        // Remove the version number (everything from the first '.')
        RatOut.add(ratID.split("\\.")[0]);
        MouseOut.add(mouseID.split("\\.")[0]);
    }
}
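The version-stripping step in the script above depends on escaping the dot, because String.split takes a regular expression. The following is a minimal Java sketch of that one step, not part of the workflow itself; the class name, the helper name stripVersion and the sample ID are illustrative assumptions.

```java
final class VersionStripSketch {
    // Hypothetical helper: keep only the part of an ID before the first '.'.
    static String stripVersion(String id) {
        // The dot is escaped so that it is matched literally.
        return id.split("\\.")[0];
    }

    public static void main(String[] args) {
        String id = "ENSRNOG00000000001.2"; // illustrative value only
        System.out.println(stripVersion(id)); // prints ENSRNOG00000000001

        // An unescaped "." matches every character, so every token is empty
        // and split() drops them all, leaving a zero-length array.
        System.out.println(id.split(".").length); // prints 0
    }
}
```

An ID without a version suffix passes through unchanged, since split then returns a single-element array.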
HSGeneIDs
MouseGeneIDs
RatGeneIDs
HSOut
RatOut
MouseOut
org.embl.ebi.escience.scuflworkers.java.FlattenList
// Group structure IDs by gene ID; the two input lists are read in parallel
Map geneIDToStructureSet = new HashMap();
Iterator j = StructureIDList.iterator();
for (Iterator i = GeneIDList.iterator(); i.hasNext();) {
    String geneID = (String) i.next();
    String structureID = (String) j.next();
    Set s = (Set) geneIDToStructureSet.get(geneID);
    if (s == null) {
        // First structure seen for this gene: create its set
        s = new HashSet();
        geneIDToStructureSet.put(geneID, s);
    }
    s.add(structureID);
}
List GeneIDs = new ArrayList();
List StructureIDs = new ArrayList();
// The map now holds a set of unique structure IDs for each gene ID
for (Iterator i = geneIDToStructureSet.keySet().iterator(); i.hasNext();) {
    String geneID = (String) i.next();
    Set structureIDSet = (Set) geneIDToStructureSet.get(geneID);
    GeneIDs.add(geneID);
    // Copy the set into a list so StructureIDs is a list of lists
    List sl = new ArrayList();
    sl.addAll(structureIDSet);
    StructureIDs.add(sl);
}
StructureIDList
GeneIDList
GeneIDs
StructureIDs
rnorvegicus_gene_ensembl
mmusculus_gene_ensembl
hsapiens_gene_ensembl
fasta = ">Human\n"+hsSeq+"\n>Mouse\n"+mmSeq+"\n>Rat\n"+rnSeq;
hsSeq
mmSeq
rnSeq
fasta
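The one-line script above concatenates the three sequences into a single FASTA-formatted string for the alignment step. As a rough illustration, the same concatenation can be written as a standalone Java method; the class name, method name and the short sample sequences are assumptions made for this sketch.

```java
final class FastaSketch {
    // Mirrors the script: one header line and one sequence line per organism,
    // with no trailing newline after the rat sequence.
    static String toFasta(String hsSeq, String mmSeq, String rnSeq) {
        return ">Human\n" + hsSeq + "\n>Mouse\n" + mmSeq + "\n>Rat\n" + rnSeq;
    }

    public static void main(String[] args) {
        // Made-up sequences, far shorter than the 200 bp the workflow fetches.
        System.out.println(toFasta("ACGT", "ACGA", "ACGG"));
    }
}
```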
hsapiens_gene_ensembl
hsapiens_gene_ensembl
Given an identifier such as '1crn', fetches the PDB-format flat file from the RCSB
&compression=None
org.embl.ebi.escience.scuflworkers.java.StringConcat
http://www.rcsb.org/pdb/cgi/export.cgi/1CRN.pdb?format=PDB&pdbId=
org.embl.ebi.escience.scuflworkers.java.StringConcat
org.embl.ebi.escience.scuflworkers.java.WebPageFetcher
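The two StringConcat steps appear to assemble the download URL from the fixed prefix, the PDB identifier and the fixed suffix before the page fetcher retrieves it. The sketch below shows that assembly in plain Java under the assumption that the pieces are joined in that order; the class and method names are hypothetical, and the sketch only builds the string without contacting the server, so it makes no claim that the endpoint is still served.

```java
final class PdbUrlSketch {
    // Prefix and suffix strings as they appear in this workflow.
    static final String PREFIX =
        "http://www.rcsb.org/pdb/cgi/export.cgi/1CRN.pdb?format=PDB&pdbId=";
    static final String SUFFIX = "&compression=None";

    // Assumed concatenation order: prefix + identifier + suffix.
    static String buildUrl(String pdbId) {
        return PREFIX + pdbId + SUFFIX;
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("1crn"));
    }
}
```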
PDB identifier such as '1crn'
chemical/x-pdb
text/plain
text/html
Reads and writes (returns) sequences
http://www.ebi.ac.uk/soaplab/services/edit::seqret
Displays aligned sequences, with colouring and boxing
http://www.ebi.ac.uk/soaplab/services/alignment_multiple::prettyplot
Multiple alignment program - interface to ClustalW program
http://www.ebi.ac.uk/soaplab/services/alignment_multiple::emma
image/png
application/octet-stream
The array of PNG images returned from the plot processor
http://www.mygrid.org.uk/ontology#domain_concept
chemical/x-pdb
text/plain
text/html