Create_SNP_Set

Entrez_ID 0 0
This takes as input a list of genes in the form of Entrez gene ids. 2012-08-21 01:34:30.184 UTC
5625 2012-08-21 01:34:53.533 UTC

set_width 0 0
10000 2012-08-21 01:33:23.686 UTC
This allows the user to set the flanking width around the gene for determining SNPs. 2012-08-21 01:33:18.864 UTC

path_to_output_file 0 0
This takes the local path where the user wants to store the output of the workflow. 2012-08-21 01:32:05.656 UTC
C:\Users\Gene_to_SNP_Report.txt 2012-08-21 01:32:39.264 UTC

Split_string_into_string_list_by_regular_expression
string 0
regex 0
split 1 1
This service splits the string value into a string list at every occurrence of the specified regular expression. The regular expression provided in this case is a newline: "\n" 2012-08-21 01:35:26.478 UTC
net.sf.taverna.t2.activities localworker-activity 1.4 net.sf.taverna.t2.activities.localworker.LocalworkerActivity
string 0 'text/plain' java.lang.String true
regex 0 'text/plain' java.lang.String true
split 1 l('text/plain') 1
workflow org.embl.ebi.escience.scuflworkers.java.SplitByRegex
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

regex_value
value 0 0
net.sf.taverna.t2.activities stringconstant-activity 1.4 net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
\n
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

hsapiens_gene_ensembl
hsapiens_gene_ensembl.entrezgene_filter 1
hsapiens_gene_ensembl.chromosome_name 1 0
hsapiens_gene_ensembl.end_position 1 0
hsapiens_gene_ensembl.start_position 1 0
This is a Biomart service that takes the Entrez gene id as input and returns the chromosome name and the start and end positions of the gene. 2012-08-21 01:36:25.927 UTC
net.sf.taverna.t2.activities biomart-activity 1.4 net.sf.taverna.t2.activities.biomart.BiomartActivity
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Set_width
chr_in 1
end_in 1
start_in 1
in1 1
chr_out 1 1
end_out 1 1
start_out 1 1
The flanking region around the gene is calculated in this beanshell. The flanking width is provided by the user.
2012-08-21 01:41:04.534 UTC
net.sf.taverna.t2.activities beanshell-activity 1.4 net.sf.taverna.t2.activities.beanshell.BeanshellActivity
chr_in 1 text/plain java.lang.String true
end_in 1 text/plain java.lang.String true
start_in 1 text/plain java.lang.String true
in1 1 text/plain java.lang.String true
chr_out 1 1
end_out 1 1
start_out 1 1
workflow
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Convert_to_Kegg_id
in1 0
out1 0 0
The input to btit (which provides gene information from the KEGG database) is a KEGG gene id. To convert Entrez gene ids to KEGG ids, "hsa:" is prepended to the Entrez gene id. 2012-08-21 01:38:27.325 UTC
net.sf.taverna.t2.activities beanshell-activity 1.4 net.sf.taverna.t2.activities.beanshell.BeanshellActivity
in1 0 text/plain java.lang.String true
out1 0 0
workflow
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

btit
string 0
return 0 0
The input for this service is a KEGG gene id in the form of (for example) hsa:1234.
It returns the gene name and definitions of the given entry id. 2012-08-21 01:46:04.599 UTC
net.sf.taverna.t2.activities wsdl-activity 1.4 net.sf.taverna.t2.activities.wsdl.WSDLActivity
http://soap.genome.jp/KEGG.wsdl btit
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 10
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

hsapiens_snp
hsapiens_snp.chr_name_filter 0
hsapiens_snp.chrom_end_filter 0
hsapiens_snp.chrom_start_filter 0
hsapiens_snp.affy6 1 0
This is a Biomart service that takes the chromosome name and the start and end positions and finds all the Affy genechip 6k SNP ids present in the region. The Affy6 ids can easily be changed to any other desired gene chip id, such as Illumina chip ids or another version of the Affymetrix gene chip. 2012-08-21 01:43:23.846 UTC
net.sf.taverna.t2.activities biomart-activity 1.4 net.sf.taverna.t2.activities.biomart.BiomartActivity
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Flatten_List_2
inputlist 2
outputlist 1 1
This service flattens the inputlist by one level.
It returns the result of the flattening. 2012-08-21 01:47:27.934 UTC
net.sf.taverna.t2.activities localworker-activity 1.4 net.sf.taverna.t2.activities.localworker.LocalworkerActivity
inputlist 2 l(l('')) [B true
outputlist 1 l('') 1
workflow org.embl.ebi.escience.scuflworkers.java.FlattenList
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Create_Final_Report
in1 0
in2 1
result 1 1
This aligns the SNP ID with the associated gene id and gene information in a tab-delimited format. 2012-08-21 01:47:13.492 UTC
net.sf.taverna.t2.activities beanshell-activity 1.4 net.sf.taverna.t2.activities.beanshell.BeanshellActivity
in1 0 text/plain java.lang.String true
in2 1 text/plain java.lang.String true
result 1 1
workflow
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Flatten_List
inputlist 2
outputlist 1 1
This service flattens the inputlist by one level. It returns the result of the flattening.
2012-08-21 01:47:32.603 UTC
net.sf.taverna.t2.activities localworker-activity 1.4 net.sf.taverna.t2.activities.localworker.LocalworkerActivity
inputlist 2 l(l('')) [B true
outputlist 1 l('') 1
workflow org.embl.ebi.escience.scuflworkers.java.FlattenList
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Create_and_populate_temporary_file_2
content 0
filepath 0 0
This service creates a temporary file in a local tmp directory. 2012-08-21 01:47:46.397 UTC
net.sf.taverna.t2.activities beanshell-activity 1.4 net.sf.taverna.t2.activities.beanshell.BeanshellActivity
content 0 'text/plain' java.lang.String true
filepath 0 'text/plain' 0
workflow
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 10
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 3
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Concatenate_Files_2
filelist 1
outputfile 0
This service examines the files whose paths or URLs are specified in the filelist. The content of those files is concatenated.
2012-08-21 01:48:00.558 UTC
net.sf.taverna.t2.activities localworker-activity 1.4 net.sf.taverna.t2.activities.localworker.LocalworkerActivity
filelist 1 l('text/plain') java.lang.String true
outputfile 0 'text/plain' java.lang.String true
displayresults 0 'text/plain' java.lang.String true
results 0 'text/plain' 0
workflow net.sourceforge.taverna.scuflworkers.io.ConcatenateFileListWorker
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 0 0 0
net.sf.taverna.t2.core workflowmodel-impl 1.4 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke

Split_string_into_string_list_by_regular_expression string Entrez_ID
Split_string_into_string_list_by_regular_expression regex regex_value value
hsapiens_gene_ensembl hsapiens_gene_ensembl.entrezgene_filter Split_string_into_string_list_by_regular_expression split
Set_width chr_in hsapiens_gene_ensembl hsapiens_gene_ensembl.chromosome_name
Set_width end_in hsapiens_gene_ensembl hsapiens_gene_ensembl.end_position
Set_width start_in hsapiens_gene_ensembl hsapiens_gene_ensembl.start_position
Set_width in1 set_width
Convert_to_Kegg_id in1 Split_string_into_string_list_by_regular_expression split
btit string Convert_to_Kegg_id out1
hsapiens_snp hsapiens_snp.chr_name_filter Set_width chr_out
hsapiens_snp hsapiens_snp.chrom_end_filter Set_width end_out
hsapiens_snp hsapiens_snp.chrom_start_filter Set_width start_out
Flatten_List_2 inputlist btit return
Create_Final_Report in1 Flatten_List_2 outputlist
Create_Final_Report in2 hsapiens_snp hsapiens_snp.affy6
Flatten_List inputlist Create_Final_Report result
Create_and_populate_temporary_file_2 content Flatten_List outputlist
Concatenate_Files_2 filelist Create_and_populate_temporary_file_2 filepath
Concatenate_Files_2 outputfile path_to_output_file

9971e8b3-f864-4f45-8877-cfb4a4daba51 2012-08-20 23:46:53.175 UTC
Create_SNP_Set 2012-08-21 01:26:03.518 UTC
f952b103-ef22-4be3-ac1d-b74dfd3ed759 2012-08-20 23:48:41.754 UTC
1bc9c960-19f3-47b2-b55f-e08fd396d608 2012-08-21 01:30:54.485 UTC
62e7dbfa-08ce-417c-babd-17ede7bbcd9e 2012-08-21 01:46:06.229 UTC
6825187b-b700-400e-8502-f419444a7889 2012-08-20 23:58:57.796 UTC
944b8dbc-d6c8-4199-98a3-538b40d9f142 2012-08-21 01:48:03.170 UTC
8e5c8fbd-7af2-40a3-a2e1-59c89cb49c40 2012-08-20 23:52:19.4 UTC
Harish Dharuri 2012-08-21 01:25:40.568 UTC
The purpose of the workflow is to determine the SNPs in the vicinity of a given set of genes and create a SNP set from them. The user can choose the flanking width around each gene used for determining the SNPs. The input is a list of Entrez gene ids. Biomart services are used to determine the chromosome and position of each gene and to retrieve the Affy gene chip 6k ids. The final report is stored as a tab-delimited text file with the Affy 6 gene chip ids of each SNP and the KEGG info for the gene it is associated with. 2012-08-21 01:30:54.250 UTC
055e4fc4-d4c3-4809-ab0f-c6a15153ec5f 2012-08-21 01:38:28.961 UTC
615206c1-3632-48b2-9eb6-41a94e59a074 2012-08-21 01:24:50.589 UTC
a7075126-6c0e-4b73-ad07-c31c1c178a8a 2012-08-21 00:03:10.234 UTC
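
The scripts behind the Set_width, Convert_to_Kegg_id and Create_Final_Report steps are not included in this listing, so the following Python sketch only illustrates what their descriptions say. It assumes the flanking width is applied symmetrically to the gene start and end, that the lower bound is clamped at 1, and that each report row holds the SNP id followed by the gene information. The function names, coordinates, SNP ids and gene information string are hypothetical placeholders, not values from the workflow.

```python
def flank(start, end, width):
    """Widen a gene interval by `width` bases on each side.

    Assumption: symmetric flanks, lower bound clamped at position 1.
    """
    return max(1, start - width), end + width


def to_kegg_id(entrez_id):
    """Prefix an Entrez gene id with "hsa:", as the Convert_to_Kegg_id description states."""
    return "hsa:" + str(entrez_id).strip()


def report_lines(gene_info, snp_ids):
    """Pair one gene's information with each SNP id, tab-delimited.

    Assumption: the column order (SNP id first) is illustrative only.
    """
    return [f"{snp}\t{gene_info}" for snp in snp_ids]


# Split the newline-separated input the way the workflow's regex "\n" does.
entrez_ids = "5625\n1234".split("\n")
kegg_ids = [to_kegg_id(e) for e in entrez_ids]

# Made-up gene coordinates with the example flanking width of 10000.
window = flank(start=50_000, end=60_000, width=10_000)

# Placeholder SNP ids and gene information string.
lines = report_lines("hsa:5625 placeholder gene info",
                     ["SNP_A-0000001", "SNP_A-0000002"])
```

In the workflow itself the coordinates come from hsapiens_gene_ensembl, the SNP ids from hsapiens_snp, and the gene information from btit. The sketch replaces all three with constants so it runs without network access.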