Variant_Annotation_w
g_VCFs 0 0
/home/murilo/Dropbox/Doutorado/Taverna/gvcf/586.head.bam.g.vcf,/home/murilo/Dropbox/Doutorado/Taverna/gvcf/943.head.bam.g.vcf 2016-02-04 17:33:59.437 UTC
Comma-separated list of g.vcf files. 2016-02-16 16:42:00.597 UTC
Ouput_path 0 0
The path in which files will be written. If it does not exist, the folder will be created. 2016-02-16 16:42:47.895 UTC
/home/murilo/Dropbox/Doutorado/Taverna/results/Example 2016-02-16 16:43:02.183 UTC
Project_Name 0 0
EXAMPLE 2016-02-16 16:43:53.120 UTC
Project name. This will be the name your final vcf will receive. 2016-02-16 16:43:48.95 UTC
STDERR 1
FINAL_VCF 0
VEP value 0 0
Complete path to the VEP Perl file. http://www.ensembl.org/info/docs/tools/vep/script/index.html 2016-02-16 16:39:39.88 UTC
net.sf.taverna.t2.activities stringconstant-activity 1.5 net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
/home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/ensembl-tools-release-84/scripts/variant_effect_predictor/variant_effect_predictor.pl
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
GATK value 0 0
Complete path to the GATK jar file.
https://www.broadinstitute.org/gatk/download/ 2016-02-16 16:37:17.756 UTC
net.sf.taverna.t2.activities stringconstant-activity 1.5 net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
/home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/picard-tools-1.141/GenomeAnalysisTK.jar
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
GATK_GenotypeGVCFs
GATK 0
REF 0
DBSNP 0
g_VCFs 0
STDERR 0 0
VCF 0 0
Here we generate a list of g.vcf files to pass to the GenotypeGVCFs command line. GenotypeGVCFs performs joint genotyping on gVCF files produced by HaplotypeCaller. 2016-02-22 17:17:17.653 UTC
net.sf.taverna.t2.activities external-tool-activity 1.5 net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
aea4312c-900e-49b5-a10f-360576078ef8
# Here we generate a list of g.vcf files to pass to the GenotypeGVCFs command line
myfiles=%%g_VCFs%%
gVCFlist=`echo $myfiles | sed 's/^/-V /' | sed 's/,/ -V /g'`
# GenotypeGVCFs performs joint genotyping on gVCF files produced by HaplotypeCaller
java -Xmx1g -jar %%GATK%% -T GenotypeGVCFs -R %%REF%% -o VCF $gVCFlist -D %%DBSNP%%
1200 1800
DBSNP GATK REF g_VCFs
g_VCFs g_VCFs false false false UTF-8 false false false
GATK GATK false false false UTF-8 false false false
DBSNP DBSNP false false false UTF-8 false false false
REF REF false false false UTF-8 false false false
VCF VCF false false false true 0 false
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
REF value 0 0
Reference genome. Note that you must index the FASTA file as described at http://gatkforums.broadinstitute.org/gatk/discussion/1601/how-can-i-prepare-a-fasta-file-to-use-as-reference and with the "bwa index" command. 2016-02-16 16:38:58.887 UTC
net.sf.taverna.t2.activities stringconstant-activity 1.5 net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
/home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/REF/genome.fa
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
DBSNP value 0 0
Location of a VCF file containing variants from dbSNP.
ftp://ftp.ncbi.nih.gov/snp/ 2016-02-16 16:36:33.834 UTC
net.sf.taverna.t2.activities stringconstant-activity 1.5 net.sf.taverna.t2.activities.stringconstant.StringConstantActivity
/home/murilo/Dropbox/Doutorado/Taverna/tavernaTeste/dbsnp.vcf
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
VEP_Variant_Effect_Predictor
VCF 0
VEP 0
STDERR 0 0
VEP_VCF 0 0
Variant Effect Predictor http://www.ensembl.org/info/docs/tools/vep/script/index.html The VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions.
2016-02-22 17:17:59.578 UTC
net.sf.taverna.t2.activities external-tool-activity 1.5 net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
1f267998-9bc6-4aee-b0de-a196a390f573
####################################################################
# Variant Effect Predictor #########################################
# http://www.ensembl.org/info/docs/tools/vep/script/index.html #####
####################################################################
# The VEP determines the effect of your variants (SNPs, insertions,
# deletions, CNVs or structural variants) on genes, transcripts, and
# protein sequence, as well as regulatory regions.
perl %%VEP%% -i VCF -o VEP_VCF --cache --merged --pubmed --sift b --polyphen b --ccds --uniprot --symbol --numbers --domains --regulatory --canonical --protein --biotype --gene_phenotype --gmaf --variant_class --vcf
1200 1800
VEP
VCF VCF true false false UTF-8 false false false
VEP VEP false false false UTF-8 false false false
VEP_VCF VEP_VCF false false false true 0 false
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
Save_VCF
FILENAME 0
PATH_OUT 0
PROJECT_NAME 0
Saves the resulting vcf file from the workflow 2016-02-22 17:18:16.865 UTC
net.sf.taverna.t2.activities external-tool-activity 1.5 net.sf.taverna.t2.activities.externaltool.ExternalToolActivity
789663B8-DA91-428A-9F7D-B3F3DA185FD4 default local
<?xml version="1.0" encoding="UTF-8"?> <localInvocation><shellPrefix>/bin/sh -c</shellPrefix><linkCommand>/bin/ln -s %%PATH_TO_ORIGINAL%% %%TARGET_NAME%%</linkCommand></localInvocation>
22bfa75c-8c8d-4193-962f-57785bd57037
# Saves the resulting vcf file from the workflow
# Output path to the folder
PATH_OUT=%%PATH_OUT%%
# Complete file name instead of a symbolic link
FILE=`realpath FILENAME`
# If the folder exists...
if [ -e "$PATH_OUT" ]
then
  # ... the vcf file will be copied to it
  cp "$FILE" "$PATH_OUT/%%PROJECT_NAME%%.vcf"
# If not...
else
  # A folder (including any missing parent folders) will be created before copying the file
  mkdir -p "$PATH_OUT"
  cp "$FILE" "$PATH_OUT/%%PROJECT_NAME%%.vcf"
fi
1200 1800
PATH_OUT PROJECT_NAME
PATH_OUT PATH_OUT false false false UTF-8 false false false
FILENAME FILENAME true false false UTF-8 false false false
PROJECT_NAME PROJECT_NAME false false false UTF-8 false false false
false false false 0 false
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Parallelize 1
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.ErrorBounce
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Failover
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Retry 1.0 1000 5000 0
net.sf.taverna.t2.core workflowmodel-impl 1.5 net.sf.taverna.t2.workflowmodel.processor.dispatch.layers.Invoke
GATK_GenotypeGVCFs GATK GATK value
GATK_GenotypeGVCFs REF REF value
GATK_GenotypeGVCFs DBSNP DBSNP value
GATK_GenotypeGVCFs g_VCFs g_VCFs
VEP_Variant_Effect_Predictor VCF GATK_GenotypeGVCFs VCF
VEP_Variant_Effect_Predictor VEP VEP value
Save_VCF FILENAME VEP_Variant_Effect_Predictor VEP_VCF
Save_VCF PATH_OUT Ouput_path
Save_VCF PROJECT_NAME Project_Name
STDERR GATK_GenotypeGVCFs STDERR
STDERR VEP_Variant_Effect_Predictor STDERR
FINAL_VCF VEP_Variant_Effect_Predictor VEP_VCF
a66307ec-fe03-4738-9290-81f12e9de898 2016-02-15 17:32:14.729 UTC 79110ea0-2d72-45f6-812f-1975b1369930 2016-02-16 13:24:09.139 UTC b855b1d9-3d2a-486e-94e1-519766a32599 2016-02-04 16:59:44.571 UTC 3216d581-ca61-474d-ac30-e437503891b6 2016-02-15 18:20:52.361 UTC 422f4fc5-721c-4716-8cff-b8d9104a207f 2016-02-15 20:53:50.559 UTC 1be7d9ae-4b9d-494e-b962-f2037a74e3ae 2016-02-16 13:28:51.264 UTC 2cb52336-86a1-4f29-ad21-3d63bcb5d707 2016-02-04 12:22:02.887 UTC 3007b90d-794c-4221-844e-012eb59913d0 2016-02-04 12:31:47.771 UTC d21130c8-196a-4d96-9063-c2f126b87287 2016-02-16 13:32:34.648 UTC 24395a2e-b519-4d3b-949e-7a0ab5c5e76c 2016-02-04 16:53:14.500 UTC 22230ba0-8d2a-4312-9e42-db50e39866be 2016-02-12 12:04:20.647 UTC 8d2bbad1-5ec6-4975-9ef2-12fe02892613 2016-02-04 12:29:41.132 UTC 7a11b303-730e-4404-ac84-b491ae3e44a8 2016-02-04 12:26:01.633 UTC 440269bc-472a-467f-bd5d-0da83d4011da 2016-05-11 16:55:26.726 UTC 9d1344df-deb8-47eb-b2f6-afb50ec9e28f 2016-02-15 21:02:32.522 UTC 75c1f2e4-c0ec-4f86-a027-6d02b77f40e5 2016-02-04 16:54:26.361 UTC 50c9a33f-c815-4ddd-91d2-318b06d2a09f 2016-02-16 12:57:20.699 UTC e0d2675c-0107-4245-8233-4b8bc1192454 2016-02-05 14:03:46.492 UTC f3fafc38-367a-4620-9c6d-33e405227b06 2016-02-16 13:31:02.928 UTC 7ffde927-9e48-4dc7-9ff3-127a1803a719 2016-02-15 18:38:25.816 UTC e64be3d5-be2f-49de-87b3-7e00b60f65ea 2016-02-04 17:30:53.823 UTC 33fcc7f8-6796-47b9-8596-1e0c4157b25e 2016-02-04 17:16:49.400 UTC
f32691c4-8fe5-466b-a468-6e41a3c8984b 2016-02-15 18:46:33.37 UTC 47b28671-2a20-4825-b7c3-5ce5d910c86e 2016-02-11 13:13:57.164 UTC
Here, from a list of g.vcf files, we run the GenotypeGVCFs command line. GenotypeGVCFs performs joint genotyping on gVCF files produced by HaplotypeCaller. Variant Effect Predictor (http://www.ensembl.org/info/docs/tools/vep/script/index.html): the VEP determines the effect of your variants (SNPs, insertions, deletions, CNVs or structural variants) on genes, transcripts, and protein sequence, as well as regulatory regions. 2016-05-16 19:16:03.370 UTC
dde9b1b7-10c8-4e6f-890c-91bdf8224882 2016-02-05 14:05:52.236 UTC f3dd98d2-eaeb-4b0d-ae18-577ed901dbc0 2016-02-05 14:02:17.97 UTC 8f2b2f41-4cac-413e-a60a-a3d62e30ed12 2016-02-11 13:06:22.803 UTC 159eac45-dea4-493d-9ca3-3ceca45a5e43 2016-02-04 17:07:11.708 UTC 5d6daede-1115-40fa-a1ff-73e92a29dc11 2016-02-04 17:14:20.682 UTC 8454ef84-8c65-4616-aae0-d5f110f59e97 2016-02-04 16:55:41.33 UTC 1025bfcf-e4a2-4d64-a719-ffa8f27f3d00 2016-02-04 17:12:47.319 UTC 7d725a50-b896-4ba8-87d2-eb22e688fb51 2016-02-11 15:57:12.294 UTC 88b377fd-114f-45d0-abd0-5821ab009bb8 2016-02-15 18:11:48.207 UTC 52503a10-a7d4-4060-9a95-ea6b18df2f45 2016-02-04 16:43:49.329 UTC b91da746-2cc7-4d12-b0a4-3a15f01f561a 2016-02-15 17:45:29.490 UTC c4f649ab-61b3-4c70-9c11-d4bc2fe23b88 2016-02-05 14:00:46.617 UTC 54f4593c-3b7e-4471-a59c-43a326d82779 2016-05-16 19:16:25.130 UTC 8d5dcd4b-b8c7-4f95-9d00-290e201dd82f 2016-02-15 17:28:20.501 UTC ee8ef976-8a14-4bf7-9908-91445b00b399 2016-02-16 13:38:27.269 UTC 2114e3df-a0fb-49a0-ac94-baae2a4aaaa7 2016-02-05 13:54:46.659 UTC 4841f976-5848-4513-88d4-203017a5e550 2016-02-05 13:57:19.557 UTC 9cd20c50-65dc-44b5-a4c6-8a8837cf7fa9 2016-02-22 17:18:28.635 UTC 29d02c06-0b2a-4a9f-8655-8dee17ab4732 2016-02-15 18:01:07.374 UTC 9fe35e46-8c5e-4baa-96a4-04756558bf86 2016-02-04 13:13:06.401 UTC
Variant Annotation with VEP (Variant Effect Predictor) from GATK .gvcf files 2016-05-16 19:13:36.585 UTC
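The workflow description says GenotypeGVCFs is run from a comma-separated list of g.vcf files. As a minimal sketch of that expansion, the snippet below turns such a list into repeated `-V` arguments with the same two `sed` substitutions used in the GATK_GenotypeGVCFs step. The function name `build_v_args` and the sample file names are hypothetical and exist only for illustration; they are not part of the workflow.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): expand a comma-separated
# list of gVCF paths into repeated "-V <path>" arguments.
build_v_args() {
    # First prefix the whole list with "-V ", then turn every comma into " -V ".
    printf '%s\n' "$1" | sed -e 's/^/-V /' -e 's/,/ -V /g'
}

# Placeholder input with the same shape as the g_VCFs workflow port.
g_vcfs="sample1.g.vcf,sample2.g.vcf"
gVCFlist=$(build_v_args "$g_vcfs")
echo "$gVCFlist"
```

Because the resulting string is later word-split by the shell when it is placed unquoted on the java command line, this approach assumes that none of the paths contains spaces or commas.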
af7c2bf5-1464-4de5-ba41-f44ea7c19bae 2016-02-04 12:23:36.554 UTC 1335c6a0-8b16-4f3d-aee6-c080b1410638 2016-02-15 17:48:59.569 UTC 996c3bb9-e502-4b94-844b-41295af5b9ac 2016-02-15 17:52:04.715 UTC f7957caa-c7e9-4b74-9d5f-621ea6b6985c 2016-02-15 18:41:49.748 UTC 45ab853a-9923-4292-bea5-ca6f00d5af0c 2016-02-05 12:23:32.858 UTC bfee508a-3bd1-4bf0-99f2-87d0dd7e4e8b 2016-02-11 13:11:39.44 UTC 337dc258-1dfd-48b7-908f-00b0b6f033a9 2016-02-04 17:34:05.190 UTC 98c36ec8-b149-423f-851e-3484afc41b26 2016-02-16 16:48:16.578 UTC 14a522ff-29c1-49ec-8e97-bdc8255a55df 2016-02-04 12:33:56.409 UTC f58e728a-761e-4b87-b4d5-c5d30db71000 2016-02-04 16:54:52.992 UTC
Murilo Guimarães Borges 2016-05-16 19:13:27.926 UTC
b09f7078-97cb-413c-a1cf-899e17f5d3ea 2016-02-04 12:33:15.954 UTC 5a58f494-c330-4aff-b34f-869bc74f9d3c 2016-02-12 12:25:52.455 UTC e3d0c358-5ead-4eff-ad4e-66e39e066ff2 2016-02-11 13:12:34.202 UTC 45c20f27-8799-4c02-9acd-efa3985537a3 2016-02-15 18:43:47.994 UTC 458dcc01-2273-4ddc-9e1c-d4f9bb0e67c5 2016-02-04 12:27:19.16 UTC 5de0a432-16a2-4c5d-9571-ce5cb7b51ea0 2016-02-15 18:14:28.43 UTC 00677665-a889-48e1-af0c-183da6f7d943 2016-02-11 12:41:27.740 UTC a1cedb4b-d4e1-474e-95d8-e89ac6a4f22f 2016-02-04 12:28:06.14 UTC
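The Save_VCF step of this workflow copies the annotated file into the output folder under the project name, creating the folder when it is missing. The sketch below shows one way that logic could be written as a single function. It is an illustration under stated assumptions rather than the workflow's own script: the name `save_vcf`, the temporary directory, and the file names are hypothetical, and instead of `realpath` it relies on a non-recursive `cp` reading through a symbolic link so that the copy is a regular file.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): copy a result file into
# an output folder as <project>.vcf, creating the folder if needed.
save_vcf() {
    src=$1
    out_dir=$2
    project=$3
    # mkdir -p creates missing parent folders and succeeds if the folder exists.
    mkdir -p "$out_dir" || return 1
    # A non-recursive cp copies the file a symbolic link points to.
    cp "$src" "$out_dir/$project.vcf"
}

# Demo in a throwaway directory; every name below is a placeholder.
work=$(mktemp -d)
printf '##fileformat=VCFv4.2\n' > "$work/annotated.vcf"
ln -s "$work/annotated.vcf" "$work/FILENAME"
save_vcf "$work/FILENAME" "$work/results/Example" "EXAMPLE"
ls "$work/results/Example"
```

Using `mkdir -p` removes the need for a separate "does the folder exist" branch, and quoting every expansion keeps the function usable when the output path contains spaces.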