BowtieToPileup
Created: 2010-09-16 11:07:53
This example workflow aligns short sequencing reads to a reference genome using Bowtie and generates a SAMtools pileup file. By analysing an actual data set (SNP detection in N. vitripennis) and translating that analysis pipeline into a Taverna workflow, I arrived at an easy way of using Taverna for this kind of analysis. I created a Java API (with my limited Java experience) that wraps the command-line programs used in the pipeline: Bowtie and some of the SAMtools. The data itself is never passed through Taverna or the API; only references to files are passed. The API has no main entry point. Instead, each step in the pipeline is represented by a short Beanshell script that calls the appropriate API method, and these scripts are used as services. This workflow is part of my bachelor's thesis (bioinformatics at the Hanze University Groningen, the Netherlands). Please note that Bowtie and the SAMtools need to be installed and on the PATH. The API needs to be present in the .taverna/lib directory; please check the dependencies of the Beanshell services. The workflow assumes Linux.
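The file-reference idea can be illustrated with a small sketch. The author's API is written in Java and is not shown on this page, so the Python below is only an illustration of the pattern, not the actual implementation; the function names `sam_to_bam_command` and `sam_to_bam` and the choice of deriving the BAM name from the SAM name are assumptions. The step receives a file name, runs the command-line tool, and hands back nothing but the name of the file it produced.

```python
import subprocess
from pathlib import PurePosixPath  # the workflow assumes Linux-style paths


def sam_to_bam_command(sam_name):
    """Return (argument list, BAM file name) for a SAM to BAM conversion.

    Hypothetical helper: the output name is derived by swapping the extension.
    """
    bam_name = str(PurePosixPath(sam_name).with_suffix(".bam"))
    # samtools view: -b writes BAM, -S declares SAM input, -o names the output
    return ["samtools", "view", "-bS", "-o", bam_name, sam_name], bam_name


def sam_to_bam(sam_name, run=subprocess.run):
    """Run the conversion and return only a reference to the result file."""
    command, bam_name = sam_to_bam_command(sam_name)
    run(command, check=True)  # raises if samtools exits with an error
    return bam_name
```

Because only the file name crosses the service boundary, the workflow engine never has to hold the alignment data in memory. The `run` parameter exists only so the sketch can be tried without SAMtools installed.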
Workflow Components
Authors (1)
Titles (1)
Descriptions (5)
This example workflow aligns short sequencing reads to a reference genome using Bowtie and generates a SAMtools pileup file. The main problem with sequencing data is its sheer volume. By analysing an actual data set (SNP detection in N. vitripennis) and translating this pipeline into a Taverna workflow, I arrived at an easy way of using Taverna for this kind of analysis. I created a Java API (with my limited Java experience) that wraps the command-line programs used in the pipeline: Bowtie and some of the SAMtools. The data itself is never passed through Taverna or the API; only references to files are passed. The API has no main entry point. Instead, each step in the pipeline is represented by a short Beanshell script that calls the appropriate API method, and these scripts are used as services. This workflow is part of my bachelor's thesis (bioinformatics at the Hanze University Groningen, the Netherlands). |
This workflow
This workflow is part of my bachelor's thesis (bioinformatics at the Hanze University Groningen, the Netherlands). |
This example workflow aligns short sequencing reads to a reference genome using Bowtie and generates a SAMtools pileup file. The main problem with sequencing data is its sheer volume. By analysing an actual data set (SNP detection in N. vitripennis) and translating this pipeline into a Taverna workflow, I arrived at an easy way of using Taverna for this kind of analysis. I created a Java API (with my limited Java experience) that wraps the command-line programs used in the pipeline: Bowtie and SAMtools/Picard
This workflow is part of my bachelor's thesis (bioinformatics at the Hanze University Groningen, the Netherlands). |
This example workflow aligns short sequencing reads to a reference genome using Bowtie and generates a SAMtools pileup file. By analysing an actual data set (SNP detection in N. vitripennis) and translating that analysis pipeline into a Taverna workflow, I arrived at an easy way of using Taverna for this kind of analysis. I created a Java API (with my limited Java experience) that wraps the command-line programs used in the pipeline: Bowtie and some of the SAMtools. The data itself is never passed through Taverna or the API; only references to files are passed. The API has no main entry point. Instead, each step in the pipeline is represented by a short Beanshell script that calls the appropriate API method, and these scripts are used as services. This workflow is part of my bachelor's thesis (bioinformatics at the Hanze University Groningen, the Netherlands). Please note that Bowtie and the SAMtools need to be installed and on the PATH. The API needs to be present in the .taverna/lib directory; please check the dependencies of the Beanshell services. The workflow assumes Linux. |
Takes raw reads and a reference genome and returns a pileup. |
Dependencies (1)
Inputs (7)
Name | Description
forewardReadsFileNames | A list of all forward reads files. These are generally the files starting with s_N_1, where N is the number of the pair.
reverseReadsFileNames | A list of all reverse reads files. These are generally the files starting with s_N_2, where N is the number of the pair.
alignmentBasename | Full path and name of the desired alignment file, without an extension. A log file with the same name (with the extension .log) will be created.
referenceGenome | Full path and name of the reference genome file.
indexBasename | Desired Bowtie index base name. bowtie-build will generate 6 files with this base name: .1.ebwt, .2.ebwt, .3.ebwt, .4.ebwt, .rev.1.ebwt, and .rev.2.ebwt. Bowtie will use this index; the original reference genome file is no longer used by Bowtie.
relativeIndexLocation | Location to write the index to, relative to the reference genome. Do not use leading slashes unless you wish to move up in the directory structure using '../'.
pileupBasename | Desired name for the pileup file (without path).
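To make the relationship between referenceGenome, relativeIndexLocation and indexBasename concrete, here is a minimal Python sketch. How the author's Java API actually resolves the relative location is not shown on this page, so joining it onto the directory of the reference genome is an assumption, and the helper names `index_base_path` and `expected_index_files` are hypothetical. The six suffixes are the ones listed in the indexBasename description.

```python
import posixpath  # the workflow assumes Linux-style paths

# Suffixes of the six files bowtie-build writes, as listed for indexBasename.
EBWT_SUFFIXES = (".1.ebwt", ".2.ebwt", ".3.ebwt", ".4.ebwt",
                 ".rev.1.ebwt", ".rev.2.ebwt")


def index_base_path(reference_genome, relative_index_location, index_basename):
    """Resolve the index base name relative to the reference genome's directory.

    Assumption: "relative to the reference genome" means relative to the
    directory that contains the reference genome file.
    """
    reference_dir = posixpath.dirname(reference_genome)
    joined = posixpath.join(reference_dir, relative_index_location, index_basename)
    return posixpath.normpath(joined)  # collapses any '../' components


def expected_index_files(index_base):
    """List the index files that should exist after bowtie-build has run."""
    return [index_base + suffix for suffix in EBWT_SUFFIXES]
```

A list like this could be used to check that the index is complete before starting the alignment. Note that with `posixpath.join`, a location starting with a slash would replace the reference directory entirely, which is one reason to avoid leading slashes.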
Processors (6)
Name | Type | Description
Bowtie_build | workflow | bowtieBuild creates a Bowtie-specific index of the reference genome.
Bowtie | workflow | Bowtie aligns the reads to the reference genome.
samToBam | workflow | SAMtools SAM to BAM conversion.
FilterAndSort | workflow | Filters all unaligned reads from the input BAM file and sorts the rest. Outputs the name and location of the filtered and sorted BAM file.
Pileup | workflow | Generates a pileup (a list with information on each genomic position) from a filtered and sorted BAM file.
indexReference | workflow | indexReference creates a SAMtools-specific index of the reference genome.
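The six processors roughly correspond to the command lines sketched below. This page does not state the exact flags the author's API passes, so the options, the intermediate file names and the function name `pipeline_steps` are assumptions. The sketch uses the legacy syntax of the Bowtie 1 and SAMtools 0.1.x releases of that period; later SAMtools releases replaced `pileup` with `mpileup` and changed the `sort` arguments.

```python
def pipeline_steps(reference, index_base, forward_reads, reverse_reads,
                   alignment_base, pileup_name):
    """Return (argv, stdout_file) pairs, one or two per processor.

    stdout_file is None unless the command's standard output should be
    redirected to a file. All intermediate names are hypothetical.
    """
    sam = alignment_base + ".sam"
    bam = alignment_base + ".bam"
    mapped = alignment_base + ".mapped.bam"
    sorted_prefix = alignment_base + ".sorted"
    return [
        # Bowtie_build: build the Bowtie index of the reference genome
        (["bowtie-build", reference, index_base], None),
        # Bowtie: paired-end alignment, SAM output (-S), comma-separated mates
        (["bowtie", "-S", index_base,
          "-1", ",".join(forward_reads),
          "-2", ",".join(reverse_reads), sam], None),
        # indexReference: SAMtools index (.fai) of the reference genome
        (["samtools", "faidx", reference], None),
        # samToBam: convert SAM to BAM
        (["samtools", "view", "-bS", "-o", bam, sam], None),
        # FilterAndSort: drop unmapped reads (flag 4), then sort (legacy syntax)
        (["samtools", "view", "-b", "-F", "4", "-o", mapped, bam], None),
        (["samtools", "sort", mapped, sorted_prefix], None),
        # Pileup: legacy 'pileup' writes to standard output
        (["samtools", "pileup", "-f", reference, sorted_prefix + ".bam"],
         pileup_name),
    ]
```

Returning argument lists instead of shell strings avoids quoting problems with file names, and keeps the sketch testable without the tools installed.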
Beanshells (6)
Name | Description | Inputs | Outputs
pileup | | sortedBamName, referenceGenomeFilename, pileupName, faidx | rawPileup
samToBam | | samName | bamName
IndexReference | | referenceGenomeFilename | refIndex
bowtie | | readsFileNames, reverseReadsFileNames, indexBasename, alignmentBasename | samLocationAndBasename, err
bowtieBuild | | referenceGenome, indexBasename, indexLocation | bowtieIndexBasename, err
filterAndSort | | bamName | sortedBamName
Outputs (8)
Name | Description
bowtie_err | String that may contain some of Bowtie's error output.
bowtieBuild_err | String that may contain some of Bowtie's error output.
Pileup_rawPileup | Final workflow output: location and name of the pileup file.
intermediate_bowtieIndexBasename | Intermediate output for bowtie-build: the base name of the index files.
intermediate_bowtie_samLocationAndBasename | Intermediate output for Bowtie: the path and base name of the created SAM alignment file.
intermediate_samToBam_bamName | Intermediate output for samToBam: the resulting BAM file.
intermediate_filterAndSort_sortedBamName | Intermediate output for filterAndSort: the resulting BAM file.
intermediate_refIndex | Intermediate output for indexReference: the SAMtools index.
Datalinks (22)
Source | Sink
referenceGenome | Bowtie_build:referenceGenome
indexBasename | Bowtie_build:indexBasename
relativeIndexLocation | Bowtie_build:indexLocation
forewardReadsFileNames | Bowtie:bowtie_readsFileNames
reverseReadsFileNames | Bowtie:bowtie_reverseReadsFileNames
alignmentBasename | Bowtie:bowtie_alignmentBasename
Bowtie_build:bowtieBuild_bowtieIndexBasename | Bowtie:bowtie_indexBasename
Bowtie:bowtie_samLocationAndBasename | samToBam:samToBam_samName
samToBam:samToBam_bamName | FilterAndSort:filterAndSort_bamName
indexReference:refIndex | Pileup:faidx
pileupBasename | Pileup:pileupName
referenceGenome | Pileup:referenceGenomeFilename
FilterAndSort:sortedBamName | Pileup:sortedBamName
referenceGenome | indexReference:referenceGenomeFilename
Bowtie:bowtie_err | bowtie_err
Bowtie_build:err | bowtieBuild_err
Pileup:rawPileup | Pileup_rawPileup
Bowtie_build:bowtieBuild_bowtieIndexBasename | intermediate_bowtieIndexBasename
Bowtie:bowtie_samLocationAndBasename | intermediate_bowtie_samLocationAndBasename
samToBam:samToBam_bamName | intermediate_samToBam_bamName
FilterAndSort:sortedBamName | intermediate_filterAndSort_sortedBamName
indexReference:refIndex | intermediate_refIndex
Coordinations (1)
Controller | Target
Bowtie | indexReference
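The datalinks and the coordination together determine the order in which the processors can run. The Python sketch below derives one valid order. It reads the Controller/Target row as "indexReference starts only after Bowtie has finished", which is an assumption about the coordination semantics; the dictionary is hand-copied from the processor-to-processor datalinks above, and the names `DATA_DEPENDENCIES`, `CONTROL_LINKS` and `execution_order` are hypothetical. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each processor mapped to the processors whose outputs it consumes,
# taken from the processor-to-processor datalinks.
DATA_DEPENDENCIES = {
    "Bowtie_build": set(),
    "Bowtie": {"Bowtie_build"},
    "samToBam": {"Bowtie"},
    "FilterAndSort": {"samToBam"},
    "indexReference": set(),
    "Pileup": {"FilterAndSort", "indexReference"},
}

# (controller, target) pairs from the Coordinations table.
CONTROL_LINKS = [("Bowtie", "indexReference")]


def execution_order(data_dependencies, control_links):
    """Return one processor order that respects data and control links."""
    graph = {name: set(deps) for name, deps in data_dependencies.items()}
    for controller, target in control_links:
        # Assumption: the target may only start once the controller is done.
        graph[target].add(controller)
    return list(TopologicalSorter(graph).static_order())
```

Without the control link, indexReference has no incoming datalinks from other processors and could start at the same time as Bowtie_build; the coordination is the only thing that delays it until after the alignment.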
Uploader
License
All versions of this Workflow are
licensed under:
Version 1
(of 1)