Human MSH3 exon 1 variant genotyping

Created: 2018-08-22 10:40:45      Last updated: 2018-08-22 15:48:08

This workflow is designed to genotype the 9 bp tandem repeat region in MSH3 exon 1, as well as the flanking variants described in Flower and Lomeikaite et al. (2018).

A detailed description follows:

1. Input files.

Input files for the workflow are demultiplexed .fastq files from Illumina MiSeq sequencing.

2. Read merging.

To retain haplotype information, forward and reverse paired-end reads are merged using PEAR with default settings.
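The idea behind merging can be illustrated with a minimal Python sketch. It is not PEAR's algorithm: PEAR scores candidate overlaps statistically and takes base qualities into account, whereas this sketch requires an exact overlap. The function names, the fragment sequence and the read lengths are all hypothetical. The point is only that a merged read spans the whole amplicon, so the repeat and its flanking variants end up on a single sequence.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=10):
    """Merge a read pair by exact overlap; return None if no overlap is found.

    Simplification: real mergers such as PEAR evaluate overlaps statistically
    and use base qualities; this sketch demands a perfect match.
    """
    reverse_rc = reverse_complement(reverse)
    longest = min(len(forward), len(reverse_rc))
    # Try the longest possible overlap first, then shorter ones.
    for overlap in range(longest, min_overlap - 1, -1):
        if forward[-overlap:] == reverse_rc[:overlap]:
            return forward + reverse_rc[overlap:]
    return None


# Hypothetical 25 bp fragment sequenced from both ends with 18 bp reads.
fragment = "ACGTTGCAAGGCTTAACCGGTTAGC"
forward_read = fragment[:18]
reverse_read = reverse_complement(fragment[-18:])
merged = merge_pair(forward_read, reverse_read, min_overlap=5)
print(merged == fragment)
```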

3. Demultiplexing.

Merged reads are demultiplexed in a two-step process using the Cutadapt tool, which collects reads that contain both the forward and the reverse gene-specific MSH3 primer sequences.
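The two-step selection can be sketched in Python as follows. This is an illustration only, with several assumptions: the primer strings are placeholders (not the real MSH3 primers), matching is exact (Cutadapt tolerates mismatches), the forward primer is checked before the reverse primer (the order of the two steps is not stated above), and the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def two_step_demultiplex(reads, forward_primer, reverse_primer):
    """Keep reads containing the forward primer, then those that also
    contain the reverse primer.

    In a merged read the reverse primer appears as its reverse complement.
    Returns the reads kept after step 1 and after step 2.
    """
    with_forward = [r for r in reads if forward_primer in r]
    reverse_site = reverse_complement(reverse_primer)
    with_both = [r for r in with_forward if reverse_site in r]
    return with_forward, with_both


# Placeholder primers and reads, for illustration only.
FORWARD_PRIMER = "ACGTAC"
REVERSE_PRIMER = "TTGCAG"
reads = [
    "ACGTACGGGGCTGCAA",  # both primer sites present
    "ACGTACGGGG",        # forward primer only
    "GGGGCTGCAA",        # reverse primer site only
]
step1, step2 = two_step_demultiplex(reads, FORWARD_PRIMER, REVERSE_PRIMER)
print(len(step1), len(step2))
```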

4. FastQC.

FastQC is run after each demultiplexing step.

5. Reference files.

The reference .fasta file for read mapping contains multiple reference sequences that have different numbers and combinations of 9 bp MSH3 repeat units, with flanking sequence similar to that in the human reference genome (GRCh38).
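A multi-reference file of this kind could be generated along the following lines. This is a sketch under stated assumptions: the flanks and the two 9 bp units are artificial placeholders rather than MSH3 sequence, the naming scheme is hypothetical, and enumerating every ordered combination of units is an assumption; the actual reference set used by the workflow is not specified here.

```python
from itertools import product

# Artificial placeholders, NOT real MSH3 sequence.
LEFT_FLANK = "AAAA"
RIGHT_FLANK = "TTTT"
REPEAT_UNITS = {"a": "ACGTACGTA", "b": "CCCGGGTTT"}  # two hypothetical 9 bp units


def build_references(min_units, max_units):
    """Return {name: sequence} for every ordered combination of repeat units
    with between min_units and max_units units, placed between fixed flanks."""
    references = {}
    for n_units in range(min_units, max_units + 1):
        for combo in product(sorted(REPEAT_UNITS), repeat=n_units):
            name = f"repeat_{n_units}_{''.join(combo)}"
            repeat_region = "".join(REPEAT_UNITS[u] for u in combo)
            references[name] = LEFT_FLANK + repeat_region + RIGHT_FLANK
    return references


def to_fasta(references):
    """Format the references as FASTA text, one sequence per record."""
    return "".join(f">{name}\n{seq}\n" for name, seq in references.items())


print(to_fasta(build_references(1, 2)))
```

Because each reference differs only in its repeat region, a read should align best to the reference whose repeat structure matches its own.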

6. Read mapping.

Merged and demultiplexed reads are mapped to the multiple reference sequences using ‘Map with BWA-MEM’.

7. Repeat genotyping.

Mapped .bam files are converted to .sam format using BAM-to-SAM. The .sam files are then filtered with the Filter tool (‘c5 > 0’, i.e. mapping quality above zero) to discard reads with multiple alignments and keep only unique alignments. Repeat counter then determines which reference sequences have the highest numbers of aligned reads, i.e. the repeat genotypes.
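The filtering and counting logic can be sketched in Python. It is not the code of the Galaxy Filter or Repeat counter tools. The sketch relies on column 5 of a SAM record being the mapping quality (MAPQ); the function names, the reference names and the choice to report the two best-supported references (a diploid assumption) are hypothetical.

```python
from collections import Counter


def filter_unique(sam_lines):
    """Keep alignment lines with MAPQ (column 5) > 0; header lines are dropped."""
    kept = []
    for line in sam_lines:
        if line.startswith("@"):
            continue  # SAM header line
        fields = line.rstrip("\n").split("\t")
        if int(fields[4]) > 0:
            kept.append(line)
    return kept


def count_reads_per_reference(sam_lines):
    """Count alignments per reference name (column 3), skipping unmapped reads."""
    counts = Counter()
    for line in sam_lines:
        reference_name = line.split("\t")[2]
        if reference_name != "*":
            counts[reference_name] += 1
    return counts


def call_repeat_genotype(counts, n_alleles=2):
    """Return the n_alleles references with the most reads (assumed diploid)."""
    return [name for name, _ in counts.most_common(n_alleles)]


def _record(read, ref, mapq):
    # Build a minimal 11-column SAM record for the demo below.
    return f"{read}\t0\t{ref}\t1\t{mapq}\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII"


sam = [
    "@HD\tVN:1.6",
    _record("r1", "ref_6", 60),
    _record("r2", "ref_6", 60),
    _record("r3", "ref_7", 0),   # multiply aligned read, removed by the filter
    _record("r4", "ref_3", 30),
    _record("r5", "ref_6", 40),
    _record("r6", "ref_3", 25),
]
unique = filter_unique(sam)
print(call_repeat_genotype(count_reads_per_reference(unique)))
```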

8. Variant genotyping.

Variants are called using Naïve Variant Caller. The resulting .vcf files are then filtered for allele frequency > 0.4 and converted to .tab format.
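The allele frequency filter amounts to the following calculation, shown here as a minimal Python sketch. It does not parse Naïve Variant Caller output: the per-site read counts, the function name and the example values are hypothetical, and only the 0.4 threshold comes from the description above.

```python
def alleles_above_threshold(allele_counts, min_frequency=0.4):
    """Return {allele: frequency} for alleles whose share of reads at a site
    exceeds min_frequency. allele_counts maps allele -> supporting read count."""
    total = sum(allele_counts.values())
    if total == 0:
        return {}
    frequencies = {allele: n / total for allele, n in allele_counts.items()}
    return {a: f for a, f in frequencies.items() if f > min_frequency}


# Hypothetical read counts at two sites.
heterozygous_site = {"A": 55, "G": 45}
homozygous_site = {"A": 95, "G": 5}

print(sorted(alleles_above_threshold(heterozygous_site)))
print(sorted(alleles_above_threshold(homozygous_site)))
```

With a threshold of this size, both alleles at a heterozygous site (each expected at a frequency near 0.5) would pass, while alleles supported by only a small fraction of reads would not.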

9. Output files.

The workflow produces the following output files:

two FastQC read quality reports of demultiplexed reads; 

a .sam file of aligned reads for visualising repeat genotypes;

a .tab file with repeat genotypes;

a .vcf file with variant genotypes;

a .tab file with variant genotype summary.

Workflow components: 20 inputs, 18 steps, 41 outputs.

License: no version of this workflow is licensed.

Version 1 (of 1).
