Alignment and variant calling with BWA and samtools

Hi,

I am trying a variant calling pipeline with BWA and samtools. For some reason, when I align more sequences to the reference, fewer variants are called and I lose the indel calls (even though the sequences carrying them are still present in the extended list). Could this be because the default parameters only call a variant if it is represented in a minimum percentage of the population?

Here is my pipeline:

Index:

bwa index ref_file

Align:

bwa mem ref_file all_genomes_file > alignement_file.sam

SAM to BAM:

samtools view -S -b alignement_file.sam > alignement_file.bam

Sort:

samtools sort alignement_file.bam -o sorted.bam

Variant calling:

bcftools mpileup -B -f ref_file sorted.bam | bcftools call -mv > var.raw.vcf



The more reads you have, the more reliable variant calling becomes. If using more reads produces fewer variants, it means that the previously detected variants were most likely not real variants.
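
One way to check this is to compare how many SNPs and indels each run reports. Below is a minimal sketch, assuming bcftools is on your PATH; the function name `count_variants` and the VCF file names in the usage line are hypothetical.

```shell
# Print SNP and indel record counts for one VCF.
# count_variants is a hypothetical helper, not part of bcftools.
count_variants() {
    local vcf="$1"
    local snps indels

    if [ ! -f "$vcf" ]; then
        echo "VCF not found: $vcf" >&2
        return 1
    fi

    # -H drops the header lines, -v keeps only the given variant type
    snps=$(bcftools view -H -v snps "$vcf" | wc -l | tr -d ' ')
    indels=$(bcftools view -H -v indels "$vcf" | wc -l | tr -d ' ')

    echo "$vcf snps=$snps indels=$indels"
}
```

Usage would be something like `count_variants small_run.vcf; count_variants large_run.vcf`, which shows at a glance whether it is mainly the indel calls that drop out in the larger run.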

From your pipeline I see that you are not removing, or at least marking, duplicates before calling variants. If your data come from a hybridization enrichment process, this could be a major issue, and you should address it with a duplicate-removal tool such as Picard, sambamba, or samblaster.
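
If you would rather stay within samtools, duplicate marking can be sketched with `samtools markdup`. This assumes paired-end reads and a samtools version that includes `markdup`; the function name `mark_duplicates` and the intermediate file names are made up for illustration.

```shell
# Flag duplicate reads in a BAM file before variant calling.
# Assumption: paired-end reads, samtools with the markdup subcommand.
# mark_duplicates and the temporary file names are hypothetical.
mark_duplicates() {
    local in_bam="$1"    # aligned BAM, any sort order
    local out_bam="$2"   # coordinate-sorted BAM with duplicates flagged
    local prefix="${out_bam%.bam}"

    if [ ! -f "$in_bam" ]; then
        echo "input BAM not found: $in_bam" >&2
        return 1
    fi

    # fixmate needs the reads grouped by name
    samtools sort -n -o "$prefix.namesort.bam" "$in_bam" &&
    # -m adds the mate score tags that markdup relies on
    samtools fixmate -m "$prefix.namesort.bam" "$prefix.fixmate.bam" &&
    # markdup needs coordinate order
    samtools sort -o "$prefix.possort.bam" "$prefix.fixmate.bam" &&
    # flag duplicates (adding -r would remove them instead)
    samtools markdup "$prefix.possort.bam" "$out_bam" &&
    samtools index "$out_bam" &&
    # clean up intermediates only if every step succeeded
    rm -f "$prefix.namesort.bam" "$prefix.fixmate.bam" "$prefix.possort.bam"
}
```

You would call it as `mark_duplicates sorted.bam sorted.markdup.bam` and then run `bcftools mpileup` on `sorted.markdup.bam`. As far as I know, `bcftools mpileup` skips reads flagged as duplicates by default, so marking them should be enough without removing them.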

