VCF Filter On Small Genomes
I am working with NGS data from a yeast species (Candida glabrata) to find mutations related to drug resistance. I am new to bioinformatics, so I am using Galaxy.eu to get used to the algorithms. According to the literature, mutations in certain genes are related to drug resistance. So I decided to download the .gb file of one of these genes (PDR1, approximately 3.5 kbp) from NCBI and map my FASTQ reads (WGS data) to it with BWA-MEM. Then I followed these steps, in order:
1) Remove duplicates with MarkDuplicates
2) Left-align indels with BamLeftAlign
3) Filter by mapping quality and remove unpaired reads
4) Call variants with FreeBayes (ploidy set to 1)
After calling variants with FreeBayes, I used VCFfilter on the VCF file to get rid of low-quality variant calls.
However, no variants remained after I ran the filter.
So I decided to annotate the variants in the original (unfiltered) VCF file using SnpEff, and then use SnpSift to extract fields and look at the quality measurements of the original VCF file.
I found that the QUAL, DP, AO and RO values look good (RO was generally 0 and AO was generally above 300), but the SRP, SAP and EPP values are low (mostly below 20, and the SRP values are nearly all 0).
I know that the SRP, SAP and EPP values are important for assessing strand and position bias.
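For reference, here is a minimal Python sketch of how I am reading these fields. It assumes that SRP, SAP and EPP are Phred-scaled probabilities, as the FreeBayes VCF header describes them, so a value Q corresponds to a probability of 10^(-Q/10). The INFO string below is a made-up record that only mimics the kind of values I see, and the function names are my own, not part of any tool.

```python
# Hypothetical INFO column for one record; the numbers are invented
# for illustration and are not taken from my real VCF file.
EXAMPLE_INFO = "DP=320;RO=0;AO=318;SRP=0;SAP=12.5;EPP=8.1"

# Bias-related fields to summarise.
BIAS_FIELDS = ("SRP", "SAP", "EPP")


def parse_info(info):
    """Split a VCF INFO string into a dict of key -> raw string value."""
    parsed = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            parsed[key] = value
        elif item:
            parsed[item] = "true"  # flag field without a value
    return parsed


def phred_to_prob(q):
    """Convert a Phred-scaled value Q to a probability, p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def bias_summary(info):
    """Return {field: (phred_value, probability)} for the bias fields present."""
    parsed = parse_info(info)
    summary = {}
    for name in BIAS_FIELDS:
        if name in parsed:
            # Per-allele fields can be comma-separated; use the first value only.
            q = float(parsed[name].split(",")[0])
            summary[name] = (q, phred_to_prob(q))
    return summary


if __name__ == "__main__":
    for name, (q, p) in bias_summary(EXAMPLE_INFO).items():
        print(f"{name}: phred={q:g} probability={p:.3g}")
```

Under this reading, a Phred value of 20 corresponds to a probability of 0.01 and a value of 0 corresponds to a probability of 1.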
My initial question is: can working on a small gene (approximately 3.5 kbp) be the reason for such low values?
I would be very grateful if you have an explanation.