Asked 4 hours ago by curious

I generated a VCF from whole-exome sequencing BAMs by following the GATK workflow. To speed things up, I restricted the pipeline to the intervals provided for the target regions.

I padded my intervals by 100 bp on either side because I recall reading that capture can still be adequate up to about that far outside the target regions.

From eyeballing the VCF, this seems to be OK for some sites, but in other cases I see a big increase in missing calls, and these almost always fall in or near my padded regions. These variants pass the other standard quality metrics that people sometimes hard filter on.

Is it reasonable to run the VCF through a call-rate filter and keep only variants with a call rate of 99% or above to work around this?
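To be concrete about what I mean by a call-rate filter, here is a minimal sketch in plain Python. It assumes an uncompressed, tab-delimited multi-sample VCF, and it treats a genotype as missing if any allele in the GT field is `.` (so `./1` counts as missing, which is my own choice). The function names `call_rate` and `filter_by_call_rate` are made up for illustration and do not come from GATK or any other tool.

```python
from typing import Iterable, Iterator


def call_rate(vcf_line: str) -> float:
    """Fraction of samples with a fully called genotype at one VCF record."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("record has no sample columns")
    # Column 9 (index 8) is FORMAT; sample columns start at index 9.
    gt_idx = fields[8].split(":").index("GT")
    samples = fields[9:]
    called = 0
    for sample in samples:
        parts = sample.split(":")
        gt = parts[gt_idx] if gt_idx < len(parts) else "."
        # Assumption: any '.' allele makes the whole genotype missing.
        alleles = gt.replace("|", "/").split("/")
        if all(allele != "." for allele in alleles):
            called += 1
    return called / len(samples)


def filter_by_call_rate(lines: Iterable[str], min_rate: float = 0.99) -> Iterator[str]:
    """Yield header lines unchanged and records whose call rate is >= min_rate."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#") or call_rate(line) >= min_rate:
            yield line
```

Used on an open file handle, `filter_by_call_rate(handle, 0.99)` would keep the header plus every site where at least 99% of samples have a genotype. Note that with a small cohort this threshold effectively requires every sample to be called.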


modified 3 hours ago
