I am trying to use GATK's HaplotypeCaller to call variants on pbmm2-aligned PacBio CCS reads, as follows:
gatk HaplotypeCaller --reference ../30L02and17L22.2snps.fa --input m200715_080201_42269_c101526542550000001823318810042135_s1_p0.ccs.bam-vs-25C11and66H02.2snps.sort.bam --output m200715_080201_42269_c101526542550000001823318810042135_s1_p0.ccs.bam-vs-25C11and66H02.2snps.gatk.vcf
One of the variants called, for example, is:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT UnnamedSample
30L02 108 . C G 796.03 . AC=2;AF=1.00;AN=2;DP=18;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=29.43;SOR=0.914 GT:AD:DP:GQ:PL 1/1:0,18:18:54:810,54,0
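To make the depth fields in that record explicit, here is a minimal Python sketch that splits the data line and pulls out the site-level DP (INFO column) and the per-sample DP and AD (FORMAT column). The helper name `depth_fields` is my own and is not part of GATK, and the sketch assumes a single-sample, whitespace-separated record like the one above.

```python
# Hypothetical helper (not a GATK function): extract depth fields
# from one single-sample VCF data line.
def depth_fields(record: str) -> dict:
    cols = record.split()
    # Column 8 (index 7) is INFO: semicolon-separated KEY=VALUE pairs.
    info = dict(kv.split("=", 1) for kv in cols[7].split(";") if "=" in kv)
    # Column 9 (index 8) holds the FORMAT keys, column 10 (index 9) the sample values.
    sample = dict(zip(cols[8].split(":"), cols[9].split(":")))
    return {
        "info_dp": int(info["DP"]),
        "format_dp": int(sample["DP"]),
        "allele_depths": [int(x) for x in sample["AD"].split(",")],
    }

# The data line from the VCF record quoted above.
record = (
    "30L02 108 . C G 796.03 . "
    "AC=2;AF=1.00;AN=2;DP=18;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;"
    "MQ=60.00;QD=29.43;SOR=0.914 GT:AD:DP:GQ:PL 1/1:0,18:18:54:810,54,0"
)

fields = depth_fields(record)
print(fields)
```

Both DP values come out as 18, with AD showing 0 reference and 18 alternate reads, so the reported depth is consistent within the record itself.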
My question: the record reports the coverage as DP=18, but when I look at the alignment used for this analysis, the coverage at this position is 81. It seems that HaplotypeCaller is not using all the available data and may be filtering some reads out (though the reads here look good). Any ideas what's going on and how I can deal with it?