I'm using GATK to call SNPs from target capture sequencing data. I'm currently creating the BAM files and was wondering which standard quality measures I should apply to them. I'm aware that I can mark duplicates, but I don't know how this would affect the downstream analyses. Are the duplicates removed, or are they just marked, so that I have to do something else with them? Should I specify any score thresholds for my BAM files?
Thank you very much for your help!