3 hours ago by NGSCanBioinf

I follow the GATK Best Practices in my analysis (mainly the dnaseq pipeline). As recommended, the pipeline calls genotypes jointly on all samples and, at the end, creates an "allSamples.vcf.gz" file.
At this stage, is it better to perform filtering (e.g. removing low read depth variants) and annotation (e.g. gnomAD, CADD, etc.) on this multi-sample VCF, or to first split it into single-sample VCF files and do the downstream analysis on those?

One issue I see with the first approach is that, for a given variant, some samples may have sufficient read depth while others do not, so it comes down to choosing between "variant-specific" filters and "sample-specific" filters. I would appreciate your feedback or suggestions on this matter.
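
To make the "sample-specific" option concrete, here is a minimal sketch in plain Python of what a genotype-level filter on a multi-sample VCF record could look like: genotypes with low per-sample depth are set to missing, and the site is kept only if at least one called genotype remains. The threshold of 10, the function names `mask_low_depth_genotypes` and `site_passes`, and the example record are all assumptions made up for illustration; they are not GATK defaults or part of any pipeline. In practice, tools such as bcftools or GATK VariantFiltration offer genotype-level filtering and would be the more robust choice.

```python
MIN_DP = 10  # assumed depth threshold, purely illustrative


def mask_low_depth_genotypes(vcf_line, min_dp=MIN_DP):
    """Set GT to missing for every sample whose FORMAT/DP is below min_dp."""
    fields = vcf_line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")
    gt_idx = format_keys.index("GT")
    dp_idx = format_keys.index("DP")
    for i in range(9, len(fields)):
        values = fields[i].split(":")
        # Trailing FORMAT values may be omitted, and DP may be "."
        dp = values[dp_idx] if dp_idx < len(values) else "."
        if dp == "." or int(dp) < min_dp:
            values[gt_idx] = "./."
            fields[i] = ":".join(values)
    return "\t".join(fields)


def site_passes(vcf_line, min_called=1):
    """Keep the site if at least min_called samples still have a called GT."""
    fields = vcf_line.rstrip("\n").split("\t")
    gt_idx = fields[8].split(":").index("GT")
    missing = ("./.", ".|.", ".")
    called = sum(1 for s in fields[9:] if s.split(":")[gt_idx] not in missing)
    return called >= min_called


# Hypothetical three-sample record: the second sample has DP=4.
example = "\t".join(
    ["chr1", "12345", ".", "A", "G", "50", "PASS", ".", "GT:DP",
     "0/1:25", "0/1:4", "0/0:12"]
)
masked = mask_low_depth_genotypes(example)
print(masked)
print(site_passes(masked))
```

With this kind of approach the multi-sample VCF stays intact: a low-depth sample loses only its own genotype at that site, while the other samples keep theirs.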
