I am fairly new to bioinformatics and I'm stuck on how to go about this.
I have been trying to prune SNPs based on LD from a large VCF file. I have mainly followed this really nice tutorial on how to do this in PLINK: evomics.org/learning/population-and-speciation-genomics/2016-population-and-speciation-genomics/fileformats-vcftools-plink/
However, when I convert my PLINK binary files back to VCF format, the INFO and QUAL data from the original VCF file are lost.
I also tried doing this with VCFtools, using:
vcftools --vcf <original vcf file> --snps snps_ld_0.8.prune.in --recode --recode-INFO-all --out <new vcf>
But I end up with either an empty VCF file or one that contains only the header metadata.
The tutorial (linked above) has me create a chromosome map that generates IDs for all the SNPs. Do I need to add these IDs to my original VCF file (which only has '.' in the ID column)? If so, can anyone recommend how I would go about this?
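To make the question concrete, here is a rough sketch of what I imagine "adding the IDs" would look like. It assumes an uncompressed, tab-delimited VCF and that the IDs in snps_ld_0.8.prune.in follow a CHROM:POS pattern. That pattern is only my assumption, and the function name fill_vcf_ids is made up for illustration; the scheme would have to be changed to match whatever the chromosome map actually produced.

```shell
# Hypothetical helper: replace missing VCF IDs ('.') with CHROM:POS.
# ASSUMPTION: CHROM:POS matches the IDs listed in snps_ld_0.8.prune.in;
# adjust the pattern if the prune list uses a different naming scheme.
fill_vcf_ids() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/      { print; next }        # pass header lines through unchanged
         $3 == "." { $3 = $1 ":" $2 }     # only fill in IDs that are missing
                   { print }' "$1"
}

# Usage (file names are placeholders):
#   fill_vcf_ids original.vcf > original.with_ids.vcf
```

The idea would be to then run the vcftools command above on the file with IDs, so that --snps has IDs to match against, while the QUAL and INFO columns are left exactly as they were.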
Thanks for your help.