I am trying to merge VCF files across chromosomes 1-22 using bcftools v1.9. The command I am using is:
bcftools merge 'myfile1.vcf.gz' 'myfile2.vcf.gz' ... 'myfile22.vcf.gz' -o myfile1_22.vcf.gz
However, I get the following error: "Error: Duplicate sample names (1310229_1310229), use --force-samples to proceed anyway."
I'm afraid to use --force-samples because I don't understand how it will affect the merged VCF file or how many duplicates there are. The data are from the UK Biobank and the VCF files are massive (1.3 TB in total across chromosomes).
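To get a handle on how many duplicates there are, one option is to compare the sample lists of two of the files. The following is only a minimal sketch: `bcftools query -l` prints the sample names of a VCF, while `list_samples` and `count_shared` are hypothetical helper names made up for illustration. Listing samples should only need the file header, so this is expected to be quick even on very large files, though I have not verified that on data of this size.

```shell
# Hypothetical helper: print the sample IDs of one VCF, one per line.
# Relies on `bcftools query -l`, which lists the samples in a file.
list_samples() {
    bcftools query -l "$1"
}

# Hypothetical helper: count the sample IDs present in BOTH lists.
# $1 and $2 are plain text files with one sample ID per line.
count_shared() {
    # De-duplicate each list first, so that any ID appearing twice
    # in the combined stream must come from both files.
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -d | wc -l | tr -d ' '
}
```

For example, `list_samples myfile1.vcf.gz > chr1.samples.txt` and `list_samples myfile2.vcf.gz > chr2.samples.txt`, followed by `count_shared chr1.samples.txt chr2.samples.txt`, would print the number of samples common to both files. If that number equals the total number of samples in each file, the files likely hold the same individuals split by chromosome, which is the situation `bcftools concat` is meant for rather than `bcftools merge`.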
Any suggestions to actually solve the error rather than use --force-samples?
NOTE: I am very new to biostatistical analysis. I appreciate your advice, and would appreciate it even more if it were structured for a beginner.