2 hours ago by curious

I am trying to merge 3 sets of BCFs that contain hundreds of millions of sites for tens of thousands of samples. The three sets have exactly the same sites, just different samples. Running bcftools merge 1.bcf 2.bcf 3.bcf -Ob > merged.bcf looks like it is going to take days, maybe even weeks, at the rate it is going. Even though BCF and PLINK binary are both binary formats, would it be faster to first convert each BCF to PLINK and then run:

plink --make-bed --merge-list merge_list.txt --out merged

where merge_list.txt lists the prefix of the PLINK binary fileset made from each BCF, one per line:

1
2
3
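
To make the idea concrete, here is a rough dry-run sketch of the whole workflow. It only writes merge_list.txt and prints the PLINK commands rather than running them. It assumes PLINK 1.9 (whose --bcf option should read BCF directly) and that the inputs are named 1.bcf, 2.bcf and 3.bcf as above. I have not timed any of this, so treat it as an outline and not a tested recipe.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Assumptions: PLINK 1.9 (--bcf support) and inputs named 1.bcf, 2.bcf, 3.bcf.

prefixes="1 2 3"

# Start with an empty merge list, then add one fileset prefix per line.
: > merge_list.txt
for p in $prefixes; do
    echo "$p" >> merge_list.txt
    # Conversion step for each BCF (printed only; drop 'echo' to run it).
    echo plink --bcf "$p.bcf" --make-bed --out "$p"
done

# Merge step from the question (printed only).
echo plink --make-bed --merge-list merge_list.txt --out merged
```

Dropping the leading echo from the plink lines would run the conversions and the merge for real.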


