gravatar for yolek64754

2 hours ago by

Hi there,
I have been trying to compute "Pi" on my Ch22 using VCFtools; this is the pipeline I am using:

Data=$1 

bcftools mpileup --ignore-RG -f /ref/human_g1k_v37.fa.gz $Data | bcftools call -m -Oz -f GQ -o $Data"_ND.vcf.gz" 

zcat $Data"_ND.vcf.gz" | vcftools --vcf - --site-pi --out $Data"_nucleotide_diversity" 

awk '$1== "22" { print $0 }' $Data"_nucleotide_diversity.sites.pi" > $Data"_nucleotide_diversity22.sites.pi"

Then the OUTPUT looks like this:

22      16060564        0
22      16060565        0
22      16060566        0
22      16060567        1
22      16060568        0
22      16060569        0
22      16060570        0
22      16060571        0
22      16060572        0
22      16060573        0
22      16060574        0
22      16060575        0

I was adding all the value of the 3rd column and then dividing it by the total number of sites on my chromosome, but after being the results, it does seems that this looks like I am computing the heterozegozity level and not Pi ?

How can I get the Pi (nucleotide diversity) using this output ? Or am I doing anything wrong in the command before ?

Thanks

link

modified 2 hours ago

written
2 hours ago
by

yolek6475430



Source link