Asked by zpl015, 2 hours ago

I'm a bit confused by the description of Watterson's theta on the Wikipedia page Watterson_estimator.

Wiki says it can be estimated as the number of segregating sites divided by the (n-1)th harmonic number. Earlier on the page, n is defined as the number of haploid individuals in the sample, but I have also seen n used elsewhere for the number of sequences, for example in theta.s: Population Parameter THETA using Segregating Sites. Is the number of sequences = number of individuals * number of chromosomes?
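To make sure I am reading the formula correctly, here is a minimal sketch of it in Python. The function names and the example numbers are made up for illustration, and the sketch assumes that n counts sampled sequences (chromosome copies), so 10 diploid individuals would give n = 20.

```python
def harmonic_number(k: int) -> float:
    """Return H_k = 1 + 1/2 + ... + 1/k (0.0 when k is 0)."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta(segregating_sites: int, n_sequences: int) -> float:
    """Watterson's theta: S divided by the (n-1)th harmonic number.

    Assumption: n_sequences is the number of sampled sequences
    (chromosome copies), not the number of diploid individuals.
    """
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    return segregating_sites / harmonic_number(n_sequences - 1)


# Hypothetical example: 10 diploid individuals -> 20 sequences, 100 segregating sites.
n = 10 * 2
theta = watterson_theta(segregating_sites=100, n_sequences=n)
print(theta)
```

With this reading, the result is a single number for the whole region from which the segregating sites were counted.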

Also, I thought the number of segregating sites was the number of variants across the whole genome called from a sample of x individuals, so the formula should give a single number. How come some software, such as ANGSD, calculates Watterson's theta per site?
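One reading I can think of is that "per site" just means the region-wide estimate divided by the number of sites surveyed. The sketch below shows that normalization only; the function name and numbers are hypothetical, and I am assuming this interpretation rather than claiming it is how ANGSD computes its per-site values internally.

```python
def watterson_theta_per_site(segregating_sites: int, n_sequences: int, n_sites: int) -> float:
    """Region-wide Watterson's theta divided by the number of sites surveyed.

    Assumption: n_sequences counts sampled sequences (chromosome copies) and
    n_sites counts all surveyed positions, variable or not.
    """
    if n_sequences < 2 or n_sites < 1:
        raise ValueError("need at least two sequences and one site")
    # a_n is the (n-1)th harmonic number: 1 + 1/2 + ... + 1/(n-1)
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / a_n / n_sites


# Hypothetical example: 100 segregating sites among 20 sequences over 50,000 sites.
print(watterson_theta_per_site(100, 20, 50_000))
```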

Many thanks for your help in advance.
