Let me know if this is off topic. I considered Stack Exchange, but I was worried about the domain-specific content.

I am performing association tests between a binary disease outcome and a large set of multiallelic sites. Each site has *m* alleles, and *Allele* represents the allelic dosage (0, 1, or 2) of a given allele. I leave the most common allele out of the model so that it serves as the reference, which leaves *m - 1* allele terms.

I am including sex and the first *K* principal components (*PCs*) as covariates.

Each multiallelic site is tested for association with the disease separately. This is the model I am using to describe the test for a single multiallelic site:

$$\text{log}(\text{odds of disease}) = \beta_0 + \left(\sum\limits_{j=1}^{m-1}\beta_{1,j}Allele_j\right) + \left(\sum\limits_{k=1}^{K} \beta_{2,k}PC_k\right) + \beta_{3}Sex$$
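
To make the coding concrete, here is a minimal Python sketch of how one row of the design matrix could be built for a single site under this model. The function name `build_design_rows`, the input layout (genotypes as pairs of allele labels, sex coded 0/1), and the toy data are all assumptions for illustration, not my actual pipeline. Ties for the most common allele are broken alphabetically here, which is also just an assumption.

```python
from collections import Counter


def build_design_rows(genotypes, pcs, sex):
    """Build design-matrix rows for one multiallelic site (hypothetical helper).

    genotypes: list of (allele, allele) label pairs, one pair per individual
    pcs:       list of K principal-component values per individual
    sex:       list of 0/1 codes, one per individual (assumed coding)
    """
    # Count allele copies over all individuals to find the most common allele.
    counts = Counter(a for pair in genotypes for a in pair)
    # The most common allele becomes the reference (ties broken alphabetically).
    reference = max(sorted(counts), key=lambda a: counts[a])
    # The remaining m - 1 alleles each get their own dosage column.
    non_ref = [a for a in sorted(counts) if a != reference]

    rows = []
    for (a1, a2), pc, s in zip(genotypes, pcs, sex):
        # Dosage = number of copies (0, 1, or 2) of each non-reference allele.
        dosages = [float((a1 == a) + (a2 == a)) for a in non_ref]
        # Column order: intercept, Allele_1..Allele_{m-1}, PC_1..PC_K, Sex.
        rows.append([1.0] + dosages + [float(x) for x in pc] + [float(s)])
    return reference, non_ref, rows


# Toy data (made up): 5 individuals, m = 3 alleles, K = 2 PCs.
genotypes = [("A", "A"), ("A", "B"), ("B", "C"), ("A", "C"), ("A", "A")]
pcs = [[0.1, -0.2], [0.0, 0.3], [-0.1, 0.1], [0.2, 0.0], [0.05, -0.05]]
sex = [0, 1, 1, 0, 1]

reference, non_ref, rows = build_design_rows(genotypes, pcs, sex)
print(reference, non_ref)
print(rows[2])
```

Each row has 1 + (*m - 1*) + *K* + 1 columns, one per coefficient in the equation, so the rows could be passed together with the binary outcome to any logistic regression routine.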

Does this follow a convention that makes sense? If not, are there adjustments that could make it clearer? Thank you.
