Consider some toy data that describe two SNPs, sex, and a disease state for a set of individuals.

The SNPs are represented as alternate allele dosages, which range continuously from 0 to 2 because the genotypes are imputed.

I want to test the association between the SNPs and the disease with sex as a covariate.

data <- data.frame("snp1"=c(runif(n=50, min=0, max=.2),
                            runif(n=50, min=1.5, max=2),
                            runif(n=50, min=0, max=.2),
                            runif(n=50, min=1.5, max=2)),
                   "snp2"=c(runif(n=50, min=0, max=.2),
                            runif(n=50, min=0, max=.2),
                            runif(n=50, min=1.5, max=2),
                            runif(n=50, min=1.5, max=2)),
                   "sex"=rbinom(200, 1, 0.5),
                   "disease"=c(rbinom(150, 1, 0.1),
                               rbinom(50, 1, 0.9))
                   )

I think I understand that I could do a single SNP association test like this:

single_snp_test <- glm(disease ~ snp1 + sex, data=data, family="binomial")

Let's say that instead of a single-SNP test I wanted to use these dosage data to test the association of a haplotype that contains both snp1 AND snp2. How would I go about doing this kind of haplotype-based association test? In pseudocode, something like this:

snp1_and_snp2_haplotype_test <- glm(disease ~ (snp1 AND snp2) + sex, data=data, family="binomial")
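
One possible way to make the pseudocode concrete, as a minimal sketch rather than a definitive method: unphased dosages do not say which alleles sit on the same chromosome, so the number of haplotype copies carrying both alternate alleles has to be approximated. The sketch below assumes `pmin(snp1, snp2)` as a crude proxy for that count. It is exact when at least one SNP is homozygous and only an upper bound for double heterozygotes. The names `hap_proxy`, `null_fit`, `hap_fit`, and `joint_fit` are hypothetical, and the `set.seed()` call is added only to make the toy data reproducible. A joint model with an interaction term is included for comparison, since it does not need any assumption about phase.

```r
set.seed(1)  # assumption: fixed seed so the toy data are reproducible

# Rebuild the toy data from the question
data <- data.frame(
  snp1 = c(runif(50, 0, 0.2), runif(50, 1.5, 2),
           runif(50, 0, 0.2), runif(50, 1.5, 2)),
  snp2 = c(runif(50, 0, 0.2), runif(50, 0, 0.2),
           runif(50, 1.5, 2), runif(50, 1.5, 2)),
  sex = rbinom(200, 1, 0.5),
  disease = c(rbinom(150, 1, 0.1), rbinom(50, 1, 0.9))
)

# Assumption: crude proxy for the number of haplotype copies that carry
# the alternate allele at BOTH SNPs (upper bound for double heterozygotes)
data$hap_proxy <- pmin(data$snp1, data$snp2)

# Null model with the covariate only
null_fit <- glm(disease ~ sex, data = data, family = "binomial")

# Haplotype-proxy model: one term for the approximate haplotype dosage
hap_fit <- glm(disease ~ hap_proxy + sex, data = data, family = "binomial")

# Phase-free alternative: both SNPs plus their interaction
joint_fit <- glm(disease ~ snp1 * snp2 + sex, data = data, family = "binomial")

# Likelihood ratio tests against the null model
hap_lrt <- anova(null_fit, hap_fit, test = "Chisq")
joint_lrt <- anova(null_fit, joint_fit, test = "Chisq")

print(hap_lrt)
print(joint_lrt)
```

The proxy model spends one degree of freedom on the haplotype term, while the joint model spends three (snp1, snp2, and their interaction), so the two tests answer slightly different questions. For real data with many heterozygotes, a tool that estimates phase from called genotypes, such as `haplo.glm` in the haplo.stats package, may be a better fit than this proxy.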


