GWA QC needed for combining genotyping array data from different chip versions
I have received a set of genotyping data in PLINK (bed) format.
- Of these, around 950 European samples were genotyped on the Illumina GSAv1 array and around 70 samples on the Illumina GSAv2 array.
- 96% of SNP IDs match between the two chip versions.
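A minimal sketch of how such a shared-SNP set could be derived and sanity-checked from the two `.bim` files, assuming the standard six-column PLINK layout (chromosome, SNP ID, cM, bp, allele 1, allele 2). The function names and the toy records are hypothetical, and the allele check only compares unordered allele pairs, so it does not detect strand flips:

```python
def read_bim(lines):
    """Parse PLINK .bim lines into {snp_id: (chrom, bp, a1, a2)}."""
    variants = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        chrom, snp_id, _cm, bp, a1, a2 = fields[:6]
        variants[snp_id] = (chrom, int(bp), a1.upper(), a2.upper())
    return variants


def shared_snps(bim_a, bim_b):
    """Split IDs present on both chips into consistent and mismatched lists.

    A shared ID is 'consistent' when chromosome, position and the unordered
    allele pair agree; anything else is reported as 'mismatched'.
    """
    consistent, mismatched = [], []
    for snp_id in bim_a.keys() & bim_b.keys():
        chrom_a, bp_a, *alleles_a = bim_a[snp_id]
        chrom_b, bp_b, *alleles_b = bim_b[snp_id]
        same_pos = (chrom_a, bp_a) == (chrom_b, bp_b)
        same_alleles = set(alleles_a) == set(alleles_b)
        if same_pos and same_alleles:
            consistent.append(snp_id)
        else:
            mismatched.append(snp_id)
    return sorted(consistent), sorted(mismatched)


# Toy records standing in for the GSAv1 and GSAv2 .bim files (made up).
gsa_v1 = read_bim(["1 rs1 0 1000 A G", "1 rs2 0 2000 C T", "2 rs3 0 500 A C"])
gsa_v2 = read_bim(["1 rs1 0 1000 G A", "1 rs2 0 2500 C T", "2 rs4 0 900 G T"])
keep, check = shared_snps(gsa_v1, gsa_v2)
print("consistent:", keep)
print("mismatched:", check)
```

With real data, the lines would come from `open("file.bim")`, and the consistent IDs could be written to a text file for PLINK's `--extract`.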
I fed the genotyping data (of the shared SNPs) to my analysis pipeline and created the PC plots:
- PCs of anchored samples for assigning population ancestry:
- PCs of the EUR population before GWA. The results appear to overlap between chips [images below].
However, when I run a GWA against chip version (with 10 PCs and age as covariates), I get some significant hits [image below].
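A minimal sketch of how such chip-associated hits could be collected into an exclusion list, assuming chip version was coded as the case/control phenotype and the results are in PLINK 1.9 `.assoc.logistic` layout (header `CHR SNP BP A1 TEST NMISS OR STAT P`, with one `ADD` row per SNP plus covariate rows). The function name, the Bonferroni cutoff and the toy rows are illustrative assumptions, not an established threshold for this kind of QC:

```python
def chip_associated_snps(lines, alpha=0.05):
    """Return SNP IDs whose additive-test p-value passes a Bonferroni cutoff.

    `lines` is an iterable of whitespace-delimited rows, header first.
    Covariate rows (TEST != 'ADD') and rows with P == 'NA' are ignored.
    """
    rows = iter(lines)
    header = next(rows).split()
    col = {name: i for i, name in enumerate(header)}
    pvals = {}
    for line in rows:
        fields = line.split()
        if len(fields) != len(header):
            continue  # skip blank or truncated rows
        if fields[col["TEST"]] != "ADD" or fields[col["P"]] == "NA":
            continue
        pvals[fields[col["SNP"]]] = float(fields[col["P"]])
    if not pvals:
        return []
    threshold = alpha / len(pvals)  # Bonferroni over the SNPs actually tested
    return sorted(snp for snp, p in pvals.items() if p < threshold)


# Toy results standing in for a real .assoc.logistic file (values made up).
toy = [
    "CHR SNP BP A1 TEST NMISS OR STAT P",
    "1 rs1 1000 A ADD 1020 8.1 6.1 1e-9",
    "1 rs1 1000 A COV1 1020 1.0 0.1 0.9",
    "1 rs2 2000 C ADD 1020 1.1 1.3 0.2",
    "2 rs3 500 A ADD 1020 NA NA NA",
]
print(chip_associated_snps(toy))
```

The returned IDs could be written one per line and passed to PLINK's `--exclude` before the main analysis.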
I would be very grateful for any recommendations or suggestions on how to go about combining the two genotyping datasets.
Many thanks in advance for your feedback and time.