How to combine multiple SNP array datasets genotyped on different versions of the same array design (GSA)?
Greetings! I plan to do a genetic association analysis for my case-control study. I have three datasets from different centers, containing non-overlapping individuals, with all cases suffering from the same disease.
Data1 = X samples total (A cases, B controls), genotyped on Illumina GSA-V1
Data2 = Y samples total (C cases, D controls), genotyped on Illumina GSA-V3
Data3 = Z samples total (E cases, F controls), genotyped on Illumina GSA-MD-V3
I need to combine the above datasets and run the association test on the combined cases against the combined controls. (This is not a meta-analysis.)
My main concern here is regarding merging the three datasets genotyped on different versions of the GSA array.
Q1: Can I treat the three versions of the GSA array as the same array design, or should I treat them as separate designs?
Q2: Can I start with a simple merge? I know it won't really be simple. What I mean is: can I first merge all three datasets, presuming the same or a similar array design, and only then go ahead with the QC steps and downstream analysis? I have been doing a lot of reading about this, but it is making me even more confused.
Extremely sorry if my questions sound naïve, but I am quite new to this field and still have a lot to learn. Any suggestions would be greatly appreciated.
I plan to do my analysis in PLINK/R.
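To make Q2 more concrete, here is a rough sketch of the overlap check I imagine doing before any merge. It is written in Python purely for illustration and is not an established protocol. It assumes each dataset has already been converted to PLINK binary format, that all three .bim files are on the same genome build, and that matching on chromosome, position and allele pair is good enough for a first look. The function names and file names are hypothetical, and strand flips are not handled.

```python
from pathlib import Path


def read_bim(path):
    """Map (chrom, bp, unordered allele pair) -> variant ID for one PLINK .bim file."""
    variants = {}
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        # .bim columns: chromosome, variant ID, cM position, bp position, allele 1, allele 2
        chrom, var_id, _cm, bp, a1, a2 = line.split()[:6]
        key = (chrom, int(bp), frozenset((a1.upper(), a2.upper())))
        variants[key] = var_id
    return variants


def shared_variants(bim_paths):
    """Return the (chrom, bp, alleles) keys that are present in every .bim file."""
    tables = [read_bim(p) for p in bim_paths]
    common = set(tables[0])
    for table in tables[1:]:
        common &= set(table)
    # Sort by chromosome label (as text), then position, then alleles.
    return sorted(common, key=lambda k: (k[0], k[1], sorted(k[2])))
```

For example, `shared_variants(["data1.bim", "data2.bim", "data3.bim"])` (hypothetical file names) would list the variants present on all three array versions. Variant IDs are deliberately ignored when matching, since I am not sure they are consistent between GSA versions.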
Thank you all in advance.