Posted 3 hours ago by lxiao63

I have downloaded genomic data for the 1000 Genomes (1000G) Phase 1 samples from

I checked the resulting .FAM file (1092 rows, one per sample in the 1000 Genomes Phase 1 release) and noticed a column named member, whose first 20 values are:

HG00096 HG00097 HG00099 HG00100 HG00101 HG00102 HG00103 HG00104
HG00106 HG00108 HG00109 HG00110 HG00111 HG00112 HG00113 HG00114
HG00116 HG00117 HG00118 HG00119

I want to determine the population (e.g., CHB, JPT, CEU) and super population (e.g., EAS, EUR, AFR) for each member ID. To do so, I downloaded the pedigree file from

The pedigree file has 3501 rows rather than 1092. It has a column named Individual ID whose values are HG01879, HG01880, HG01881, etc. However, none of the member IDs in my .FAM file can be found among the 3501 rows of the pedigree file! The two files appear to be completely unrelated.

Is it possible to determine the source population of the 1092 1000 Genomes samples from their member IDs? If so, where can I find the metadata that maps each ID to its population?
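To make the lookup I am after concrete, here is a minimal sketch of the join I would like to run. It is only an illustration under several assumptions: that the member ID is the second whitespace-separated column of a PLINK-style .FAM file, that the metadata file is tab-delimited, and that it has a population column called `Population` (that column name is hypothetical; `Individual ID` is the one I see in the pedigree file). The `SUPER_POP` table is a partial, hand-written lookup, and the toy rows and their population labels are for illustration only, not taken from the real files.

```python
import csv
import io

# Partial, hand-written lookup from population code to super population.
# It is an assumption for this sketch and would need extending for other codes.
SUPER_POP = {"CHB": "EAS", "JPT": "EAS", "CEU": "EUR", "GBR": "EUR", "YRI": "AFR"}


def read_fam_ids(fam_file):
    """Return the member ID (2nd whitespace-separated column) of each .FAM row."""
    return [line.split()[1] for line in fam_file if line.strip()]


def read_population_table(meta_file, id_col="Individual ID", pop_col="Population"):
    """Map sample ID -> population code from a tab-delimited table with a header.

    The default column names are assumptions; adjust them to the real header.
    """
    reader = csv.DictReader(meta_file, delimiter="\t")
    return {row[id_col].strip(): row[pop_col].strip() for row in reader}


def annotate(fam_ids, id_to_pop):
    """Attach (population, super population) to each ID; collect unmatched IDs."""
    annotated, missing = {}, []
    for sample_id in fam_ids:
        pop = id_to_pop.get(sample_id)
        if pop is None:
            missing.append(sample_id)
        else:
            annotated[sample_id] = (pop, SUPER_POP.get(pop, "unknown"))
    return annotated, missing


# Toy inputs standing in for the real files (illustrative values only).
fam_text = (
    "HG00096 HG00096 0 0 0 -9\n"
    "HG00097 HG00097 0 0 0 -9\n"
    "HG00099 HG00099 0 0 0 -9\n"
)
meta_text = (
    "Individual ID\tPopulation\n"
    "HG00096\tGBR\n"
    "HG00097\tGBR\n"
    "HG01879\tACB\n"
)

fam_ids = read_fam_ids(io.StringIO(fam_text))
id_to_pop = read_population_table(io.StringIO(meta_text))
annotated, missing = annotate(fam_ids, id_to_pop)

print(annotated)  # IDs found in the metadata, with population and super population
print(missing)    # IDs with no match, which is what I currently get for every sample
```

With the real files, the `io.StringIO` objects would be replaced by `open(...)` handles. The `missing` list is the part I care about most, since it would show whether the two files truly share no IDs or whether my comparison went wrong somewhere (for example, stray whitespace or comparing the wrong column).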

Thank you.

