How to use 1000 Genomes data for LDheatmap package in R

I am trying to visualize LD blocks within 1 Mb flanking a SNP. I don't want to use Haploview because it relies on HapMap 3 (build 17 assembly), which is quite outdated. Instead, I downloaded SNP data from 1000 Genomes phase 3 using the online "VCF to PED converter" tool, which gave me a .ped file and an .info file. I then tried the R package LDheatmap, which can compute LD as r^2 and display it as a heatmap. However, the .ped and .info files from 1000 Genomes are not in a format that LDheatmap accepts as input.

The example data set for LDheatmap, "CEUData", contains a data frame and a vector. The format is like this:

  • CEUSNP: A data frame of SNP genotypes. Each row represents an individual and each column represents a SNP. The column headers are the SNP IDs.
  • CEUDist: A vector of integers, representing SNP physical map locations on the chromosome.

Does anyone know how to convert the .ped and .info files from 1000 Genomes into compatible inputs (a data frame and a vector) for the LDheatmap package in R?
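For reference, here is a minimal sketch of the kind of conversion I have in mind, in base R. It rests on assumptions I have not verified against the converter's output: that the .ped file is whitespace-delimited with six leading pedigree columns followed by two allele columns per SNP, that "0" codes a missing allele, and that the .info file has exactly two columns (SNP ID, then position). The function names `ped_to_genotypes` and `read_1000g_ped` and the file names in the usage comments are made up for illustration.

```r
# Turn a PED table into a data frame of "A/G"-style genotype strings.
# Assumes 6 pedigree columns, then 2 allele columns per SNP, "0" = missing.
ped_to_genotypes <- function(ped, snp_ids) {
  alleles <- as.matrix(ped[, -(1:6), drop = FALSE])
  stopifnot(ncol(alleles) == 2 * length(snp_ids))
  first  <- alleles[, seq(1, ncol(alleles), by = 2), drop = FALSE]
  second <- alleles[, seq(2, ncol(alleles), by = 2), drop = FALSE]
  # paste() flattens column-wise, so rebuild the matrix with the same row count
  geno <- matrix(paste(first, second, sep = "/"), nrow = nrow(alleles))
  geno[first == "0" | second == "0"] <- NA
  geno <- as.data.frame(geno, stringsAsFactors = FALSE)
  names(geno) <- snp_ids
  rownames(geno) <- ped[[2]]  # individual IDs (assumed unique)
  geno
}

# Read both files and return the two pieces LDheatmap needs.
read_1000g_ped <- function(ped_file, info_file) {
  info <- read.table(info_file, header = FALSE,
                     col.names = c("snp", "pos"),
                     colClasses = c("character", "numeric"))
  # Read everything as character so an all-"T" allele column is not
  # silently parsed as logical TRUE.
  ped <- read.table(ped_file, header = FALSE, colClasses = "character")
  list(genotypes = ped_to_genotypes(ped, info$snp),
       positions = info$pos)
}

# Hypothetical usage (placeholder file names; needs genetics and LDheatmap):
# dat  <- read_1000g_ped("region.ped", "region.info")
# geno <- dat$genotypes
# geno[] <- lapply(geno, genetics::genotype)  # columns become genotype objects
# LDheatmap::LDheatmap(geno, genetic.distances = dat$positions)
```

The genotype strings are kept as plain characters until the last step so that the parsing can be checked without the genetics package installed. Whether this layout matches what the converter actually writes is exactly what I am unsure about.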


Tags: SNP, R, linkage, ldheatmap, heatmap
