4 hours ago by Pierre Lindenbaum

France/Nantes/Institut du Thorax - INSERM UMR1087

Extract CHROM:POS from dbSNP, convert each record to CHROM:POS,RSID, and sort on CHROM:POS (time-consuming):

wget -O - "ftp://ftp.ncbi.nih.gov/snp/organisms/human_9606/VCF/00-All.vcf.gz" | gunzip -c | awk -F '\t' '/^[^#]/ {printf("%s:%s,%s\n",$1,$2,$3)}' | LC_ALL=C sort -T . -t, -k1,1 > dbsnp.csv

Sort your data on CHROM:POS. This assumes that your chromosome notation is the same as dbSNP's ('1', not 'chr1') and that your genome build is the same as the one used by the dbSNP file.

LC_ALL=C sort -T . -t, -k1,1 yourlist.txt > sorted.csv
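
If your list uses 'chr1'-style names, the keys will never match dbSNP's. A minimal sketch of one way to normalize them before sorting, assuming every key simply carries a leading 'chr' (the file name yourlist_chr.txt and its contents are made up for illustration):

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical input that uses 'chr'-prefixed chromosome names.
printf 'chr2:50\nchr1:100\nchrX:7\n' > yourlist_chr.txt

# Drop the leading 'chr' so the keys look like dbSNP's, then sort as above.
sed 's/^chr//' yourlist_chr.txt | LC_ALL=C sort -T . -t, -k1,1 > sorted.csv

cat sorted.csv
```

This only fixes the naming. It does not convert positions between genome builds.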

Join both lists on CHROM:POS:

LC_ALL=C join -t, -1 1 -2 1 dbsnp.csv sorted.csv > output.txt
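
To check the logic before downloading the full dbSNP file, the three steps can be tried on a toy input. This is only a sketch: the miniature VCF and the query list below are invented records, not real dbSNP data, and the awk filter here skips every line that starts with '#'.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Made-up miniature VCF: two header lines and three tab-separated records.
printf '##fileformat=VCFv4.0\n#CHROM\tPOS\tID\tREF\tALT\n1\t100\trs1\tA\tG\n1\t200\trs2\tC\tT\n2\t50\trs3\tG\tA\n' > toy.vcf

# Step 1: build CHROM:POS,RSID and sort on the first comma-separated field.
awk -F '\t' '/^[^#]/ {printf("%s:%s,%s\n",$1,$2,$3)}' toy.vcf | LC_ALL=C sort -T . -t, -k1,1 > dbsnp.csv

# Step 2: made-up query list of CHROM:POS keys, sorted the same way.
printf '2:50\n1:100\n3:999\n' > yourlist.txt
LC_ALL=C sort -T . -t, -k1,1 yourlist.txt > sorted.csv

# Step 3: join on the shared key; keys without a match (3:999) are dropped.
LC_ALL=C join -t, -1 1 -2 1 dbsnp.csv sorted.csv > output.txt

cat output.txt
# prints:
# 1:100,rs1
# 2:50,rs3
```

Both inputs must be sorted with the same LC_ALL=C setting that join runs under, otherwise join may complain that a file is not in sorted order or silently miss matches.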


