Hello,

I have a VCF file that I want to upload to the Sanger Imputation Server. The following error occurred:

--- Aborted Job ---
The input file sanity check failed, "bcftools norm -ce" exited with the following message:
Reference allele mismatch at X:3155141 .. REF_SEQ:'T' vs VCF:'G'

As suggested on the Sanger website, I wanted to solve this issue with the bcftools +fixref plugin.

All my SNPs have dbSNP IDs, so I downloaded the following file for reordering alleles:
ftp.ncbi.nih.gov/snp/organisms/human_9606_b151_GRCh37p13/VCF/All_20180423.vcf.gz

When I now use the

bcftools +fixref broken.vcf -O z -o fixref.vcf -- -d -f /path/to/reference.fasta -i All_20151104.vcf.gz

command, the following error appears:

[E::bgzf_uncompress] Inflate operation failed: invalid distance too far back
[E::bgzf_read_block] Invalid BGZF header at offset 15203091877

It seems that the All_20151104.vcf.gz file is corrupted. I am also unable to index it with bcftools. However, another operation (subsetting it to regions) works...
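
One way to confirm the corruption independently of bcftools is a gzip integrity test, since bgzip-compressed files are also valid gzip streams. A minimal sketch; the helper name check_gz is hypothetical, and the file is assumed to be in the current directory:

```shell
# check_gz is a hypothetical helper name; it only wraps `gzip -t`.
check_gz() {
    # gzip -t reads and verifies the whole compressed stream without writing any output
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT or truncated: $1"
        return 1
    fi
}

# Usage (assumed file name, taken from the command above):
#   check_gz All_20151104.vcf.gz
```

If the test fails, the download was probably interrupted or damaged, and fetching the file again is the most likely fix. A file that is damaged only near the end could also explain why subsetting to some regions still works.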

Does anyone know how to solve this problem?

Best,

Andreas


