snpEff ERROR_CHROMOSOME_NOT_FOUND GRCh37.75 genome

Dear Biostar community,

I have a targeted resequencing experiment (Illumina) whose goal is to detect mutations in certain genes. To align the reads, I used the GRCh37 genome from NCBI (www.ncbi.nlm.nih.gov/genome/guide/human/). I called the variants with bcftools, and up to this step everything was fine. However, when I reached the annotation step, I used a prebuilt snpEff database with the command:

java -Xmx32g -jar snpEff.jar GRCh37.75 variants_norm.vcf > annotated.vcf

This does not produce a properly annotated .vcf file. Instead, the output .vcf file is full of "ERROR_CHROMOSOME_NOT_FOUND".

This appears to be one of the most common problems described in the snpEff documentation:
pcingola.github.io/SnpEff/se_troubleshooting/

The chromosome names in my genome .fasta file look like this:

NC_000001.10 Homo sapiens chromosome 1, GRCh37.p13 Primary Assembly

It seems to me that the snpEff database uses Ensembl chromosome names instead.
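
To make the mismatch concrete, the chromosome names actually present in the VCF can be listed with standard tools. This is a minimal sketch, assuming an uncompressed VCF; the function name vcf_chroms is only an illustrative helper, not part of snpEff or bcftools:

```shell
# Print the distinct chromosome names used in an uncompressed VCF.
# Header lines start with '#', so they are skipped; column 1 is CHROM.
vcf_chroms() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}

# Example call (file name taken from the command above):
#   vcf_chroms variants_norm.vcf
```

Since the VCF inherits its chromosome names from the reference used for alignment, I would expect this to list RefSeq accessions such as NC_000001.10 for my file.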

Could you please help me: how can I convert the reference genome to a format that snpEff can process, or where can I find a release that matches the snpEff variant annotation format? I searched for a solution but have not found one.

Thank you in advance


annotation


vcf


genomics


variant


snpEff
