bcftools filter by protein prediction
I have VEP-annotated VCF files with the following content:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT file
chr1 183937 . G A 58.9 PASS CSQ=||||||||||||MODIFIER|FO538757.1|ENSG00000279928|ENST00000624431|unprocessed_pseudogene||4/4|||||;AC=1;AN=2 GT:GQ:DP:AD:VAF:PL 0/1:51:26:15,11:0.423077:58,0,51
chr1 601436 . C T 4.9 PASS CSQ=||||||||||||MODIFIER|AL669831.3|ENSG00000230021|ENST00000634337|processed_transcript|4/5||404||||,||||||||||||MODIFIER|AL669831.3|ENSG00000230021|ENST00000634833|processed_transcript|3/6||317||||;AC=1;AN=2 GT:GQ:DP:AD:VAF:PL 0/1:5:26:19,7:0.269231:3,0,17
I would like to filter out protein-coding variants, but I get the following errors:
bcftools view -f "protein_coding" file > out
[E::bcf_write] Broken VCF record, the number of columns at chrX:152737049 does not match the number of samples (0 vs 1)
[main_vcfview] Error: cannot write to (null)

bcftools filter -i 'BIOTYPE="protein_coding"' file > aaa
[filter.c:2491 filters_init1] Error: the tag "BIOTYPE" is not defined in the VCF header
How should I filter such variants when the biotype is only available as a pipe-separated subfield inside the CSQ field?
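For reference, here is a minimal sketch of the kind of filtering I am after, written in plain Python rather than bcftools. It assumes the VCF header contains the usual VEP line `##INFO=<ID=CSQ,...,Description="... Format: Allele|...|BIOTYPE|...">`, that this Format list includes a subfield named BIOTYPE, and that the columns are tab-separated. The function names (`csq_fields`, `biotypes`, `filter_vcf`) are made up for illustration. As far as I can tell, `-f` in `bcftools view` only looks at the FILTER column, and BIOTYPE is not an INFO tag of its own, which would explain why neither command above does what I want.

```python
import re

def csq_fields(header_lines):
    """Return the CSQ subfield names declared in the VEP header line."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ"):
            match = re.search(r"Format: ([^\">]+)", line)
            if match:
                return match.group(1).strip().split("|")
    raise ValueError("no CSQ Format description found in header")

def biotypes(info, fields):
    """Return the set of BIOTYPE values over all CSQ entries of one INFO string."""
    idx = fields.index("BIOTYPE")  # assumes the header lists a BIOTYPE subfield
    for entry in info.split(";"):
        if entry.startswith("CSQ="):
            # One comma-separated entry per transcript, pipe-separated inside.
            return {c.split("|")[idx] for c in entry[4:].split(",")}
    return set()

def filter_vcf(lines, biotype="protein_coding", keep=True):
    """Yield header lines, plus records that have (keep=True) or lack
    (keep=False) at least one CSQ entry with the given biotype."""
    header, fields = [], None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            header.append(line)
            yield line
            continue
        if fields is None:
            fields = csq_fields(header)
        info = line.split("\t")[7]  # INFO is the 8th VCF column
        if (biotype in biotypes(info, fields)) == keep:
            yield line
```

Usage would be something like `for rec in filter_vcf(open("file.vcf")): print(rec)`, with `keep=False` to drop protein-coding records instead of keeping them. Note that a record is matched if any of its transcripts has the biotype. bcftools also has a `+split-vep` plugin meant for working with CSQ subfields, which is probably the more idiomatic route; I have not checked its exact options here.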