Hi, this is my first time using the MAKER genome annotation pipeline.

I recently finished MAKER's first round and was surprised by the results (I was expecting better).

I used minimap2 to align a de novo transcriptome to the reference genome, and let MAKER align known crustacean protein sequences and mRNA sequences of my species from NCBI.

Before running MAKER, I used BUSCO to evaluate my de novo transcriptome assembly and the genome (using MetaEuk):

Transcriptome: C:99.6%[S:7.4%,D:92.2%],F:0.2%,M:0.2%,n:1013
Genome: C:88.5%[S:37.7%,D:50.8%],F:7.8%,M:3.7%,n:1013
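
For reference, the percentages above can be converted into approximate BUSCO counts, which makes it easier to compare them against a number like "~160 missing". This is only a minimal sketch: the function name `busco_counts` is made up for illustration, and the counts are rounded from the percentages rather than read from BUSCO's full summary file.

```python
import re

def busco_counts(summary: str) -> dict:
    """Convert a BUSCO one-line summary into approximate counts per category.

    Counts are reconstructed from rounded percentages, so they may be off
    by one compared with the numbers in BUSCO's own short summary file.
    """
    # Total number of BUSCO groups searched (the "n:" field).
    n = int(re.search(r"n:(\d+)", summary).group(1))
    # Percentages for Complete, Single, Duplicated, Fragmented, Missing.
    pcts = {k: float(v) for k, v in re.findall(r"([CSDFM]):([\d.]+)%", summary)}
    return {k: round(p / 100 * n) for k, p in pcts.items()}

transcriptome = "C:99.6%[S:7.4%,D:92.2%],F:0.2%,M:0.2%,n:1013"
genome = "C:88.5%[S:37.7%,D:50.8%],F:7.8%,M:3.7%,n:1013"

print("Transcriptome:", busco_counts(transcriptome))
print("Genome:", busco_counts(genome))
```

By this arithmetic the genome itself is missing about 37 BUSCOs and the transcriptome about 2, while ~160 missing out of 1013 corresponds to roughly 16%.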

I ran BUSCO on all the transcripts MAKER predicted to evaluate the results:


Although this is only the first round, what might cause ~160 BUSCOs to be missing from MAKER's predictions?

Can anyone share from their experience whether this is common?

Maybe my expectations were too high and these are actually good first-round results?

Regarding training ab initio annotation tools, would you use BUSCO to train Augustus?
I have seen some tutorials that take training sequences from the mRNA annotations created in the first round (with 1000 bp on each side), while others recommend filtering them first (as in this post: gene set filter/selection for training ab initio annotation tools) and then training Augustus directly.
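
To make the "1000 bp on each side" step concrete, here is a minimal sketch of how the padded regions could be computed from first-round mRNA features in a GFF3 file. It is an assumption-laden illustration, not the method of any particular tutorial: the function name `flanked_regions`, the example GFF3 lines, and the contig length are all invented, and the resulting coordinates would still need to be passed to a sequence extraction tool.

```python
def flanked_regions(gff_lines, contig_lengths, flank=1000):
    """Yield (seqid, start, end, mrna_id) for each mRNA, padded by `flank` bp.

    GFF3 coordinates are 1-based and inclusive, so the padded start is
    clamped to 1 and the padded end to the contig length.
    """
    for line in gff_lines:
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "mRNA":
            continue
        seqid, start, end = cols[0], int(cols[3]), int(cols[4])
        # Parse the attribute column (key=value pairs separated by ';').
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        new_start = max(1, start - flank)
        new_end = min(contig_lengths[seqid], end + flank)
        yield seqid, new_start, new_end, attrs.get("ID", "")

# Hypothetical example data, for illustration only.
example_gff = [
    "##gff-version 3",
    "contig1\tmaker\tmRNA\t500\t3000\t.\t+\t.\tID=mrna1;Parent=gene1",
    "contig1\tmaker\texon\t500\t900\t.\t+\t.\tID=exon1;Parent=mrna1",
]
example_lengths = {"contig1": 3500}

regions = list(flanked_regions(example_gff, example_lengths))
print(regions)
```

Clamping matters for genes near contig ends, where a full 1000 bp flank is not available; whether such truncated regions should be kept for training is a separate filtering decision.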

Thanks for your consideration and help.

