3 hours ago by prashantwaiker

Hello there,
I am new to bioinformatics analysis and want to call SNPs from Illumina sequencing data. I have the raw reads from 192 haploid honey bee brothers from a single mother. This is older data from the lab, sequenced both paired-end and single-end. More specifically:

  • Genomic DNA from 192 samples was pooled, with 96 samples per pool
    • Each pool was sequenced on an Illumina HiSeq 2000 in one 100-bp single-end run and in two lanes of 100-bp paired-end runs
    • Paired-end sequence data: 2 runs x 2 sequences x 192 samples = 768 sequences
    • Single-end sequence data: 1 run x 1 sequence x 192 samples = 192 sequences (1 is missing, so we have 191)
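
To double-check the counts above, here is a small Python sketch of the tally. It assumes that each "sequence" is one read file per sample per run, and that a paired-end run gives two files (forward and reverse) per sample. The helper name `expected_files` is only illustrative and not part of any pipeline.

```python
# Tally of the expected read files, using the numbers from the list above.
# Assumption: one "sequence" = one read file per sample per run, and a
# paired-end run yields two files (forward and reverse) per sample.

SAMPLES = 192


def expected_files(runs: int, files_per_run: int,
                   samples: int = SAMPLES, missing: int = 0) -> int:
    """Return runs x files per run x samples, minus any missing files."""
    return runs * files_per_run * samples - missing


paired_end = expected_files(runs=2, files_per_run=2)             # 2 x 2 x 192
single_end = expected_files(runs=1, files_per_run=1, missing=1)  # 1 x 1 x 192 - 1
total = paired_end + single_end

print(paired_end, single_end, total)  # prints: 768 191 959
```

So in total there should be 959 files to deal with before alignment.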

I ran FastQC and Trimmomatic for quality checking and trimming. I am following some pipelines that I found, and I believe the next step is alignment to the reference genome. But how can I merge all the files into a single sequence so that I can run the alignment step? Or do I need to align the single-end and paired-end data separately?
Also, I would be grateful if anyone could point me to a simple tutorial/pipeline for SNP calling.

Thanks in advance,
Prashant


