I am new to bioinformatic analyses and want to do SNP calling from Illumina sequences. I have the raw reads from the sequencing of 192 haploid honey bee brothers from a single mother. This is older data from the lab, and it was sequenced both paired-end and single-end. More specifically:
- Genomic DNA from the 192 samples was pooled, with 96 samples per pool (two pools)
- Each pool was sequenced on an Illumina HiSeq 2000 in one 100-bp single-end run and in two lanes of 100-bp paired-end runs
- Paired-end sequence data: 2 runs x 2 sequence files x 192 samples = 768 sequence files
- Single-end sequence data: 1 run x 1 sequence file x 192 samples = 192 sequence files (1 file is missing, so we have 191)
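To double-check the counts above, here is a minimal Python sketch of the arithmetic. The variable names are only illustrative and are not part of any pipeline.

```python
# Tally of the expected sequence files; all names here are illustrative only.
samples = 192

# Paired-end: 2 runs x 2 sequence files per sample
paired_end_files = 2 * 2 * samples

# Single-end: 1 run x 1 sequence file per sample, minus the one missing file
single_end_files = 1 * 1 * samples - 1

total_files = paired_end_files + single_end_files
print(paired_end_files, single_end_files, total_files)
```

So in total there should be 959 sequence files to deal with.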
I ran FastQC and Trimmomatic for quality filtering and trimming. I am following some pipelines that I found, and I believe the next step is alignment to the reference genome. But how can I merge all the files into a single sequence so that I can do the alignment step? Or do I need to align the single-end and paired-end data separately?
Also, I would be grateful if anyone could point me to a simple tutorial or pipeline for SNP calling.
Thanks in advance,