I'm working with two genomes from a whole-genome sequencing study. The coverage is 10x and 8x respectively, and I'm aligning them to the reference genome of a sister species. This analysis was already carried out by a former colleague in the lab, and I would like to obtain the same results as he did, since I need to map other samples.
He told me he used bwa version 0.7.7 (which I assume is incorrect, missing a 1 before the last 7) with these parameters:
-l 16500 -n 0.01 -o 2
From this and bwa's website (bio-bwa.sourceforge.net/bwa.shtml) I assume he used the aln algorithm, as mem does not have these options.
Since I have always done alignments with the mem algorithm, I ran that first, and the two BAM files I obtained were 10 times larger than his, which seems like a huge difference to me...
Am I approaching this the right way? In order to repeat his analysis I'm running:
bwa aln -l 16500 -n 0.01 -o 2 -t 10 $GEN $fastq1 $fastq2 -o $outdir/genome1.bam
Where GEN is the reference genome, and fastq1 and fastq2 are the paired-end reads of my first genome.
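Reading the manual again, my understanding is that aln takes one FASTQ file per call and writes a .sai file rather than a BAM, that -o in aln is the number of gap opens (so it cannot also name the output), and that the paired reads are then combined with sampe. If that is right, the workflow would look roughly like the sketch below. The default file names, the run helper and the DRY_RUN switch are placeholders of mine, not anything my colleague used, and the samtools step is only my guess at how he got from SAM to BAM.

```shell
#!/usr/bin/env bash
# Sketch of a paired-end "bwa aln" workflow (my assumption of what was done).
# All default paths below are placeholders. With DRY_RUN=1 (the default here)
# the commands are only printed, not executed.
DRY_RUN="${DRY_RUN:-1}"
GEN="${GEN:-reference.fa}"
fastq1="${fastq1:-genome1_R1.fastq.gz}"
fastq2="${fastq2:-genome1_R2.fastq.gz}"
outdir="${outdir:-out}"

# Parameters reported by my colleague, plus 10 threads.
ALN_OPTS="-l 16500 -n 0.01 -o 2 -t 10"

# Hypothetical helper: print the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

align_pair() {
    # aln processes one FASTQ per call; -f names the .sai output file.
    # ALN_OPTS is left unquoted on purpose so it splits into separate options.
    run bwa aln $ALN_OPTS -f "$outdir/genome1_R1.sai" "$GEN" "$fastq1"
    run bwa aln $ALN_OPTS -f "$outdir/genome1_R2.sai" "$GEN" "$fastq2"
    # sampe pairs the two .sai files with the reads and writes SAM.
    run bwa sampe -f "$outdir/genome1.sam" "$GEN" \
        "$outdir/genome1_R1.sai" "$outdir/genome1_R2.sai" "$fastq1" "$fastq2"
    # Sort and convert to BAM (assumes a reasonably recent samtools).
    run samtools sort -@ 10 -o "$outdir/genome1.bam" "$outdir/genome1.sam"
}

align_pair
```

Running it with DRY_RUN=0 and GEN, fastq1, fastq2 and outdir set to real paths should execute the four steps. Does this look like the correct way to reproduce an aln run with those parameters?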
Thanks in advance for any input!