karl.stamm (United States), 3 hours ago

Those tools (Bowtie and similar aligners) assume the reference is a genome and try to place each read inside it, and there's no way a 150 bp read came from a 21 bp genome. So finding no matches is the correct result.

You could pad the reference with a run of Ns on both sides, to give the aligner room to place the 150 bp read.
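
A minimal sketch of the padding idea, in Python. The function name `pad_reference`, the default pad length of 150, and the record name in the usage lines are my own choices for illustration, not anything the aligners require. Whether the aligner will accept read bases sitting over N padding depends on its scoring and settings, so treat this as something to try rather than a guaranteed fix.

```python
def pad_reference(seq: str, pad: int = 150) -> str:
    """Return the sequence flanked by `pad` N characters on each side."""
    if pad < 0:
        raise ValueError("pad must be non-negative")
    return "N" * pad + seq + "N" * pad


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one sequence as a FASTA record with wrapped lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Hypothetical 21 bp reference, padded so a 150 bp read fits on either side.
short_ref = "GATTACAGGCTTAACCGTTGA"
padded = pad_reference(short_ref, pad=150)
fasta_text = to_fasta("padded_ref", padded)
```

Write `fasta_text` to a file and build the aligner index from that instead of the bare 21 bp sequence.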

You could chop each read into smaller pieces (under 20 bp) and let each piece find its own alignment.
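
Chopping could look something like the sketch below. The name `chop_read` and the `size`/`step` parameters are hypothetical. With `step` equal to `size` you get non-overlapping pieces, which can cut the real match in half; a smaller `step` gives overlapping windows at the cost of more pieces to align.

```python
from typing import List, Tuple


def chop_read(read: str, size: int = 18, step: int = 18) -> List[Tuple[int, str]]:
    """Split a read into full-length windows of `size` bases.

    Returns (offset, piece) pairs; a trailing piece shorter than `size`
    is dropped because it is too short to align meaningfully.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [(i, read[i:i + size]) for i in range(0, len(read) - size + 1, step)]


# A made-up 150 bp read: 8 non-overlapping 18 bp pieces, 6 bases left over.
read = "ACGTTGCAAG" * 15
pieces = chop_read(read)
```

Keep the offset with each piece so you can tell afterwards which part of the read matched.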

You could trim the reads and align only the first 18 bases of each.
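
For trimming, a rough Python sketch that keeps the first 18 bases of every FASTQ record. It assumes plain four-line records (header, sequence, `+`, quality) already split into lines without trailing newlines; `trim_fastq` is a name I made up. Existing read trimmers will do the same job, this just shows what is happening.

```python
from typing import List


def trim_fastq(lines: List[str], keep: int = 18) -> List[str]:
    """Keep only the first `keep` bases (and qualities) of each record.

    Assumes four lines per record; an incomplete trailing record is ignored.
    """
    out = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        # Sequence and quality must be cut to the same length.
        out.extend([header, seq[:keep], plus, qual[:keep]])
    return out


# One made-up 150 bp record.
record = ["@read1", "ACGTTGCAAG" * 15, "+", "I" * 150]
trimmed = trim_fastq(record)
```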

I have some experience with reads that are longer than the DNA they were sequenced from; the ends of such reads are generally technical artifacts or Illumina adapter sequence, which you can trim.

Or, finally, if you don't want to do any of those things, you need a different tool, because Bowtie, TopHat, and BWA are designed to place reads onto the genome they came from and are not suited to your task.


