Posted 2 hours ago by kspata


Hi All,

I have paired-end 250 bp sequencing data for a sample, with a read count of around 2 million. Each read contains a 70-mer barcode embedded between common upstream and downstream regions.

I need to determine how many unique barcodes are present in the sample and the frequency of each relative to the total number of reads.
So far, I have mapped the reads to a reference that has N's in the barcode region, and I found some common sequences that may be the barcodes.

I then merged the forward and reverse reads with a minimum overlap of 50 bp and grepped for the observed barcode sequences. However, I am not sure whether this approach is correct.
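
To make the question concrete, here is a rough sketch of the counting step run on merged reads, instead of grepping for each candidate barcode one at a time. It is only an illustration under assumptions: the flank sequences `UPSTREAM` and `DOWNSTREAM` are made-up placeholders for the real constant regions, the function names are hypothetical, the merged reads are assumed to be in a plain 4-line-per-record FASTQ file, and only exact flank matches are counted (no mismatches or indels are tolerated).

```python
import re
from collections import Counter

# Hypothetical placeholders: replace with the real constant flanking regions.
UPSTREAM = "ACGTACGTAC"
DOWNSTREAM = "TGCATGCATG"
BARCODE_LEN = 70

# Capture exactly BARCODE_LEN bases sitting between the two flanks.
PATTERN = re.compile(f"{UPSTREAM}([ACGTN]{{{BARCODE_LEN}}}){DOWNSTREAM}")


def read_fastq_sequences(path):
    """Yield the sequence line of each record in a 4-line-per-record FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip()


def count_barcodes(reads):
    """Return (Counter of barcode -> read count, total number of reads seen)."""
    counts = Counter()
    total = 0
    for seq in reads:
        total += 1
        match = PATTERN.search(seq.upper())
        if match:
            counts[match.group(1)] += 1
    return counts, total


def barcode_frequencies(counts, total):
    """Frequency of each barcode relative to the total number of reads."""
    if total == 0:
        return {}
    return {barcode: n / total for barcode, n in counts.items()}
```

With something like `counts, total = count_barcodes(read_fastq_sequences("merged.fastq"))` (the file name is a placeholder), `len(counts)` would give the number of unique barcodes and `barcode_frequencies(counts, total)` their frequencies relative to all reads. Reads whose flanks do not match exactly are counted in the total but contribute no barcode, and sequencing errors inside the barcode would show up as extra low-count barcodes.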

Is there a better way to perform this kind of analysis?

Help would be appreciated.

Thanks in advance !!
