gravatar for ido.idobar

2 hours ago by


I'm using BBDuk to trim adapters and quality-filter reads from WGS data of ancient DNA (samples >200 years old).
The reads are 150 bp PE.
When I use the recommended parameters for PE reads (which I've used many times before), a large proportion of my reads (25-50%) is being trimmed by ktrim=r.

This is my command:

-Xmx1g ref=$BBMAP_DIR/bbmap-38.79-0/resources/adapters.fa ktrim=r k=23 pigz=f mink=11 hdist=1 qtrim=rl trimq=10 tpe tbo int minlen=30 ziplevel=9 threads=12 in=./D15_#.fastq.gz out=trimmed_reads/trimmed_D15_#.fastq.gz stats=D15.stats ow

And this is the output:

Input:                          93969406 reads          14189380306 bases.
QTrimmed:                       393623 reads (0.42%)    1688764 bases (0.01%)
KTrimmed:                       90265990 reads (96.06%)         6997869744 bases (49.32%)
Trimmed by overlap:             970534 reads (1.03%)    4869518 bases (0.03%)
Total Removed:                  347220 reads (0.37%)    7004428026 bases (49.36%)
Result:                         93622186 reads (99.63%)         7184952280 bases (50.64%)
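
As a rough sanity check on these numbers, the per-read averages can be worked out from the summary above. This is only a minimal Python sketch: the figures are copied from the output, and the variable names are illustrative, not anything BBDuk itself reports.

```python
# Figures copied from the BBDuk summary above; variable names are illustrative only.
input_reads, input_bases = 93_969_406, 14_189_380_306
ktrimmed_reads, ktrimmed_bases = 90_265_990, 6_997_869_744
result_reads, result_bases = 93_622_186, 7_184_952_280

# Average read length before and after trimming.
mean_input_len = input_bases / input_reads
mean_result_len = result_bases / result_reads

# Average number of bases removed from each read that ktrim touched.
mean_ktrim_per_read = ktrimmed_bases / ktrimmed_reads

print(f"mean input read length:      {mean_input_len:.1f} bp")       # 151.0 bp
print(f"mean length after trimming:  {mean_result_len:.1f} bp")      # 76.7 bp
print(f"mean bases ktrimmed per read: {mean_ktrim_per_read:.1f} bp")  # 77.5 bp
```

A mean surviving length of roughly 77 bp would be consistent with short inserts, although on its own it does not rule out over-trimming.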

I suspect that this is due to the fragmented nature of aDNA: the fragments are short and flanked by adapter sequences, so many reads run past the insert and into the adapter. I'd like a second opinion, though, to make sure that I don't need to alter the parameters to retain more "real" sequence.
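
The reasoning above can be illustrated with a toy model: when the insert is shorter than the read, the rest of the read is adapter, which is what ktrim=r would be expected to remove. The function below is hypothetical and deliberately simplified (it ignores base qualities, k-mer matching, and adapter dimers), so it only shows the arithmetic, not how BBDuk actually decides where to trim.

```python
def adapter_bases(read_len: int, insert_len: int) -> int:
    """Toy model: bases of a read that extend past the insert into the adapter."""
    return max(0, read_len - insert_len)


READ_LEN = 150  # read length as stated in the post

# Hypothetical insert sizes, chosen only for illustration.
for insert_len in (40, 75, 150, 300):
    trimmed = adapter_bases(READ_LEN, insert_len)
    kept = READ_LEN - trimmed
    print(f"insert {insert_len:>3} bp: {trimmed:>3} bp adapter, {kept:>3} bp kept")
```

Under this simplification, a 75 bp fragment sequenced with 150 bp reads loses half of every read to adapter, which is the order of magnitude seen in the KTrimmed line.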

Many thanks, Ido
