Mark Duplicates

Hello

I have started learning the steps in an NGS pipeline.

First, using Bowtie 2, I aligned the FASTQ reads to the reference genome and saved the output as a SAM file.

bowtie2 -x grch38_1kgmaj  -U 24_1.fastq,24_2.fastq -S eg1.sam 

Then, using samtools, I sorted the SAM file by coordinate and saved the result as a BAM file.

 samtools sort eg1.sam > my_sorted.bam

This BAM file was then indexed:

 samtools index my_sorted.bam

Now I am trying to mark duplicates in this indexed file using Picard:

 java -jar $EBROOTPICARD/picard.jar MarkDuplicates I=my_sorted.bam.bai O=marked_duplicates.bam M=marked_dup_metrics.txt

I am stuck at this step: I did not get any output. Are the above steps correct, or have I missed a step before marking duplicates?

Thanks


picard


GATK


NGS


You need to pass the .bam file as input, not the BAM index (.bai). That is, your command should be:

 java -jar $EBROOTPICARD/picard.jar MarkDuplicates I=my_sorted.bam O=marked_duplicates.bam M=marked_dup_metrics.txt
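If it helps, a small guard like the one below can catch this kind of mix-up before Picard is launched. This is only a sketch: check_bam_input is a hypothetical helper name (not part of samtools or Picard), it looks at the file extension only, and it assumes you only ever feed .bam or .sam files to MarkDuplicates.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a samtools or Picard command): checks, by file
# extension only, that the path given is an alignment file and not an index.
check_bam_input() {
    case "$1" in
        *.bai|*.csi)
            # Index files sit next to the BAM; they are never the I= argument.
            echo "error: '$1' is an index file, pass the .bam itself" >&2
            return 1
            ;;
        *.bam|*.sam)
            return 0
            ;;
        *)
            echo "error: '$1' does not look like a SAM/BAM file" >&2
            return 1
            ;;
    esac
}

# Example use (commented out so that sourcing this file runs nothing):
# check_bam_input my_sorted.bam && \
#     java -jar "$EBROOTPICARD/picard.jar" MarkDuplicates \
#         I=my_sorted.bam O=marked_duplicates.bam M=marked_dup_metrics.txt
```

With this in place, passing my_sorted.bam.bai stops with an error message instead of reaching Picard. The index is still useful for other tools; it just stays alongside the BAM rather than being given as input.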

