How To Split A Bam File By Chromosome


I am having a hard time opening a very large bedgraph. It was suggested that I split my BAM file by chromosome, but that didn't work.
Is there any alternative?








In this other answer, Aaron Quinlan stated:

bamtools has a "split" command for exactly this purpose

I can only add that I've just tried it with this simple command

bamtools split -in file.bam -reference

and it works like a charm. The BAM file is split into one BAM file per reference sequence, each suffixed with .REF_xxx.bam by default, which is very convenient.

Try samtools: samtools view -?

A region should be presented in one of the following formats:
`chr1',`chr2:1,000' and `chr3:1000-2,000'. When a region is
specified, the input alignment file must be an indexed BAM file.

Something like samtools view in.bam chr1 > chr1.bam should work.

samtools view -b in.bam chr1 > out.bam

Use -b to output BAM format; without it, samtools view writes SAM text to standard output.
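To apply the same region query to every chromosome, here is a minimal sketch of a loop that writes one BAM file per reference sequence. It assumes samtools is on the PATH and that the input is coordinate-sorted; the function name split_bam_by_chrom and the output naming scheme are illustrative only, not part of samtools.

```shell
# Sketch: write one BAM per reference sequence using samtools region queries.
# Assumes samtools is installed and the input BAM is coordinate-sorted.
# split_bam_by_chrom is a hypothetical helper name, not a samtools command.
split_bam_by_chrom() {
    bam=$1
    prefix=${2:-${bam%.bam}}   # default output prefix: input name without .bam

    # Region queries need an index; build in.bam.bai if it is missing.
    [ -e "$bam.bai" ] || samtools index "$bam" || return 1

    # idxstats prints one line per reference (name, length, mapped, unmapped)
    # plus a final "*" line for unplaced reads, which is skipped here.
    samtools idxstats "$bam" | cut -f 1 | grep -v '^\*$' |
    while read -r chrom; do
        samtools view -b -o "$prefix.$chrom.bam" "$bam" "$chrom"
    done
}
```

Calling split_bam_by_chrom in.bam would then produce in.chr1.bam, in.chr2.bam and so on, using whatever reference names the BAM header declares.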




I wrote a Java tool to split a BAM file per chromosome, see

It also creates an empty BAM (filled with a pair of mock SAMRecords) for each chromosome in the Reference, if no SAMRecord was found for the chromosome.

You can use the following pipeline to extract the chrY reads from a raw BAM file while keeping the header:

samtools sort A.bam -o A.sort.bam
samtools index A.sort.bam
samtools view -H A.sort.bam > output.extraction.sam
samtools view A.sort.bam chrY >> output.extraction.sam
samtools view -hb output.extraction.sam > output.extraction.bam
samtools view  -H output.extraction.bam

output.extraction.bam is the BAM file containing the extracted chrY reads.




There is also a nice blog post about this by Sam Nicholls here.

tl;dr (extracted from the blog post)

samtools view -H in.bam | grep -P '^@SQ' | cut -f 2 -d ':' | cut -f 1 | while read -r contig; do
    samtools faidx reference.fa "$contig" > my_contig.fa
    java -jar picard.jar CreateSequenceDictionary R=my_contig.fa O=my_contig.dict
    java -jar picard.jar ReorderSam INPUT=in.bam OUTPUT="out-${contig}.bam" REFERENCE=my_contig.fa S=true VERBOSITY=WARNING
    rm my_contig.fa my_contig.dict
done

This goes through the whole input BAM file (in.bam) and makes a separate BAM file for each contig (out-${contig}.bam), using the reference FASTA (reference.fa). The output BAM files are compatible with most of the common bioinformatics software. Of course, you can skip the while loop and just use the contig name of your choice instead.

samtools view -b in.bam chr1 > out.bam

did not work for me, but this did:

samtools view -b in.bam 1  > out.bam
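The two commands above differ only in the reference name. A region must match an @SQ name in the BAM header exactly, and some references name chromosomes 1, 2, ... rather than chr1, chr2, .... A small sketch for checking which names a file uses follows; list_contigs is a hypothetical helper name, and it only parses header text passed on standard input.

```shell
# Sketch: print the reference names (SN fields) from SAM header text on stdin.
# list_contigs is a hypothetical helper name used only for this example.
list_contigs() {
    awk -F '\t' '$1 == "@SQ" {
        # Scan the tab-separated tags of each @SQ line for the SN (name) tag.
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }'
}

# Usage (assumes samtools is installed and in.bam exists):
#   samtools view -H in.bam | list_contigs
```

Whatever names this prints are the ones to pass as the region argument to samtools view.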

