Calculate coverage


I realise this has been asked a lot, but I haven't found a tool that fits my simple needs.

I would like to calculate coverage % for an alignment file (SAM or BAM) against a reference FASTA genome. (NB: there may be multiple aligned sequences in the SAM/BAM file.)

bedtools can calculate coverage, but it splits the result up and I can't work out how to get a single % value for coverage other than by writing a script to parse the output. Ditto for samtools mpileup: it generates a huge amount of information, whereas I am after a single percentage.

One basic way to do this is to load the alignment in IGV, extract the consensus, then divide the number of N's by the reference genome length to get the uncovered fraction (coverage is then 1 minus that value). There may be existing tools that can do this in one command.
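As a rough illustration of the consensus-based approach, here is a minimal Python sketch. It assumes the consensus has been exported as a FASTA file with the same length as the reference and with every uncovered position written as N; the function name `consensus_coverage_percent` is made up for this example and is not part of any tool.

```python
def consensus_coverage_percent(fasta_lines):
    """Return the percentage of consensus bases that are not N.

    Assumption: uncovered reference positions appear as N (or n) in the
    consensus, and the consensus spans the full reference length.
    """
    total = 0
    n_count = 0
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and FASTA header lines.
        if not line or line.startswith(">"):
            continue
        total += len(line)
        n_count += line.upper().count("N")
    if total == 0:
        raise ValueError("no sequence found in input")
    return 100.0 * (total - n_count) / total
```

It could be called with an open file handle, for example `consensus_coverage_percent(open("consensus.fa"))`, where `consensus.fa` is a placeholder file name. Note that any genuine N's in the reference would also be counted as uncovered.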



Unfortunately, nobody ever defines what they mean by 'coverage'. Can you please explain exactly what you need, so that nobody else can interpret your meaning of coverage differently?

Do you mean the percentage of the reference genome that has at least 1 aligned read? If yes, see my code here: Determine % of reference genome covered by aligned SAM/BAM
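If that definition (often called breadth of coverage) is the one intended, a single percentage can be sketched as follows. This is not the code from the linked post, only a minimal Python illustration. It assumes per-base depths were first written out with `samtools depth -a` (or `-aa` to also include reference sequences with no reads at all) on a coordinate-sorted BAM, giving tab-separated lines of reference name, position and depth. The function name `breadth_of_coverage` and the `min_depth` parameter are hypothetical.

```python
def breadth_of_coverage(depth_lines, min_depth=1):
    """Return the percentage of listed positions with depth >= min_depth.

    Assumption: each line is 'reference<TAB>position<TAB>depth', and
    zero-depth positions are present in the input; otherwise the
    denominator is too small and the percentage is overestimated.
    """
    total = 0
    covered = 0
    for line in depth_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        total += 1
        if int(fields[2]) >= min_depth:
            covered += 1
    if total == 0:
        raise ValueError("no depth records found in input")
    return 100.0 * covered / total
```

For example, after `samtools depth -aa aln.bam > depth.txt` (file names are placeholders), `breadth_of_coverage(open("depth.txt"))` would give one percentage across all reference sequences in the file.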

