Hi, I have BAM files for multiple samples, and each sample is split over two lanes.
For example, for two samples:

flowcells: ["CCGB1ANXX-",  "CCHNTANXX-"]
samples: ["3666-02-45-1_S2", "3666-03-45-1_S3"]
lanes: ["L005", "L006"]

I am trying to use Snakemake to merge these BAMs. The problem is that the merge rule does not treat the sample IDs separately: each job reads the BAMs for all samples and merges them into one file. The rule runs once per sample (twice for two samples, three times for three, and so on), but every run gets the same full list of inputs. How do I correct the rule so that it only merges BAMs with the same sample ID? Also, I sorted these BAMs before this rule; should I sort again after merging?
My code is:

rule mergeBams:
    input:
        bams = expand(intermediateDirectory + "{flowcell}{sample}_{lane}.unsorted.bam", flowcell=flowcells, sample=samples, lane=lanes)
    output:
        pileup = intermediateDirectory + "{sample}.merged.bam"
    log:
        "logs/{sample}.germlinePileup.log"
    threads:
        8
    resources:
        mem = "8GB",
        time = "10:00:00"
    container:
        singularityContainers + "samtools-1.9.simg"
    shell:
        "samtools merge -@ {threads} {output.pileup} {input.bams}"

The standard output from Snakemake when running (it runs the rule twice):

samtools merge -@ 8 3666-03-45-1_S3.merged.bam CCGB1ANXX-3666-02-45-1_S2_L005.unsorted.bam CCGB1ANXX-3666-02-45-1_S2_L006.unsorted.bam CCGB1ANXX-3666-03-45-1_S3_L005.unsorted.bam CCGB1ANXX-3666-03-45-1_S3_L006.unsorted.bam CCHNTANXX-3666-02-45-1_S2_L005.unsorted.bam CCHNTANXX-3666-02-45-1_S2_L006.unsorted.bam CCHNTANXX-3666-03-45-1_S3_L005.unsorted.bam CCHNTANXX-3666-03-45-1_S3_L006.unsorted.bam

.....

samtools merge -@ 8 3666-02-45-1_S2.merged.bam CCGB1ANXX-3666-02-45-1_S2_L005.unsorted.bam CCGB1ANXX-3666-02-45-1_S2_L006.unsorted.bam /CCGB1ANXX-3666-03-45-1_S3_L005.unsorted.bam CCGB1ANXX-3666-03-45-1_S3_L006.unsorted.bam CCHNTANXX-3666-02-45-1_S2_L005.unsorted.bam CCHNTANXX-3666-02-45-1_S2_L006.unsorted.bam CCHNTANXX-3666-03-45-1_S3_L005.unsorted.bam CCHNTANXX-3666-03-45-1_S3_L006.unsorted.bam
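
A possible direction, sketched under assumptions rather than tested on this pipeline: both commands above list the same eight inputs because expand() fills in {sample} from the whole samples list, so the input no longer depends on the {sample} wildcard of the output. One way to keep the wildcard is an input function that builds the list only for wildcards.sample. The function name bams_for_sample and the value of intermediateDirectory below are hypothetical; the other lists are copied from the post. Escaping the wildcard inside expand() as {{sample}} and dropping the sample= argument should have the same effect. On sorting: samtools merge is documented as taking sorted inputs and writing sorted output, so if the inputs really are sorted, a second sort after merging should not be needed.

```python
from itertools import product
from types import SimpleNamespace

# Lists copied from the post; intermediateDirectory is a placeholder value.
intermediateDirectory = "intermediate/"
flowcells = ["CCGB1ANXX-", "CCHNTANXX-"]
samples = ["3666-02-45-1_S2", "3666-03-45-1_S3"]
lanes = ["L005", "L006"]


def bams_for_sample(wildcards):
    """Return the flowcell/lane BAMs for one sample only (hypothetical helper)."""
    return [
        f"{intermediateDirectory}{flowcell}{wildcards.sample}_{lane}.unsorted.bam"
        for flowcell, lane in product(flowcells, lanes)
    ]


# In the Snakefile the function would be passed uncalled:
#     input:
#         bams = bams_for_sample
# Outside Snakemake, a stand-in wildcards object shows what one job would get.
example_inputs = bams_for_sample(SimpleNamespace(sample=samples[0]))
for path in example_inputs:
    print(path)
```

With this input, the job for 3666-02-45-1_S2.merged.bam would receive four files (two flowcells by two lanes), none of them from 3666-03-45-1_S3.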


