Hi, I'm new to genome analysis and I need some advice on this.
I'm currently follow GATK best practice, now I'm stuck at the MergeBamAlignment step of picard. I'm using the data from www.internationalgenome.org/, the reference genome is also downloaded from International genome database.
My command:
MergeBamAlignment -ALIGNED aln/HG02142.mapped.ILLUMINA.bwa.KHV.low_coverage.20130415.bam -UNMAPPED /aln/HG02142.unmapped.ILLUMINA.bwa.KHV.low_coverage.20130415.bam -O test.bam -R refgen/hs37d5.fa.gz -TMP_DIR tmp
I got many warning, and didn't get the output file:
WARNING 2020-02-23 21:54:56 SamAlignmentMerger Exception merging bam alignment - attempting to sort aligned reads and try again: Underlying iterator is not queryname sorted: ERR233223.95759674 2/2 101b aligned to 1:47264561-47264660. > ERR233223.14389540 2/2 101b aligned to 1:47264588-47264671.
INFO 2020-02-23 21:54:56 SamAlignmentMerger Finished reading 28356 total records from alignment SAM/BAM.
WARNING 2020-02-23 21:54:56 SAMSequenceDictionary Found sequence entry for which tags differ: 1 and tag UR has the two values: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz and file:/data4t/KHV1000G/refgen/hs37d5.fa.gz. Using ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
WARNING 2020-02-23 21:54:56 SAMSequenceDictionary Found sequence entry for which tags differ: 2 and tag UR has the two values: ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz and file:/data4t/KHV1000G/refgen/hs37d5.fa.gz. Using ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/hs37d5.fa.gz
.... (many warnings in here, similar structure)
[Sun Feb 23 21:54:56 ICT 2020] picard.sam.MergeBamAlignment done. Elapsed time: 0.01 minutes.
Runtime.totalMemory()=886571008
To get help, see http://broadinstitute.github.io/picard/index.html#GettingHelp
Exception in thread "main" java.lang.IllegalStateException: Reads remaining on alignment iterator: ERR233223.100003813!
at picard.sam.AbstractAlignmentMerger.mergeAlignment(AbstractAlignmentMerger.java:558)
at picard.sam.SamAlignmentMerger.mergeAlignment(SamAlignmentMerger.java:186)
at picard.sam.MergeBamAlignment.doWork(MergeBamAlignment.java:358)
at picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:305)
at picard.cmdline.PicardCommandLine.instanceMain(PicardCommandLine.java:103)
at picard.cmdline.PicardCommandLine.main(PicardCommandLine.java:113)
What have go wrong in my case? I downloaded all of my files directly from international genomes database ftp, without any modification. Please advice me on this, thank you