Problem with this error: EXITING because of INPUT ERROR: the file format of the genomeFastaFile


Hi.

I am trying to generate a genome index for my alignment, but I am running into a problem that doesn't make sense to me. This is the code in my Slurm script:

#!/bin/bash

#SBATCH --partition=defq       # the requested queue
#SBATCH --nodes=1              # number of nodes to use
#SBATCH --tasks-per-node=1     # for parallel distributed jobs
#SBATCH --cpus-per-task=4      # for multi-threaded jobs
#SBATCH --mem-per-cpu=4G      # in megabytes, unless unit explicitly stated
#SBATCH --error=%J.err         # redirect stderr to this file
#SBATCH --output=%J.out        # redirect stdout to this file
#SBATCH --mail-user=[email protected]   # email address used for event notification
#SBATCH --mail-type=BEGIN,END,FAIL     # email on job start, end, and/or failure

# Load modules

module load STAR/2.7.3a

export RefDir=/mnt/scratch/c1818206/fastqs/merged_files/trimmedfiles/trimmedfiles_final

## Change --sjdbOverhang to length of your sequence data minus 1

STAR    --runThreadN ${SLURM_CPUS_PER_TASK} \
        --limitGenomeGenerateRAM 31G \
        --runMode genomeGenerate \
        --genomeDir  $RefDir/ \
        --genomeFastaFiles $RefDir/Mus_musculus.GRCm39.dna.primary_assembly.fa \
        --sjdbGTFfile $RefDir/Mus_musculus.GRCm39.104.gtf \
        --sjdbOverhang 75

However, I get this error when I run it on the server:

EXITING because of INPUT ERROR: the file format of the genomeFastaFile: /mnt/scratch/c1818206/fastqs/merged_files/trimmedfiles/trimmedfiles_final/Mus_musculus.GRCm39.dna.primary_assembly.fa is not fasta: the first character is '^_' (31), not '>'.
 Solution: check formatting of the fasta file. Make sure the file is uncompressed (unzipped).

Jun 17 17:53:03 ...... FATAL ERROR, exiting

This error makes no sense to me. I used to get it when I ran the script while the assembly and annotation files were still gzipped. I then used gunzip to decompress both and changed my script to the one above, but I still get the same error. Any help would be much appreciated.
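For reference, character 31 (0x1f) is the first byte of the gzip magic number (0x1f 0x8b), so the file at that path may still be compressed even though its name ends in .fa. A minimal sketch of how this could be checked, assuming bash with head, od, and tr available; is_gzip is a hypothetical helper name, not part of STAR:

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the file starts with the gzip magic bytes 0x1f 0x8b.
is_gzip() {
    local magic
    # Read the first two bytes and render them as hex with no spaces, e.g. "1f8b".
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Optional usage: pass the FASTA path as the first argument.
fasta="${1:-}"
if [ -n "$fasta" ]; then
    if is_gzip "$fasta"; then
        echo "$fasta still looks gzip-compressed"
    else
        # A FASTA file should begin with '>'.
        echo "$fasta starts with: $(head -c 1 "$fasta")"
    fi
fi
```

Running this on the exact path from the error message (rather than a copy elsewhere) shows whether the file STAR reads is the decompressed one.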


RNA-seq


alignment


STAR
