Hi all,
I am relatively new to the field and would be very grateful for a second opinion on the options below from someone with STAR and HTSeq experience, particularly with Arabidopsis thaliana (AT) or other plant species.

STAR
--runMode genomeGenerate
--genomeDir ...
--genomeFastaFiles ./Arabidopsis_thaliana.TAIR10.dna.toplevel.fa
--sjdbGTFfile ./Arabidopsis_thaliana.TAIR10.50.gtf
--sjdbOverhang 74
## My ReadLength - 1
--genomeSAindexNbases 12
## log2(GenomeLength)/2 - 1 = 12.41
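
For anyone who wants to check the arithmetic behind the two commented values above, here is a minimal Python sketch. It assumes 75 bp reads (implied by the overhang of 74) and a TAIR10 genome length of about 119.7 Mb; that length is an assumption and was not measured from the FASTA used here. The function name `star_index_params` is made up for illustration.

```python
import math

def star_index_params(read_length: int, genome_length: int) -> dict:
    """Suggest STAR genomeGenerate values from read length and genome length."""
    # --sjdbOverhang: read length minus one
    sjdb_overhang = read_length - 1
    # --genomeSAindexNbases: log2(GenomeLength)/2 - 1, never above STAR's default of 14
    raw = math.log2(genome_length) / 2 - 1
    sa_index_nbases = min(14, math.floor(raw))
    return {
        "sjdbOverhang": sjdb_overhang,
        "genomeSAindexNbases_raw": raw,
        "genomeSAindexNbases": sa_index_nbases,
    }

# Assumed inputs: 75 bp reads and a genome of 119,667,750 bp
# (assumption; sum the sequence lengths in the FASTA to confirm).
params = star_index_params(read_length=75, genome_length=119_667_750)
print(params)
```

Rounding the raw value down rather than to the nearest integer is my own reading of the formula; either way it gives 12 for a genome of this size.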

STAR
--genomeDir ./ATGenoIndices
--readFilesIn ....
--outFileNamePrefix ...
--outSAMtype BAM Unsorted
--outFilterMultimapNmax 20
--alignSJoverhangMin 8
--alignSJDBoverhangMin 8
--outFilterMismatchNmax 8
--alignIntronMin 35
--alignIntronMax 2000
## 99.3% of introns in AT are below this size based on doi:10.3390/genes8080200
--alignMatesGapMax 100000
--readFilesCommand zcat
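
The intron-size cutoff above comes from the cited paper, but it could also be checked against the annotation itself. Below is a rough Python sketch that infers intron lengths from the gaps between consecutive exons of each transcript in a GTF and reports the fraction at or below a limit. The helper names (`intron_lengths`, `fraction_at_most`) and the toy records are made up for illustration, and introns shared by several transcripts are counted once per transcript, so this is only an approximation.

```python
from collections import defaultdict

def intron_lengths(gtf_lines):
    """Yield intron lengths inferred from gaps between consecutive exons per transcript."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        # Pull transcript_id out of the attribute column (column 9).
        tid = None
        for attr in fields[8].split(";"):
            attr = attr.strip()
            if attr.startswith("transcript_id"):
                tid = attr.split(" ", 1)[1].strip('"')
        if tid is not None:
            exons[tid].append((int(fields[3]), int(fields[4])))
    for spans in exons.values():
        spans.sort()
        for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
            length = next_start - prev_end - 1
            if length > 0:  # skip overlapping or abutting exons
                yield length

def fraction_at_most(lengths, limit):
    """Fraction of lengths that are <= limit (0.0 for an empty input)."""
    lengths = list(lengths)
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n <= limit) / len(lengths)

# Toy records (hypothetical, not taken from TAIR10): one transcript, three exons.
toy_gtf = [
    'chr1\ttoy\texon\t100\t200\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\ttoy\texon\t301\t400\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
    'chr1\ttoy\texon\t3001\t3100\t.\t+\t.\tgene_id "g1"; transcript_id "t1";',
]
toy_lengths = list(intron_lengths(toy_gtf))
toy_fraction = fraction_at_most(toy_lengths, 2000)
print(toy_lengths, toy_fraction)
```

On real data the same two functions could be fed an open file handle for the GTF passed to --sjdbGTFfile instead of the toy list.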

htseq-count
-f bam
-s no
-t exon
-i gene_id
./...Aligned.out.bam
./Arabidopsis_thaliana.TAIR10.45.gtf
> ...htseq.txt
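
One way to judge whether these counting options behave sensibly is to look at the special counters that htseq-count appends to its output (the lines starting with `__`, such as `__no_feature` and `__ambiguous`). The following is a small Python sketch, assuming the usual two-column tab-separated output; the function name `summarise_htseq` and the toy numbers are invented for illustration and are not results from my data.

```python
def summarise_htseq(lines):
    """Split htseq-count output into gene counts and the trailing '__' counters."""
    genes, special = {}, {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        name, count = line.split("\t")
        # Counters such as __no_feature start with a double underscore.
        target = special if name.startswith("__") else genes
        target[name] = int(count)
    assigned = sum(genes.values())
    total = assigned + sum(special.values())
    fraction = assigned / total if total else 0.0
    return assigned, special, fraction

# Toy output (hypothetical counts, gene IDs used only as labels).
toy_counts = [
    "AT1G01010\t120",
    "AT1G01020\t80",
    "__no_feature\t30",
    "__ambiguous\t10",
    "__too_low_aQual\t0",
    "__not_aligned\t0",
    "__alignment_not_unique\t60",
]
assigned, special, fraction = summarise_htseq(toy_counts)
print(assigned, special, round(fraction, 3))
```

Comparing the assigned fraction between runs with different -s values is one possible sanity check on the strandedness setting, although it is not a substitute for knowing the library protocol.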

I have been trying to find the optimal settings in the available literature, but the variety is overwhelming. I ended up combining the options above from a few references that seemed most in line with the analysis I am hoping to perform, which is a simple differential expression (DE) analysis at the gene level.

Many thanks in advance for your time and help.


