Posted by banerjeeshayantan, 2 hours ago

I have two RNA-seq samples, aligned with minimap2 and stored as BAM files. My goal is to prepare the input for the ballgown tool in order to perform gene expression analysis. These are the commands I used.
First, for each sample, I assembled the transcripts with StringTie. Then, because the BAM files lack the "XS" tag that tablemaker needs in order to run, I converted them to SAM and passed the SAM file to tablemaker.

./stringtie /home/abc/sample1.sorted.bam -o sample1.gtf
samtools view -h sample1.sorted.bam -o sample1.sam
./tablemaker -q -W -G /home/abc/sample1.gtf -o /home/abc/ /home/abc/sample1.sam

I ran the same three steps for sample2. However, I am not sure whether I am giving tablemaker the right input. The GitHub page for tablemaker lists the following command:

tablemaker -p 4 -q -W -G merged.gtf -o sample01_output sample_01/accepted_hits.bam

And the description states:

-W and -G merged.gtf are required. The -W tells the program to run in tablemaker mode (rather than Cufflinks mode), and the -G argument points to the assembly GTF file, which gives the assembled transcripts' structures. For Cufflinks users, often this is the merged.gtf output from Cuffmerge.

How am I supposed to prepare merged.gtf? Is it the same as the assembled transcripts that I obtained from the StringTie command for each sample? Or do I need to merge the GTF files for both samples using Cuffmerge and pass the merged file as input?
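
To make the second option concrete, here is a rough sketch of what I think it would look like, assuming StringTie's own --merge mode can stand in for Cuffmerge and that every sample should then be run through tablemaker against the same merged file. I have not confirmed that this is what tablemaker expects. The file names, the output directory names, and the DRY_RUN, planned and run helpers are placeholders of mine; by default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: merge the per-sample StringTie assemblies, then run tablemaker
# on each sample against the single merged GTF.
set -euo pipefail

# Placeholder switch: 1 = print the commands only, 0 = actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Record of every command in the plan, in order (illustrative helper).
planned=()

run() {
    planned+=("$*")
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Assumed file names, following the per-sample commands above.
samples=(sample1 sample2)
merged="merged.gtf"

# Step 1: combine the per-sample assemblies into one transcript set.
run stringtie --merge -o "$merged" "${samples[0]}.gtf" "${samples[1]}.gtf"

# Step 2: run tablemaker once per sample, always with the same -G file,
# using the flags from the tablemaker README quoted above.
for s in "${samples[@]}"; do
    run tablemaker -q -W -G "$merged" -o "${s}_output" "${s}.sam"
done
```

Running it with DRY_RUN=0 would execute the three commands instead of just printing them, provided stringtie and tablemaker are on the PATH.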
