Asked 3 hours ago by Andrés Ribone

Hi,

I have a problem using Salmon and DESeq2 with mixed (paired-end and single-end) read libraries.

My libraries were originally paired-end, but I quality-trimmed them, which left a lot of single-end reads that I don't want to throw away.
I want to use Salmon to quantify expression. Since Salmon can't handle both library types in a single run, I quantified the single-end and paired-end reads separately and then, for each transcript, added the read counts from the same sample.

salmon quantmerge --column numreads -o cohort95_nr_quant.sf --quants Sample*_quant 

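python3 -c "
import pandas as pd

# Merged NumReads table written by salmon quantmerge (tab-separated)
dfs = pd.read_csv('quants/cohort95_nr_quant.sf', sep='\t')
ndf = pd.DataFrame({'Name': dfs['Name']})
# Note: range(1, 95) covers Sample1 through Sample94
libs = ['Sample' + str(num) for num in range(1, 95)]
for lib in libs:
    # Add the paired-end and single-end read counts for the same sample
    ndf[lib] = dfs[lib + '.paired_quant'] + dfs[lib + '.single_quant']
ndf.to_csv('quants/cohort95_summed_nr_quant.sf', index=False, header=True, sep='\t')
"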

So I ended up with a file like:

Name    Sample1     Sample2     Sample3     ...
Transcript1     4811.874    11930.11    7938.97
Transcript2     34.0    79.0    104.0 
Transcript3     229841.3    262170.9    222405.4
Transcript4     0.0     11.0    6.0
Transcript5     0.0     0.0     0.0

Now I want to use DESeq2. I'm following the tximport tutorial for going from transcripts to genes, but I don't know how to use the file above. As far as I understand, tximport() only takes the original quant.sf files produced by Salmon.
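
To make the question concrete, the transcript-to-gene step I'm after would look roughly like the pandas sketch below, using the first rows of the table above. This is only an illustration: the tx2gene mapping, the gene names and the sum_to_genes helper are made up, and a plain sum ignores the transcript-length information that tximport normally works with.

```python
import io
import pandas as pd

# First rows of the summed transcript-level table shown above
counts = pd.read_csv(io.StringIO(
    "Name\tSample1\tSample2\tSample3\n"
    "Transcript1\t4811.874\t11930.11\t7938.97\n"
    "Transcript2\t34.0\t79.0\t104.0\n"
    "Transcript3\t229841.3\t262170.9\t222405.4\n"
    "Transcript4\t0.0\t11.0\t6.0\n"
    "Transcript5\t0.0\t0.0\t0.0\n"
), sep="\t")

# Hypothetical transcript-to-gene mapping (gene names are invented)
tx2gene = pd.DataFrame({
    "Name": ["Transcript1", "Transcript2", "Transcript3",
             "Transcript4", "Transcript5"],
    "gene": ["GeneA", "GeneA", "GeneB", "GeneB", "GeneC"],
})


def sum_to_genes(counts, tx2gene):
    """Sum transcript-level counts into one row per gene."""
    merged = counts.merge(tx2gene, on="Name", how="inner")
    return merged.drop(columns="Name").groupby("gene").sum()


gene_counts = sum_to_genes(counts, tx2gene)

# DESeq2 expects integer counts, so the summed values are rounded here
print(gene_counts.round().astype(int))
```

What I don't know is whether a matrix built this way is acceptable input, or whether tximport can be made to do this step from my summed file.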

What can I do?


