Will this de novo transcriptome assembly be useful in looking at RNAseq differential expression?


I have a de novo plant transcriptome assembly, produced by a company, with the stats below. Can it be used to evaluate RNA-seq differential expression data? I have CSV files with expression data, and I am wondering whether I should start looking at this data or whether I need to improve the transcriptome assembly myself. There is no genome.

contigs: 1499698
smallest contig: 201
largest contig: 13777
n_bases: 787989217
mean_len: 525.43193
n_under_200: 0
n_over_1K: 177786
n_over_10k: 25
n_with_orf: 246891
mean_orf_percent: 65.70842
n90: 236 
n70: 354
n50: 738
n30: 1502
n10: 2862
gc: 0.44474
bases_n: 0
proportion_n: 0.0
score: NA
optimal_score: NA
cut_off: NA
weighted: NA







It's hard to tell from these metrics alone. I'd suggest downloading the Oyster River Protocol and running the included version of TransRate with your raw reads and the assembled contigs. This will give you the TransRate assembly score, which measures how well the reads actually support the assembled sequences. To assess the biological completeness of the assembled sequences, you could run BUSCO with, e.g., the embryophyta dataset.
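A minimal sketch of how those two runs could be scripted is shown here. It only builds the command lines and does not execute anything. The file names are hypothetical placeholders, and it assumes the tools are callable as `transrate` and `busco` on your PATH (the copy bundled with the Oyster River Protocol may be installed under a different name) and that your BUSCO version uses the `embryophyta_odb10` lineage name.

```python
import shlex


def transrate_cmd(assembly, left, right, threads=8, output="transrate_out"):
    """Build a read-based TransRate command (paired-end reads assumed)."""
    return [
        "transrate",
        "--assembly", assembly,   # assembled contigs (FASTA)
        "--left", left,           # R1 reads
        "--right", right,         # R2 reads
        "--threads", str(threads),
        "--output", output,
    ]


def busco_cmd(assembly, lineage="embryophyta_odb10", out_name="busco_out", cpus=8):
    """Build a BUSCO command in transcriptome mode."""
    return [
        "busco",
        "-i", assembly,
        "-l", lineage,            # lineage dataset; name depends on BUSCO version
        "-m", "transcriptome",
        "-o", out_name,
        "-c", str(cpus),
    ]


if __name__ == "__main__":
    # Hypothetical file names; replace with your own.
    cmds = [
        transrate_cmd("assembly.fasta", "reads_R1.fastq.gz", "reads_R2.fastq.gz"),
        busco_cmd("assembly.fasta"),
    ]
    for cmd in cmds:
        print(shlex.join(cmd))
```

To actually run one of them, pass the list to `subprocess.run(cmd, check=True)`.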

If your assembly has a rather low TransRate score (say < 0.15) and is missing a large share of BUSCOs, you may be able to get a better assembly yourself. What ploidy level does your organism have? What kind of data do you have (Illumina? stranded PE, PE, SE)? How much data do you have (millions of reads? hundreds of millions?)? All of these things matter when planning a new assembly.
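As a rough sketch of that check, the snippet below parses the one-line BUSCO summary and combines it with a TransRate score. The 0.15 cutoff comes from the paragraph above. The 20% missing-BUSCO cutoff is purely an assumption for illustration, and the example summary line is made up, not a result for this assembly.

```python
import re

# One-line summary as written in BUSCO's short summary file, e.g.
# "C:89.0%[S:85.8%,D:3.2%],F:6.9%,M:4.1%,n:1614"
SUMMARY_RE = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(line):
    """Return the percentages (as floats) and n (as int) from a summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a BUSCO summary line: %r" % line)
    fields = match.groupdict()
    result = {key: float(fields[key]) for key in ("C", "S", "D", "F", "M")}
    result["n"] = int(fields["n"])
    return result


def worth_reassembling(transrate_score, busco_missing_pct,
                       score_cutoff=0.15, missing_cutoff=20.0):
    """True if both the read support and the completeness look poor.

    score_cutoff follows the rule of thumb in the answer;
    missing_cutoff is an assumed value, adjust to taste.
    """
    return transrate_score < score_cutoff and busco_missing_pct > missing_cutoff


if __name__ == "__main__":
    # Made-up example line, for illustration only.
    summary = parse_busco_summary("C:89.0%[S:85.8%,D:3.2%],F:6.9%,M:4.1%,n:1614")
    print(summary)
    print(worth_reassembling(0.30, summary["M"]))
```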

One thing though: the number of contigs looks slightly suspicious, especially since only about one in six of your contigs (246,891 of 1,499,698) seem to have an ORF. However, if the assembly is otherwise of good quality, this may not be a problem if you are aiming at gene-level DE analysis (since you can aggregate counts at the gene level).
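The ORF proportion can be checked directly from the reported stats, and the gene-level aggregation idea can be sketched in a few lines. The contig IDs, counts, and contig-to-gene map below are hypothetical; in practice the map would come from your assembler's clustering or a tool of your choice.

```python
from collections import defaultdict

# Values copied from the TransRate report in the question.
contigs = 1_499_698
n_with_orf = 246_891
n_over_1k = 177_786

orf_fraction = n_with_orf / contigs       # roughly 0.165
over_1k_fraction = n_over_1k / contigs    # roughly 0.119


def aggregate_to_genes(contig_counts, contig_to_gene):
    """Sum per-contig read counts into per-gene counts.

    Contigs missing from the map are kept as their own "gene".
    """
    gene_counts = defaultdict(int)
    for contig, count in contig_counts.items():
        gene = contig_to_gene.get(contig, contig)
        gene_counts[gene] += count
    return dict(gene_counts)


if __name__ == "__main__":
    print(f"contigs with ORF: {orf_fraction:.1%}")
    print(f"contigs over 1 kb: {over_1k_fraction:.1%}")

    # Hypothetical IDs and counts, for illustration only.
    counts = {"contig_1": 120, "contig_2": 30, "contig_3": 7}
    mapping = {"contig_1": "gene_A", "contig_2": "gene_A"}
    print(aggregate_to_genes(counts, mapping))
```

Summing counts this way means that redundant or fragmented contigs belonging to the same gene no longer split the signal, which is why a very high contig count matters less for gene-level analysis.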

Edit: Thanks to the reformatting done by GenoMax, I now see that this is a TransRate report. Do you know why they did not include the read-based assessment? Which company did your assembly, if I may ask?
