Posted 2 hours ago by rohitsatyam102

Hi All!

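I am relatively new to DEXSeq and was trying to follow the tutorial given at Bioconductor. In the flattened annotation written by dexseq_prepare_annotation.py, I see the following aggregate gene: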

chr16 dexseq_prepare_annotation.py aggregate_gene 69320140 69408571 . - . gene_id "ENSG00000259900.5+ENSG00000272617.3+ENSG00000260371.1+ENSG00000132604.11+ENSG00000258429.2+ENSG00000157315.5+ENSG00000213380.15"
chr16 dexseq_prepare_annotation.py exonic_part 69320140 69321046 . - . transcripts "ENST00000564419.1"; exonic_part_number "001"; gene_id "ENSG00000259900.5+ENSG00000272617.3+ENSG00000260371.1+ENSG00000132604.11+ENSG00000258429.2+ENSG00000157315.5+ENSG00000213380.15"

It seems odd to see TERF2 (chromosome 16: 69,355,567-69,408,571) aggregated with RP11-343C2.9 because of a partial overlap of a single exon. Wouldn't this bias the exon usage estimates if we are interested in studying the exon usage of only one gene?
How is the analysis affected if I consider the overlapping/shared exons separately using the "-r no" option?

Also, in DEXSeq, I am unable to subset genes using dxd = dxd[geneIDs( dxd ) %in% genesForSubset,]
I am not sure how to prepare the "geneIDsinsubset.txt" file properly. I have a .csv file containing a single Ensembl gene ID: ENSG00000132604.11. However, when I run the above command, DEXSeq throws the following error:

Error in `$<-.data.frame`(`*tmp*`, "dispersion", value = NA) :  replacement has 1 row, data has 0
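
One thing I have not ruled out is that the gene IDs in the object are the aggregated, "+"-joined IDs from the flattened GFF, in which case a single Ensembl ID would match nothing and the subset would be empty. Below is a minimal Python sketch of how I would look up the aggregate ID that contains my gene. The function name aggregate_ids_containing and the GFF_LINES list are only for illustration (they are not part of DEXSeq or HTSeq), and the two lines are rebuilt from the GFF excerpt above rather than read from the real file.

```python
import re

# Aggregate gene ID copied from the flattened GFF excerpt in this post.
AGG = ("ENSG00000259900.5+ENSG00000272617.3+ENSG00000260371.1+"
       "ENSG00000132604.11+ENSG00000258429.2+ENSG00000157315.5+"
       "ENSG00000213380.15")

# Stand-in for lines read from the flattened GFF file (illustrative only).
GFF_LINES = [
    'chr16 dexseq_prepare_annotation.py aggregate_gene 69320140 69408571 '
    f'. - . gene_id "{AGG}"',
    'chr16 dexseq_prepare_annotation.py exonic_part 69320140 69321046 '
    '. - . transcripts "ENST00000564419.1"; exonic_part_number "001"; '
    f'gene_id "{AGG}"',
]


def aggregate_ids_containing(gene, gff_lines):
    """Return the distinct gene_id values whose '+'-joined members include `gene`."""
    hits = []
    for line in gff_lines:
        match = re.search(r'gene_id "([^"]+)"', line)
        if match is None:
            continue  # line carries no gene_id attribute
        aggregate = match.group(1)
        # Compare against whole member IDs, not substrings.
        if gene in aggregate.split("+") and aggregate not in hits:
            hits.append(aggregate)
    return hits


gene = "ENSG00000132604.11"
hits = aggregate_ids_containing(gene, GFF_LINES)
for aggregate in hits:
    print(aggregate)
```

If this is indeed the cause, the ID to put in the subset file would be the full aggregate string printed here rather than ENSG00000132604.11 alone, but I would appreciate confirmation.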

Please help...


