Posted by sumeka, 2 hours ago

Hi, I am new to RNA-seq data analysis.
Would you mind giving me any advice?

I am planning to build a cell evaluation model using 3' end RNA-seq (QuantSeq) instead of RT-qPCR, as described below:

Step 1 (preparation of cell_type prediction model)

  1. RNA-seq (QuantSeq) library preparation for clone A and clone B (six biological replicates each)
  2. Obtain around 10M reads per sample by single-end sequencing in rapid mode on a HiSeq 2500
  3. Map to hg38 with STAR and obtain raw counts with HTSeq
  4. Remove the batch effect (between A and B) with ComBat-seq if necessary
  5. Normalise the raw counts with the TMM method of edgeR
  6. Identify differentially expressed genes (DEGs; FDR < 0.05) with the edgeR exact test
  7. Discard DEGs with low raw counts (raw count < 20)
  8. Prepare a data frame in R with the CPM of each DEG and the cell_type label (A = 1, B = 0)
    e.g. Xy = data.frame(list[geneA:geneX], list[cell_type])
  9. Obtain the best model for predicting the clone (A or B) with the bestglm function in R; bestglm(Xy, family=binomial(link="logit"), IC="AIC") . . .(*)
    --> A multiple logistic regression equation is obtained
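
To make steps 5 to 9 concrete, here is a minimal sketch of how I imagine them in R. The count matrix is simulated as a stand-in for the HTSeq output, and several choices are my own assumptions rather than anything fixed in the plan: the low-count filter is read as "mean raw count < 20", and only the top five DEGs are passed to bestglm to keep the subset search small.

```r
library(edgeR)    # Bioconductor
library(bestglm)  # CRAN

set.seed(1)
# Simulated stand-in for the HTSeq count matrix: 500 genes x 12 samples.
n_genes <- 500
group   <- factor(rep(c("A", "B"), each = 6))
mu      <- matrix(rep(runif(n_genes, 20, 500), 12), nrow = n_genes)
mu[1:20, group == "A"] <- mu[1:20, group == "A"] * 3  # 20 genes up in clone A
counts  <- matrix(rnbinom(n_genes * 12, mu = mu, size = 10), nrow = n_genes,
                  dimnames = list(paste0("gene", seq_len(n_genes)),
                                  paste0(group, 1:6)))

# Steps 5-6: TMM normalisation, dispersion estimation and the exact test.
y   <- DGEList(counts = counts, group = group)
y   <- calcNormFactors(y, method = "TMM")
y   <- estimateDisp(y)
et  <- exactTest(y)
tab <- topTags(et, n = Inf)$table          # sorted by p-value, with an FDR column
deg <- rownames(tab)[tab$FDR < 0.05]

# Step 7: drop DEGs with low counts (assumption: mean raw count < 20).
deg <- deg[rowMeans(counts[deg, , drop = FALSE]) >= 20]

# Step 8: CPM of each DEG (samples in rows), clone label in the last column,
# which is where bestglm expects the response.
top <- head(deg, 5)                        # assumption: small search for illustration
Xy  <- data.frame(t(cpm(y)[top, , drop = FALSE]),
                  cell_type = as.integer(group == "A"))

# Step 9: best-subset logistic regression selected by AIC.
fit <- bestglm(Xy, family = binomial(link = "logit"), IC = "AIC")
fit$BestModel
```

With only 12 samples and strongly differential genes, glm may warn that fitted probabilities of 0 or 1 occurred (complete separation), so the coefficients from a run like this should be read with caution.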

Step 2 (evaluation of the cell_type prediction model)

  1. Prepare randomized samples from another cohort
  2. Obtain raw count data and convert them to CPM
  3. Substitute each DEG's CPM into the equation to obtain the predicted result
  4. ROC analysis
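
Steps 3 and 4 of this evaluation could look roughly like the following base R sketch. The coefficients and CPM values are hypothetical numbers chosen only to illustrate the calculation, and `auc` is a small helper written here (not a library function) that computes the area under the ROC curve from the rank-sum identity.

```r
# Hypothetical fitted coefficients (intercept plus two DEGs); illustrative only.
coefs <- c("(Intercept)" = -4.0, geneA = 0.05, geneB = -0.02)

# Hypothetical CPM values for six samples from the independent cohort,
# with their true clone labels (A = 1, B = 0).
new_cpm <- data.frame(
  geneA = c(120, 95, 110, 60, 55, 100),
  geneB = c(30, 40, 35, 80, 90, 50)
)
truth <- c(1, 1, 1, 0, 0, 0)

# Step 3: plug each DEG's CPM into the logistic regression equation.
eta  <- coefs["(Intercept)"] + as.matrix(new_cpm) %*% coefs[c("geneA", "geneB")]
prob <- plogis(as.vector(eta))   # predicted probability of clone A

# Step 4: area under the ROC curve via the rank-sum (Mann-Whitney) identity.
auc <- function(score, label) {
  r  <- rank(score)
  n1 <- sum(label == 1)
  n0 <- sum(label == 0)
  (sum(r[label == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}
auc(prob, truth)
```

In practice the coefficients would come from the fitted model in Step 1, and a package such as pROC could replace the hand-written helper if the full ROC curve is wanted.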

Do I need technical replicates?

Are there any confounders or batch effects that I should take into account?

