Raw counts of scRNA-seq

Hello all

I need a PBMC dataset from a patient with any solid tumor.

I found GSE114725, which has 9 PBMC samples.

I looked at GSE114725_RAW.tar, GSE114725_rna_imputed.csv.gz and GSE114725_rna_raw.csv.gz.

They say that the processed data are in the supplementary files.

I looked at the count matrices inside these files, but I see numbers like 0.1 and 0.2.

I am wondering what type of data they are.

Normalized?

Could somebody please give me an idea?

Thanks



As the file names make clear, there are two files: one containing raw counts and the other containing imputed counts.
The raw counts are integers, as can easily be checked:

> x <- read.csv("GSE114725_rna_raw.csv.gz")
> y <- as.matrix(x[,-(1:5)])
> y[1:5,1:5]
     A1BG A2M A4GALT AAAS AACS
[1,]    0   0      0    0    0
[2,]    0   0      0    0    0
[3,]    0   0      0    0    0
[4,]    0   0      0    0    0
[5,]    0   0      0    0    0
> max(y-round(y))
[1] 0
> min(y-round(y))
[1] 0

It is the imputed values that are not integers, and indeed they are often negative. The imputed values are explained at some length in the published Cell paper that describes this dataset. If you want to understand in detail what they are, you will need to read the paper.
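If you want to run the same kind of check on any matrix, here is a minimal sketch. The helper name `describe_values` and the two toy matrices are made up for illustration and do not come from the GEO files; the sketch also assumes the matrix has no missing values.

```r
# Hypothetical helper: report whether a numeric matrix looks like raw counts.
# Assumes m is a numeric matrix without NA values.
describe_values <- function(m) {
  c(all_integer  = all(m == round(m)),  # TRUE if every entry is a whole number
    any_negative = any(m < 0))          # TRUE if any entry is below zero
}

# Toy matrices standing in for the real data (values are invented).
raw_like     <- matrix(c(0, 3, 1, 0), nrow = 2)
imputed_like <- matrix(c(0.1, -0.2, 1.5, 0), nrow = 2)

describe_values(raw_like)
describe_values(imputed_like)
```

On the real data you would pass in the numeric part of each file, for example the `y` matrix built above for the raw file.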

