swbarnes2 (United States), 2 hours ago

I have four samples: two related tissues from each of two donors. I ran cellranger count on each of the four samples, then used cellranger aggr to combine all the data.

Then I gave the filtered matrix data from each sample (not the matrix data from the aggregation) to Seurat and had it integrate the data.

In the UMAP from the 10x aggr run, each library ends up in its own cluster. In the UMAP from Seurat's integration, the cells from all four samples end up in one big cluster.

Has anyone observed this before, or does anyone have an idea which UMAP is likely to be more reliable? I think Seurat's algorithm is more sophisticated, but maybe the 10x people understand their data better, and their way is better for their libraries. Is there a way to change my command lines to make the two results more similar?

10x Genomics command lines:

cellranger count --id=donor1_type1 --fastqs=/projects/Illumina/200310_NB551398_0049_AHCN2KBGXC/mkfastq/outs/fastq_path/HCN2KBGXC/donor1_type1/ --transcriptome=/projects/Illumina/W/10xGenomics/refdata-cellranger-1.1.0/GRCh38_96/GRCh38/ --localcores=30

cellranger aggr --id=all_200319_aggregate --csv=all_200319_aggr.csv
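For context, the aggregation CSV passed to --csv can be sketched roughly as below. This is only an illustration, not the exact file used here: it assumes the four count runs used the ids donor1_type1, donor1_type2, donor2_type1 and donor2_type2, that they were all run in one directory (the RUN_DIR variable is a made-up placeholder), and that the header is library_id,molecule_h5 as in older Cell Ranger releases (more recent releases call the first column sample_id, so check the documentation for your version).

```shell
# Sketch only: the sample ids and RUN_DIR are assumptions, not taken from the real run.
# Each `cellranger count --id=<sample>` run writes <sample>/outs/molecule_info.h5
# under the directory it was launched from.
run_dir="${RUN_DIR:-$PWD}"   # hypothetical: directory holding all four count outputs

# One header line, then one line per library to aggregate.
# Older Cell Ranger releases expect `library_id`; newer ones expect `sample_id`.
{
  echo "library_id,molecule_h5"
  for s in donor1_type1 donor1_type2 donor2_type1 donor2_type2; do
    echo "${s},${run_dir}/${s}/outs/molecule_info.h5"
  done
} > all_200319_aggr.csv

# Show the file so it can be checked before running cellranger aggr.
cat all_200319_aggr.csv
```

Note that cellranger aggr only equalizes read depth between libraries; it does not apply any batch correction across donors or tissues, whereas Seurat's anchor-based integration is designed to align samples with each other.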

Seurat R commands, taken from here: satijalab.org/seurat/v3.1/immune_alignment.html

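# Run once per sample (shown here for donor1_type1); data_dir is that sample's filtered matrix directory
data <- Read10X(data.dir = data_dir)
donor1_type1 <- CreateSeuratObject(counts = data, project = "donor1_type1", min.cells = 3, min.features = 200)
donor1_type1 <- NormalizeData(donor1_type1)
donor1_type1 <- FindVariableFeatures(donor1_type1, selection.method = "vst", nfeatures = 2000)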

immune.anchors <- FindIntegrationAnchors(object.list = list(donor1_type1, donor1_type2, donor2_type1, donor2_type2), dims = 1:20)
combined.all <- IntegrateData(anchorset = immune.anchors, dims = 1:20)
rm(immune.anchors)
DefaultAssay(combined.all) <- "integrated"
combined.all <- ScaleData(combined.all, verbose = FALSE)
combined.all <- RunPCA(combined.all, npcs = 30, verbose = FALSE)
combined.all <- RunUMAP(combined.all, reduction = "pca", dims = 1:20)
combined.all <- FindNeighbors(combined.all, reduction = "pca", dims = 1:20)
combined.all <- FindClusters(combined.all, resolution = 0.5)


