Kevin Blighe (Republic of Ireland), 2 hours ago

Hi, you need to create the targets file yourself, or you can simply build it as a data frame within R.
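As a minimal sketch of the data-frame route, the snippet below builds a small targets table in base R. The column names and the three rows are copied from the example targets file shown further down in this answer; the object name `targets` is only an illustrative choice.

```r
# Build the targets table directly in R instead of reading it from a file.
# Only three samples are shown here; extend the vectors for a full study.
targets <- data.frame(
  FileName = c(
    "SampleFiles/1_CS0911a_(HuGene-2_0-st).CEL",
    "SampleFiles/10_CS0812d_(HuGene-2_0-st).CEL",
    "SampleFiles/11_CS0812e_(HuGene-2_0-st).CEL"
  ),
  SampleID = c("CS0911a", "CS0812d", "CS0812e"),
  Group = c("KN92", "KN92_WNT3A", "KN93_WNT3A"),
  stringsAsFactors = FALSE
)

# Store Group as a factor so that it can be used for modelling later on
targets$Group <- factor(targets$Group)

print(targets)
```

A table built this way can also be written out with `write.table(targets, 'Targets.txt', sep = '\t', quote = FALSE, row.names = FALSE)` if you would rather keep a file on disk.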

The metadata associated with each GEO record will usually contain all of the information that you need. To give you an idea, a targets file for an Affymetrix study would look like this:

FileName                                    SampleID Group
SampleFiles/1_CS0911a_(HuGene-2_0-st).CEL   CS0911a   KN92
SampleFiles/10_CS0812d_(HuGene-2_0-st).CEL  CS0812d   KN92_WNT3A
SampleFiles/11_CS0812e_(HuGene-2_0-st).CEL  CS0812e   KN93_WNT3A
SampleFiles/12_CS0812f_(HuGene-2_0-st).CEL  CS0812f   KN93_WNT3A
SampleFiles/13_CS0801a_(HuGene-2_0-st).CEL  CS0801a   KN92
SampleFiles/14_CS0801b_(HuGene-2_0-st).CEL  CS0801b   KN92_WNT3A
SampleFiles/15_CS0801c_(HuGene-2_0-st).CEL  CS0801c   KN93_WNT3A
SampleFiles/16_CS1003a_(HuGene-2_0-st).CEL  CS1003a   KN92
SampleFiles/17_CS1003b_(HuGene-2_0-st).CEL  CS1003b   KN92
SampleFiles/18_CS1003c_(HuGene-2_0-st).CEL  CS1003c   KN92_WNT3A
SampleFiles/19_CS1003d_(HuGene-2_0-st).CEL  CS1003d   KN93_WNT3A
SampleFiles/2_CS0911b_(HuGene-2_0-st).CEL   CS0911b   KN92
SampleFiles/20_CS1003e_(HuGene-2_0-st).CEL  CS1003e   KN93_WNT3A
SampleFiles/3_CS0911c_(HuGene-2_0-st).CEL   CS0911c   KN92_WNT3A
SampleFiles/4_CS0911d_(HuGene-2_0-st).CEL   CS0911d   KN92_WNT3A
SampleFiles/5_CS0911e_(HuGene-2_0-st).CEL   CS0911e   KN93_WNT3A
SampleFiles/6_CS0911f_(HuGene-2_0-st).CEL   CS0911f   KN93_WNT3A
SampleFiles/7_CS0812a_(HuGene-2_0-st).CEL   CS0812a   KN92
SampleFiles/8_CS0812b_(HuGene-2_0-st).CEL   CS0812b   KN92
SampleFiles/9_CS0812c_(HuGene-2_0-st).CEL   CS0812c   KN92_WNT3A

I have not anonymised this data because these samples belong to a study of mine that has just been accepted for publication (and that already has a GSE ID). I did not put the parentheses in the filenames.
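If your CEL files follow a naming pattern like the one in the table above, the SampleID column can be derived from the file names rather than typed by hand. This is only a sketch: it assumes the pattern `<index>_<SampleID>_(<array type>).CEL`, and the regular expression will need adapting for other naming schemes. The Group column still has to come from the study metadata.

```r
# File names in the assumed form <index>_<SampleID>_(<array type>).CEL
cel_files <- c(
  "SampleFiles/1_CS0911a_(HuGene-2_0-st).CEL",
  "SampleFiles/10_CS0812d_(HuGene-2_0-st).CEL",
  "SampleFiles/2_CS0911b_(HuGene-2_0-st).CEL"
)

# Capture the token between the first and second underscore of the base name
sample_id <- sub("^[0-9]+_([^_]+)_.*$", "\\1", basename(cel_files))

# Group is left as NA here because it must be filled in from the metadata
targets <- data.frame(
  FileName = cel_files,
  SampleID = sample_id,
  Group = NA_character_,
  stringsAsFactors = FALSE
)

print(targets)
```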

You should be using the oligo package functions, by the way, something along the lines of:

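library('limma')  # readTargets()
library('oligo')  # CEL file import and rma()

# Read the tab-delimited targets file
targetinfo <- readTargets('Targets.txt', sep = '\t')

# List the CEL files (returned in alphabetical order) and read them in
CELFiles <- list.celfiles('SampleFiles/', full.names = TRUE)
project <- read.celfiles(CELFiles)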

# Background correct, normalize, and calculate gene expression
project.bgcorrect.norm.avg <- rma(project, background = TRUE, normalize = TRUE, target = 'core')

Nota bene: after you read in the data, please verify that the columns of project.bgcorrect.norm.avg are in exactly the same order as the rows of whatever other metadata you are using (for example, the targets file).
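One way to perform this check is sketched below in base R. It assumes that the column names of the expression matrix are the base names of the CEL files (which I believe is the default behaviour of read.celfiles, but do confirm it on your own object). The helper `align_targets` is hypothetical and not part of limma or oligo, and the mock vector `expr_cols` stands in for `colnames(exprs(project.bgcorrect.norm.avg))`.

```r
# Hypothetical helper: reorder the targets table to match the expression columns
align_targets <- function(expr_colnames, targets) {
  idx <- match(expr_colnames, basename(targets$FileName))
  if (anyNA(idx)) {
    stop("No targets row for: ",
         paste(expr_colnames[is.na(idx)], collapse = ", "))
  }
  targets[idx, , drop = FALSE]
}

# Mock column names, deliberately in a different order from the targets table
expr_cols <- c("10_CS0812d_(HuGene-2_0-st).CEL",
               "1_CS0911a_(HuGene-2_0-st).CEL")

targets <- data.frame(
  FileName = c("SampleFiles/1_CS0911a_(HuGene-2_0-st).CEL",
               "SampleFiles/10_CS0812d_(HuGene-2_0-st).CEL"),
  SampleID = c("CS0911a", "CS0812d"),
  Group = c("KN92", "KN92_WNT3A"),
  stringsAsFactors = FALSE
)

aligned <- align_targets(expr_cols, targets)

# Fail loudly if the two are still out of step
stopifnot(identical(basename(aligned$FileName), expr_cols))
```

Matching on names rather than trusting positions means that a missing or misnamed CEL file raises an error instead of silently shifting the group labels.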

Kevin


