2 hours ago by ccc

I'm looking at the Association Graph on this ENCODE page:

The first pipeline the data goes through, at the top of the graph, lists the steps "fastq concatenation", "read trimming", "alignment", and "pooling". What exactly is the purpose of each step? Is the idea of fastq concatenation that the same sample has been sequenced more than once, so you concatenate the FASTQ files to get a more robust sample? Read trimming (going by the Biostar Handbook) seems to mean trimming off both the adapters and the "low quality sequences". Alignment is then done against the reference, which is why we end up with a BAM file.
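To make my current understanding concrete, here is a toy sketch in Python of what I think the first two steps do. It is only an illustration of my mental model, not ENCODE's actual code: the function names, the example reads, the adapter string, and the Phred cutoff of 20 are all my own assumptions, and I'm assuming Phred+33 quality encoding. Alignment and pooling are not covered.

```python
from typing import Iterable, Iterator, List, Tuple

Read = Tuple[str, str, str]  # (name, sequence, quality string)

# Illustrative adapter sequence (an assumption, not taken from the ENCODE page).
ADAPTER = "AGATCGGAAGAGC"


def parse_fastq(text: str) -> Iterator[Read]:
    """Yield (name, sequence, quality) from FASTQ text with 4 lines per record."""
    lines = [ln for ln in text.splitlines() if ln]
    for i in range(0, len(lines), 4):
        name, seq, _plus, qual = lines[i:i + 4]
        yield name[1:], seq, qual  # drop the leading '@' from the name


def concatenate(runs: Iterable[str]) -> List[Read]:
    """My guess at 'fastq concatenation': reads from several runs go into one list."""
    return [read for run in runs for read in parse_fastq(run)]


def trim_read(read: Read, adapter: str = ADAPTER, min_qual: int = 20) -> Read:
    """My guess at 'read trimming': clip the adapter, then low-quality 3' bases."""
    name, seq, qual = read
    # Adapter clipping: cut the read at the first exact adapter match.
    pos = seq.find(adapter)
    if pos != -1:
        seq, qual = seq[:pos], qual[:pos]
    # Quality trimming: drop trailing bases whose Phred score is below min_qual.
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - 33 < min_qual:
        end -= 1
    return name, seq[:end], qual[:end]


# Two made-up sequencing runs of the same sample.
run_a = "@r1\nACGTACGT" + ADAPTER + "\n+\n" + "I" * 21 + "\n"
run_b = "@r2\nACGTACGTAC\n+\n" + "I" * 7 + "#" * 3 + "\n"

reads = concatenate([run_a, run_b])
trimmed = [trim_read(r) for r in reads]
for name, seq, qual in trimmed:
    print(name, seq, qual)
```

In this toy version the bases are shortened but no read is ever discarded, which is part of why I'm unsure how trimming relates to the later filtering step.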

Am I mistaken about any of this so far?

But then what is pooling?

And what exactly is meant by filtering (in the next pipeline step)? It seems some filtering has already been done by "read trimming". I can see that there may be some differences between the two, and I'm wondering what they are.

