ATpoint, 1 hour ago:

Map the full sequencing results against your reference, then mark and remove duplicates with one of the standard tools, such as MarkDuplicates from Picard or markdup from samtools. Do not do any custom read sampling; it is arbitrary and not standard. The number of PCR cycles should not be a factor in the analysis, since it is not something you can objectively model or account for. A sample with more cycles can still be of better quality, and one with fewer cycles can still be poor. Take the reads that remain after mapping and deduplication; this is what you have. Multimappers are also typically excluded. People usually do not use bowtie anymore unless the reads are very short. Use bowtie2, it is the more recent replacement.
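To make the filtering step concrete, here is a minimal Python sketch of what "drop duplicates and multimappers" means at the level of individual SAM records. In practice you would do this with samtools rather than by hand. The function names and the MAPQ cutoff of 30 are assumptions for illustration only; the flag bits (0x4 unmapped, 0x400 duplicate) come from the SAM format, and a low MAPQ is used here as a proxy for multimapping.

```python
# Sketch only: function names and the MAPQ cutoff are illustrative assumptions.
UNMAPPED_FLAG = 0x4     # SAM flag bit: read is unmapped
DUPLICATE_FLAG = 0x400  # SAM flag bit: read was marked as a PCR/optical duplicate


def keep_read(sam_line, min_mapq=30):
    """Return True if a SAM alignment line passes the filters."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    mapq = int(fields[4])  # column 5: mapping quality
    if flag & UNMAPPED_FLAG:
        return False
    if flag & DUPLICATE_FLAG:
        return False
    # Low MAPQ is treated as ambiguous placement (assumed cutoff).
    return mapq >= min_mapq


def filter_sam(lines, min_mapq=30):
    """Keep header lines and alignments that pass keep_read()."""
    return [
        line for line in lines
        if line.startswith("@") or keep_read(line, min_mapq)
    ]


if __name__ == "__main__":
    records = [
        "@HD\tVN:1.6\tSO:coordinate",
        "r1\t0\tchr1\t100\t42\t50M\t*\t0\t0\t*\t*",     # kept
        "r2\t1024\tchr1\t100\t42\t50M\t*\t0\t0\t*\t*",  # duplicate, dropped
        "r3\t0\tchr1\t500\t1\t50M\t*\t0\t0\t*\t*",      # low MAPQ, dropped
        "r4\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",             # unmapped, dropped
    ]
    for line in filter_sam(records):
        print(line)
```

Note that nothing here subsamples reads or looks at PCR cycle numbers; the only criteria are the duplicate mark and the mapping quality.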

The actual QC starts after mapping imho. Downstream of mapping, perform peak calling on each sample and then calculate the FRiP (fraction of reads in peaks). That is nothing other than the percentage of a sample's reads that overlap its called peaks. The value depends strongly on the peak caller and on how you calculate it, but with macs2 as the peak caller the FRiP should be somewhat above 5% for TFs and most histone marks. Also check the samples in a genome browser and see whether there is clear separation between peaks and noise. Also consider performing a principal component analysis to check whether you have odd samples that cluster away from the other replicates. This only makes sense if you actually have replicates per condition, which I strongly recommend.
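As a rough illustration of the FRiP definition above, here is a small pure-Python sketch. It assumes each read is reduced to a single position (e.g. its 5' end), that peaks are half-open, non-overlapping intervals, and that the function name `frip` and the toy coordinates are made up for this example. Real tools count overlaps from BAM and peak files and may define overlap differently, which is one reason the numbers vary between pipelines.

```python
from bisect import bisect_right


def frip(read_positions, peaks):
    """Fraction of reads whose position falls inside any peak.

    read_positions: list of (chrom, pos)
    peaks: list of (chrom, start, end), half-open and non-overlapping (assumed)
    """
    # Group peak intervals per chromosome and sort them by start.
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    for intervals in by_chrom.values():
        intervals.sort()

    in_peaks = 0
    for chrom, pos in read_positions:
        intervals = by_chrom.get(chrom, [])
        # Index of the last interval whose start is <= pos.
        i = bisect_right(intervals, (pos, float("inf"))) - 1
        if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
            in_peaks += 1

    return in_peaks / len(read_positions) if read_positions else 0.0


if __name__ == "__main__":
    toy_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
    toy_reads = [
        ("chr1", 150), ("chr1", 250), ("chr1", 599), ("chr1", 600),
        ("chr2", 50), ("chr3", 10), ("chr1", 99), ("chr2", 149),
    ]
    print(f"FRiP = {frip(toy_reads, toy_peaks):.2f}")  # 4 of 8 reads -> 0.50
```

The denominator here is all reads passed in, so feed it the reads that remain after deduplication and multimapper removal, not the raw FASTQ count.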

Normalization comes into play during differential analysis, not before. Do not downsample replicates to equal read numbers, as this is not informative. You would still need to account for differences in data quality and (between conditions) different library compositions. Also do not compare raw peak numbers, as these are also a function of sequencing depth and data quality. If you want to compare samples between experimental conditions, then perform a differential analysis, e.g. with edgeR, which requires replicates.
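To see why equalizing read numbers does not deal with library composition, consider this toy Python sketch. The counts are invented, the function names are hypothetical, and the median-ratio factor is only a simplified stand-in for the idea behind composition-aware normalization; it is not edgeR's actual method, so use edgeR itself for real data.

```python
from statistics import median


def scale_by_total(counts, reference_total):
    """Scale counts so the library total matches reference_total.

    Downsampling to equal depth has the same effect in expectation.
    """
    factor = reference_total / sum(counts)
    return [c * factor for c in counts]


def scale_by_median_ratio(counts, reference):
    """Scale counts by the median per-peak ratio to a reference sample.

    Simplified illustration only; assumes most peaks are unchanged.
    """
    ratios = [c / r for c, r in zip(counts, reference) if c > 0 and r > 0]
    factor = 1 / median(ratios)
    return [c * factor for c in counts]


if __name__ == "__main__":
    # Invented counts per peak. Four peaks are unchanged between samples,
    # the fifth gains signal in sample B and doubles its library total.
    sample_a = [100, 100, 100, 100, 100]
    sample_b = [100, 100, 100, 100, 600]

    # Equal totals make the four unchanged peaks look halved in B.
    print(scale_by_total(sample_b, sum(sample_a)))

    # A composition-aware factor leaves the unchanged peaks unchanged.
    print(scale_by_median_ratio(sample_b, sample_a))
```

With total-count scaling the four unchanged peaks in sample B drop from 100 to 50, a change that is purely an artifact of one strong peak taking up more of the library.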

ChIP-seq is a tricky assay because whether your library prep is successful depends on so many factors, most importantly antibody quality and specificity. If you have further questions, feel free to comment.

