When running the MACS2 peak caller on ChIP-seq data, I pick several p-value cutoffs from the --cutoff-analysis output and test them until I get a reasonable balance between calling real peaks and calling background noise as false positives. But I found that the best p-value differs between samples. For example, p = 0.0001 was the best for CTCF.rep1, but calling peaks with --pvalue 0.0001 on CTCF.rep2 lost most of the peaks; p = 0.01 turned out to be the best for CTCF.rep2 instead. This may be due to the difference in coverage between the samples.
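
To make the workflow concrete, here is a minimal sketch of how the macs2 commands for one sample could be assembled in Python. The BAM file name, the output name, and the helper `build_callpeak_cmd` are hypothetical. Only the `-t`, `-c`, `-n`, `-p` and `--cutoff-analysis` options of `macs2 callpeak` are used, and the command is printed rather than executed.

```python
import shlex


def build_callpeak_cmd(chip_bam, name, pvalue, input_bam=None, cutoff_analysis=False):
    """Return a `macs2 callpeak` command as a list of arguments.

    chip_bam, input_bam and name are placeholders for illustration only.
    """
    cmd = ["macs2", "callpeak", "-t", chip_bam, "-n", name, "-p", str(pvalue)]
    if input_bam is not None:
        # Optional input/IgG control library (not the "control" condition).
        cmd += ["-c", input_bam]
    if cutoff_analysis:
        # Ask MACS2 to also report peak statistics across a range of cutoffs.
        cmd.append("--cutoff-analysis")
    return cmd


# First pass: run the cutoff analysis; later passes: rerun with a chosen p-value.
cmd = build_callpeak_cmd("CTCF.rep1.bam", "CTCF.rep1", 1e-4, cutoff_analysis=True)
print(shlex.join(cmd))
```

The list returned by the helper could be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.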

I have four CTCF ChIP-seq samples: control.rep1, control.rep2, treatment.rep1 and treatment.rep2. Judging by eye, the best p-values are 0.01, 0.001, 0.0001 and 0.00001, respectively. When I call peaks with p = 0.01, I get a lot of background noise in the treatment.rep2 sample, so I would like to use p = 0.01 for control.rep1 and p = 0.00001 for treatment.rep2, if possible.

I can think of several options:

  • Use a single p-value for all of these samples.
  • Use p = 0.01 for the control samples and p = 0.0001 for the treatment samples (the same p-value within each treatment group).
  • Or maybe use the same p-value within each replicate.
  • Use four independent p-values, one for each sample.
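
The four options can be written out as sample-to-cutoff mappings, which makes it easier to see exactly what each one implies. This is only a sketch: the helper names are hypothetical, the values for the single-cutoff and per-replicate cases are placeholders that do not come from my data, and the per-replicate case reflects one reading of that option (rep1 samples share one cutoff, rep2 samples share another).

```python
SAMPLES = ["control.rep1", "control.rep2", "treatment.rep1", "treatment.rep2"]


def single_cutoff(p):
    """Option 1: the same p-value for every sample."""
    return {s: p for s in SAMPLES}


def per_condition(control_p, treatment_p):
    """Option 2: one p-value per treatment group."""
    by_condition = {"control": control_p, "treatment": treatment_p}
    return {s: by_condition[s.split(".")[0]] for s in SAMPLES}


def per_replicate(rep1_p, rep2_p):
    """Option 3 (one reading): one p-value per replicate number."""
    by_rep = {"rep1": rep1_p, "rep2": rep2_p}
    return {s: by_rep[s.split(".")[1]] for s in SAMPLES}


def per_sample(pvalues):
    """Option 4: an independent p-value for each sample, in SAMPLES order."""
    return dict(zip(SAMPLES, pvalues))


strategies = {
    "single": single_cutoff(1e-2),                # placeholder value
    "per condition": per_condition(1e-2, 1e-4),   # values from option 2
    "per replicate": per_replicate(1e-2, 1e-3),   # placeholder values
    "per sample": per_sample([1e-2, 1e-3, 1e-4, 1e-5]),  # my by-eye choices
}

for name, cutoffs in strategies.items():
    print(name, cutoffs)
```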

Is it OK to use different p-values, or should I use a single p-value cutoff for all of my samples?

