I recently ran ATAC-seq on frozen mouse liver tissue samples (although I do not think that is an excuse), and the data look poor: they do not pass QC according to the ENCODE standards. Can these data still be used in a scientific article, either to support conclusions or for exploratory analysis?

Here are typical QC results from one sample (I have 4 biological replicates per group, and the results are similar across them):

  1. Reads after filtering mitochondrial reads and deduplication: 18195969 * 2
  2. Fraction of reads in NFR: 0.50796
  3. NFR / mono-nuc reads: 1.146738 (failed QC)
  4. Fraction of Reads in universal DHS regions: 0.36985
  5. Fraction of Reads in blacklist regions: 0.0017758
  6. Fraction of Reads in promoter regions: 0.02336
  7. Fraction of Reads in enhancer regions: 0.34436
  8. NRF = Distinct/Total: 0.427141
  9. PBC1 = OneRead/Distinct: 0.381561
  10. PBC2 = OneRead/TwoRead: 1.430665
  11. Peak region size (min/ 25%/ 50%/ 75%/ max): 150/ 169/ 224/ 292/ 1777
  12. TSS enrichment: 3.36877
  13. FRiP for macs2 raw peaks: 0.1
  14. FRiP for overlap peaks: 0.0278
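
For reference, the library-complexity ratios in items 8 to 10 follow directly from the definitions written there (Distinct/Total, OneRead/Distinct, OneRead/TwoRead). Below is a minimal Python sketch of that arithmetic, assuming each read (or fragment) has already been reduced to a hashable location key. The function name `library_complexity` and the key format are illustrative assumptions, not part of any pipeline, and real pipelines may define a "location" differently.

```python
from collections import Counter


def library_complexity(locations):
    """Compute NRF, PBC1 and PBC2 from an iterable of read locations.

    `locations` is a hypothetical input: one hashable key per read,
    e.g. (chrom, start, end, strand). Reads sharing a key are treated
    as duplicates of the same location.
    """
    counts = Counter(locations)          # location -> number of reads
    total = sum(counts.values())         # all reads
    distinct = len(counts)               # distinct locations
    one_read = sum(1 for c in counts.values() if c == 1)
    two_read = sum(1 for c in counts.values() if c == 2)

    return {
        # NRF = Distinct / Total
        "NRF": distinct / total if total else float("nan"),
        # PBC1 = OneRead / Distinct
        "PBC1": one_read / distinct if distinct else float("nan"),
        # PBC2 = OneRead / TwoRead (infinite when no location has exactly two reads)
        "PBC2": one_read / two_read if two_read else float("inf"),
    }


# Toy example: six reads over four locations.
toy = ["a", "a", "b", "c", "c", "d"]
metrics = library_complexity(toy)
```

In the toy example, two locations carry one read and two carry two reads, so the ratios are small by construction. Low values like those reported above mean that many reads map to already-seen locations, i.e. the library is highly duplicated.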

Will analyzing these data lead to faulty conclusions, or will the low quality only reduce the sensitivity of the assay (e.g. some marginally perturbed regions being masked by noise)?

I would also appreciate any suggestions for troubleshooting the experiment and improving the data quality, in case I can (or have to) repeat the sequencing.

Thank you!

