This depends a lot on your goal. If you just want to show how methylation is distributed across genomic features, either across your data in general or between different groups, then yes, I would pool the biological replicates. But you do not need to perform any operation on them for the plots (such as the per-CpG means you suggested). Simply concatenate the data from all replicates. For example, using ggplot2 with data in tidy (long) format, the input would look something like this:

cpg   sample   value   genomic_feature
cg1   A        0.2     promoter
cg2   A        0.8     exon
cg3   A        0.1     intergenic
cg1   B        0.3     promoter
cg2   B        0.9     exon
cg3   B        0.2     intergenic
…     …        …       …

Thus, when you draw boxplots or violin plots separated by genomic feature (and not by sample), the data from all replicates are pooled within each feature.
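As a rough sketch of what that could look like in R: the snippet below stacks two per-replicate tables into the long format shown in the table above and draws one box per genomic feature. The object names (rep_a, rep_b, pooled) are made up for illustration, and the values are just the toy numbers from the table, so adapt it to however your data are actually stored.

```r
library(ggplot2)

# Hypothetical per-replicate tables; values are the toy numbers from the example table.
rep_a <- data.frame(
  cpg = c("cg1", "cg2", "cg3"),
  value = c(0.2, 0.8, 0.1),
  genomic_feature = c("promoter", "exon", "intergenic")
)
rep_b <- data.frame(
  cpg = c("cg1", "cg2", "cg3"),
  value = c(0.3, 0.9, 0.2),
  genomic_feature = c("promoter", "exon", "intergenic")
)

# Label each table with its sample, then concatenate row-wise (no averaging per CpG).
rep_a$sample <- "A"
rep_b$sample <- "B"
pooled <- rbind(rep_a, rep_b)

# Only genomic_feature is mapped to x, so both replicates feed into each box.
p <- ggplot(pooled, aes(x = genomic_feature, y = value)) +
  geom_boxplot() +
  labs(x = "Genomic feature", y = "Methylation value")
print(p)
```

Swapping geom_boxplot() for geom_violin() should give the violin version. If you want to compare groups, one option is to add a group column to the table and map it to fill in aes(); that column is not part of the example above.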


