I would strongly recommend using specialized statistical software for this. Check, for example, edgeR: the user manual covers CRISPR screens in section 4.6: www.bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf Other standard software such as DESeq2 or limma can probably be used as well, but I would try to follow a guided analysis if you are new to this. In any case, avoid custom approaches: there are simply too many pitfalls, and the results will probably not be robust.
The naive Wilcoxon test has a floor on the p-values it can produce when the number of replicates is low, no matter how large the effect size is. Combined with the necessary multiple testing correction, this makes it impossible to get significant results with n=2:
    > wilcox.test(c(1,2), c(100,200))

            Wilcoxon rank sum exact test

    data:  c(1, 2) and c(100, 200)
    W = 0, p-value = 0.3333
    alternative hypothesis: true location shift is not equal to 0

    > wilcox.test(c(1,2), c(1000000,2000000))

            Wilcoxon rank sum exact test

    data:  c(1, 2) and c(1e+06, 2e+06)
    W = 0, p-value = 0.3333
    alternative hypothesis: true location shift is not equal to 0
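The 0.3333 is not a coincidence. As a rough sketch of where it comes from, assuming the exact two-sided rank-sum test with no ties: the test only sees ranks, and with group sizes n_1 and n_2 (my notation, not from the R output) there are only so many ways to assign ranks to the groups. The most extreme arrangement, complete separation (W = 0), therefore gives the smallest p-value the test can ever produce:

```latex
% Smallest achievable two-sided p-value of the exact Wilcoxon rank-sum test,
% assuming no ties; n_1 and n_2 are the numbers of replicates per group.
\[
  p_{\min} = \frac{2}{\binom{n_1 + n_2}{n_1}}
\]
% Worked values for equal group sizes:
\begin{align*}
  n_1 = n_2 = 2 &: \quad p_{\min} = 2/\tbinom{4}{2} = 2/6 \approx 0.3333 \\
  n_1 = n_2 = 3 &: \quad p_{\min} = 2/\tbinom{6}{3} = 2/20 = 0.1 \\
  n_1 = n_2 = 4 &: \quad p_{\min} = 2/\tbinom{8}{4} = 2/70 \approx 0.0286 \\
  n_1 = n_2 = 5 &: \quad p_{\min} = 2/\tbinom{10}{5} = 2/252 \approx 0.0079
\end{align*}
```

So under these assumptions you would need at least four replicates per group just to get below 0.05 for a single test, and that is before any multiple testing correction across thousands of guides or genes.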
From what I understand, the main advantage of the non-naive methods is that they take dispersion into account, which is especially important with a low number of replicates. It obviously makes a difference whether your results look like A vs B or like C vs D: the latter is far more reliable. The Wilcoxon p-values would be the same, though, since every value in B ranks above every value in A, just as every value in D ranks above every value in C.
Also, artificially large fold changes that arise from genes with low counts should be penalized, as they are less reliable than fold changes from genes with larger counts. edgeR does all of that.
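To make the low-count point a bit more concrete, here is a back-of-the-envelope sketch based on the negative binomial mean-variance relationship that edgeR works with. The symbols are my shorthand (mu for the expected count, phi for the dispersion), and the value phi = 0.01 is purely illustrative, not an estimate from any real screen:

```latex
% Negative binomial count y with mean \mu and dispersion \phi:
\[
  \operatorname{Var}(y) = \mu + \phi\,\mu^{2}
  \quad\Longrightarrow\quad
  \mathrm{CV}^{2}(y) = \frac{1}{\mu} + \phi
\]
% The 1/\mu term is the Poisson-like counting noise; \phi is the extra
% replicate-to-replicate variability.
% Illustrative numbers, assuming \phi = 0.01:
\begin{align*}
  \mu = 4   &: \quad \mathrm{CV} = \sqrt{0.25 + 0.01} \approx 0.51 \\
  \mu = 400 &: \quad \mathrm{CV} = \sqrt{0.0025 + 0.01} \approx 0.11
\end{align*}
```

With an expected count of 4, a single count is off by roughly 50% on average, so a 4-fold change between two such counts can easily be noise. The same 4-fold change between counts in the hundreds is much harder to explain by chance.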