# PHRED scores and sequence errors

Using my very basic probability skills, I will try to explain dsull's comments in a little more detail.

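You can calculate the probability of the sequence being correct, i.e., no base is wrong, pretty easily as (probability that a base is correct) ^ (number of bases), assuming errors at different bases are independent. A Q20 base has an error probability of `10^(-20/10) = 0.01`, so it is correct with probability 0.99. So, for example, for a 100 bp sequence with all bases Q20, the probability of the sequence being correct is `(0.99)^(100) = 0.366`, or a 36.6% chance of having no errors.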

The probability of a sequence "being wrong" is one minus the probability of the sequence being correct. So, for the same example above of a 100bp sequence with all bases Q20, the probability of the sequence being wrong (i.e., containing one or more errors) is `1-0.366 = 0.634`, or 63.4% chance of containing at least one error.
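
If you want to check these numbers yourself, here is a minimal Python sketch. The function names are my own (not from any library), and it assumes that errors at different bases are independent and that the qualities are standard PHRED scores, where the error probability is `10^(-Q/10)`.

```python
import math

def phred_to_error_prob(q):
    """Convert a PHRED quality score to a per-base error probability."""
    return 10 ** (-q / 10)

def prob_no_errors(quals):
    """Probability that every base is correct (independence assumed)."""
    return math.prod(1 - phred_to_error_prob(q) for q in quals)

def prob_at_least_one_error(quals):
    """Complement of the sequence being entirely correct."""
    return 1 - prob_no_errors(quals)

# Hypothetical read: 100 bases, all Q20
quals = [20] * 100
print(round(prob_no_errors(quals), 3))           # 0.366
print(round(prob_at_least_one_error(quals), 3))  # 0.634
```

Because it takes a list of qualities, the same sketch also works when the bases do not all share the same score.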

Note that there is only one way for a sequence to be correct (all bases must be correct), but there are many ways a sequence can be wrong - one base can be wrong, two bases, and so on. The estimate from your question - `(0.01)^100` - is actually the probability that every base in the sequence is wrong.
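
To make the "many ways of being wrong" point concrete, here is a rough sketch using the binomial distribution. It assumes every base has the same error probability and that errors are independent, and `prob_k_errors` is just a helper name I made up for illustration.

```python
import math

def prob_k_errors(n, k, p):
    """Probability of exactly k wrong bases out of n (binomial model)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 100, 0.01  # 100 bp read, every base Q20

# k = 0: no base is wrong, i.e. (0.99)^100
print(prob_k_errors(n, 0, p))

# k = n: every base is wrong, i.e. (0.01)^100, a vanishingly small number
print(prob_k_errors(n, n, p))

# "At least one error" adds up every case from k = 1 to k = n
print(sum(prob_k_errors(n, k, p) for k in range(1, n + 1)))
```

The last value matches `1 - (0.99)^100`, while `(0.01)^100` is only the single `k = n` term of that sum.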