In a previous post, I described how our intuition about probability can lead us astray, using an example dealing with the average number of tosses needed of a fair coin to generate specific sequences of heads and tails. I provided an informal argument as to why the pattern 'HTH' took more tosses, on average, to generate than 'HTT', and I also provided the results of a simulation demonstrating this. In this post, I'll discuss the mathematical proof of this, as described in a paper by V. C. Hombas, though in a manner that is (hopefully) more suitable for a lay audience.
First, some definitions and notations. Let X be the number of tosses until the pattern of interest (either 'HTH' or 'HTT') appears; the average number of tosses for this pattern to appear (or more formally, the "expectation value of X") is then written as 'E[X]'. The concept of conditional probability is also important; written 'P(A|B)', it asks what the probability of event A is given the fact that event B has already occurred. In this discussion, then, the notation 'E[X|H]' asks what the average number of tosses for generating pattern X is, given the fact that the first toss was a head. Similarly, the notation 'E[X|HT]' asks what the average number of tosses for generating pattern X is, given the fact the first two tosses were a head and a tail, and so on.
With that, let's calculate what the E[X] is for the pattern X = 'HTH'. With a fair coin, we are equally likely to get a head as a tail on any toss (i.e., P(H) = P(T)), giving us an equation for the first toss of our search:
E[X] = 0.5 E[X|H] + 0.5 E[X|T]
If we is toss a tail, then we count a one toss "penalty" and restart our search for the 'HTH' pattern; mathematically speaking, E[X|T] = 1 + E[X]. Substituting this expression into the last one, we get our first numbered equation:
(1) E[X] = 0.5 E[X|H] + 0.5 (1 + E[X])
The next step is to evaluate E[X|H], which is the expectation value of X once we have the first head in the 'HTH' pattern. Again, because P(H) = P(T),
E[X|H] = 0.5 E[X|HT] + 0.5 E[X|HH]
If we toss 'HH', we must start the search over again; however, we can use the second head we tossed as the beginning of the 'HTH' pattern and only have to pay a one toss "penalty". Thus, E[X|HH] = (1 + E[X|H]), and we get our second numbered equation:
(2) E[X|H] = 0.5 E[X|HT] + 0.5 (1 + E[X|H])
Again using the fact that P(H) = P(T), we expand E[X|HT] as
E[X|HT] = 0.5 E[X|HTH] + 0.5 E[X|HTT]
Since 'HTH' is what we're looking for, we count the number of tosses since we started searching and say E[X|HTH] = 3. If we toss 'HTT', we have to start searching for the pattern from the beginning, paying a three toss "penalty" in the process; so, E[X|HTT] = 3 + E[X], and
(3) E[X|HT] = 1.5 + 0.5 (3 + E[X])
Substituting equation (3) into equation (2) into equation (1) and working through the algebra, we find E[X] = E[HTH] = 10, which matches the result we got from the simulation performed in the previous post.
Finding E[X] for X = 'HTT' is the same up to the last step of expanding E[X|HT]. Here, E[X|HTT] = 3 since we are looking for 'HTT'. E[X|HTH] forces a restart of the search from the last head along with a two toss "penalty" for the initial 'HT'; so, E[X|HTH] = 2 + E[X|H], and
E[X|HT] = 1.5 + 0.5 (2 + E[X|H])
Making the substitutions into equations (1) and (2) and crunching the algebra, we find E[X] = E[HTT] = 8, again matching the results from the simulation.
Since E[HTT] < E[HTH], we have proved that on average, the pattern 'HTT' should appear before 'HTH' when tossing a fair coin.
No comments:
Post a Comment