24 November 2007

A simple probability puzzle,

A simple probability puzzle, from Doubting Toshuo via Anarchaia.

Paraphrasing a bit: imagine we flip a fair coin until either THH or THT appear on three consecutive flips. Which is more likely to come up first? Alternatively, which has the smaller expected time to its first appearance? (These are different questions; the first is "which wins the race" and the second is "which has the faster average time". They don't necessarily have the same answer in general.)

The answer to both is THH. This is analyzed in Flajolet and Sedgewick's forthcoming book Analytic Combinatorics, section I.4.2. If you want to figure out either the probability that THH appears before THT or the distribution of the first time at which they appear, it's a basically routine problem in the combinatorics of words.

Most people say that they're equally likely to come up first, and that they have the same expected time to the first appearance. This is because they're confusing this problem with a closely related problem: which pattern appears more often in the long run? And both patterns have probability 1/8 of appearing in the positions n through n+2 for any n; the expected value of the number of appearances of THH, or of THT, which start by position n is in fact n/8. But patterns like THT, which can overlap themselves (the technical term is "autocorrelation"), tend to occur in clusters; for example, we can get three occurrences of THT in just seven coin flips: THTHTHT -- but to get three occurrences of THH takes at least nine coin flips. As Flajolet and Sedgewick write:
On the other hand, patterns of the same length have the same expected number of occurrences, which is puzzling. The catch is that, due to overlaps of [pattern 1] with itself, occurrences of [pattern 1] tend to occur in clusters, but, then, clusters tend to be separated by wider gaps than for [pattern 2]; eventually, there is no contradiction.

They use different patterns, but the same effect occurs for their patterns as for Toshuo's. There might be a halfway decent bar bet here; unfortunately most of the people I would make a bar bet with are my friends, and my friends know I'm a probabilist, so they'd be suspicious. Especially since I've written about this here.

4 comments:

I. J. Kennedy said...

; Inputs: A and B are bit-strings of length N,
; right-justified in EAX and EBX respectively;
; 1 represents heads and 0 represents tails.
; N ≤ 30 is in ECX, and A ≠ B.
;
; Output: ESI/EDI is the probability that A
; precedes B in a sequence of flips.

Prob:
PUSH EAX
PUSH EBX
MOV EBX, EAX
CALL Inner
MOV ESI, EDX
POP EBX
CALL Inner
SUB ESI, EDX
MOV EAX,EBX
CALL Inner
MOV EDI, EDX
POP EBX
CALL Inner
SUB EDI, EDX
ADD EDI, ESI
RET

Inner:
PUSHAD
NEG ECX
SHL EBX, CL
MOV EDI, 1
MOV ESI,80000000H
XOR EBP,EBP
next: ROR EAX, 1
MOV EDX, EAX
XOR EDX, EBX
AND EDX, ESI
JNZ skip
ADD EBP, EDI
skip: SAR ESI, 1
SHL EDI, 1
INC ECX
JNZ next
MOV 20[ESP], EBP
POPAD
RET

Anonymous said...

I think that in the particular case you mention there is a tie for "which wins the race" (though you are completely correct regarding expected number of flips to first match).

The proof is basically that to complete THH or THT on flip N you must be in state TH on flip N-1. At this point it is equally likely that you flip heads or tails. (At no other point do you have any chance of hitting either pattern, and once you are at that point past history is irrelevant.)

dan said...

You might be interested in one application of this sort of thing, which is to the area of seeding of local alignments (the heads and tails are whether corresponding DNA positions are the same or different).

(this might amuse, or it might not...)

Anonymous said...

Hello,nice post thanks for sharing?. I just joined and I am going to catch up by reading for a while. I hope I can join in soon.