# Fair coin from biased coin

Marton Trencseni - Tue 22 March 2022 - Math

## Introduction

Recently I saw a fun puzzle on Hacker News: **given a biased coin, construct a fair coin.** I.e. you're allowed to toss a biased coin repeatedly and record the sequence of results (like `001...1`), and the goal is to return 0/1 with 50%-50% probability; in other words, use the biased coin to simulate a fair coin. I quickly scanned the article and comments to get some ideas, but I wanted to work through the solution(s) myself. The code is up on Github.

Let's assume the biased coin comes up `1` with $p$ probability and `0` with $q=1-p$ probability.

First, some helper functions:

```
from random import random

def coin_flip(bias=0.5):
    if random() < bias:
        return 1
    else:
        return 0

def make_biased_coin(bias=None):
    if bias is None:
        bias = random()
    print(f'Bias = {bias:.3f}')
    return lambda: coin_flip(bias)
```

## von Neumann's probabilistic solution

The most elegant solution is from von Neumann: toss the biased coin twice. The 4 possible outcomes:

- `00`: probability $P=q^2$
- `11`: probability $P=p^2$
- `01`: probability $P=qp$
- `10`: probability $P=pq$

Observe that the last 2 probabilities are the same. Algorithm: if the result is `00` or `11`, repeat the procedure. If the result is `01`, return `1`, if the result is `10`, return `0`.

In code:

```
def probabilistic(biased_coin):
    while True:
        s, w = biased_coin(), biased_coin()
        if s == 0 and w == 1:
            return 1
        elif s == 1 and w == 0:
            return 0
```
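A natural follow-up question: how many biased flips does one fair bit cost? Each round of two flips terminates with probability $2pq$, so the expected number of flips is $2 \cdot \frac{1}{2pq} = \frac{1}{pq}$. A quick simulation (my own sketch, not from the original article) confirms this:

```
from random import random

def coin_flip(bias=0.5):
    return 1 if random() < bias else 0

def von_neumann_cost(bias):
    # Count the biased flips consumed to produce one fair bit.
    flips = 0
    while True:
        s, w = coin_flip(bias), coin_flip(bias)
        flips += 2
        if s != w:
            return flips

p = 0.2
avg = sum(von_neumann_cost(p) for _ in range(100_000)) / 100_000
print(f'measured {avg:.2f} flips per fair bit, predicted {1/(p*(1-p)):.2f}')
```

For $p=0.2$ the prediction is $1/(0.2 \cdot 0.8) = 6.25$ flips per fair bit, which is also what the simulation measures.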

## Statistical solution

I wanted to come up with a solution of my own for this puzzle. I thought: first let's measure (estimate) the bias of the biased coin, and then somehow correct for it. Let's flip the coin $N/2$ times, take the mean of the flips, and call this $\hat{p_1}$, our first estimate of $p$. Then, do another $N/2$ flips (for a total of $N$ flips) to get $\hat{p_2}$, our second estimate of $p$. We can pretend that the first measurement gets the true value of $p$, and that the second measurement varies around $p$ in a symmetric Gaussian way, so $P(\hat{p_2} > \hat{p_1}) = P(\hat{p_2} < \hat{p_1})$.

What if we get unlucky, and $\hat{p_2} = \hat{p_1}$? We repeat the process and try again. In code:

```
def statistical(biased_coin, N):
    while True:
        num_flips = N // 2
        s = sum(biased_coin() for _ in range(num_flips)) / num_flips
        num_flips = N - num_flips
        w = sum(biased_coin() for _ in range(num_flips)) / num_flips
        if s < w:
            return 1
        elif s > w:
            return 0
```
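Why is this exactly fair for any $N$, rather than just approximately? By symmetry: the two halves are exchangeable, so swapping them maps every outcome with $\hat{p_1} < \hat{p_2}$ to an equally likely outcome with $\hat{p_1} > \hat{p_2}$. As a sanity check (my addition, not from the original post), we can verify this exhaustively for $N=4$:

```
from itertools import product

# Enumerate all 2^4 outcomes of one round with N=4, weight each by its
# probability, and compare P(return 1) against P(return 0).
p, q = 0.2, 0.8
prob_one = prob_zero = 0.0
for bits in product((0, 1), repeat=4):
    prob = 1.0
    for b in bits:
        prob *= p if b else q
    s = sum(bits[:2]) / 2  # first estimate of p
    w = sum(bits[2:]) / 2  # second estimate of p
    if s < w:
        prob_one += prob
    elif s > w:
        prob_zero += prob
print(prob_one, prob_zero)  # equal (up to float rounding)
```

Both probabilities come out to $0.2432$ here; only the probability of a tie (and hence the number of repeats) depends on the bias.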

## Permutations solution

This is based on a comment I saw on Hacker News. Flip the biased coin $N$ times to get an $N$-long sequence of bits. Count the number of 1s in the sequence, let's say it's $k$. The probability of getting this specific sequence is $P=p^k q^{N-k}$. Notice that all sequences with $k$ 1s have this same probability. So if we can somehow enumerate all such sequences, we can order them lexicographically and divide them into two sets of equal cardinality: pick the middle bit sequence, and take the ones smaller than the middle and the ones bigger than the middle bit sequence.

Implementing this is trivial with one more restriction: let's only consider bit sequences that are not equal to their own reverse (i.e. if the bit sequence is `01011` then its reverse is `11010`). In this case each bit sequence has exactly one pair, its reverse, and we can create a trivial ordering between these pairs. If we get unlucky, and the bit sequence is the same as its reverse, we repeat the process and try again. In code:

```
def permutations(biased_coin, N):
    while True:
        inst = ''.join(str(biased_coin()) for _ in range(N))
        revr = inst[::-1]
        if inst > revr:
            return 1
        elif inst < revr:
            return 0
```
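To see the pairing concretely: `01011` and its reverse `11010` both contain three 1s, so both occur with probability $p^3 q^2$; one returns 1 and the other returns 0, so the two outcomes exactly cancel. A short exhaustive check (my addition) over all length-$N$ sequences:

```
from itertools import product

# Enumerate every length-N bit sequence, weight it by p^k * q^(N-k),
# and accumulate the probability that permutations() returns 1 vs 0.
p, N = 0.2, 5
prob_one = prob_zero = 0.0
for bits in product('01', repeat=N):
    seq = ''.join(bits)
    k = seq.count('1')
    prob = p**k * (1 - p)**(N - k)
    if seq > seq[::-1]:
        prob_one += prob
    elif seq < seq[::-1]:
        prob_zero += prob
print(prob_one, prob_zero)  # equal: each sequence cancels against its reverse
```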

## All 3 solutions are the same

For the second and third solution, what should $N$ be? For the statistical solution, first I thought that the higher $N$, the more accurately it approximates a fair coin. But this is not the case: all of the solutions are perfectly accurate at all $N$:

```
from time import time
from statistics import mean, stdev

def experiment(num_flips, num_experiments, coin_type, coin_func):
    start = time()
    means = [mean([coin_func() for _ in range(num_flips)]) for _ in range(num_experiments)]
    m, s = mean(means), stdev(means)
    elapsed = (time() - start) * 1000
    print(f'{coin_type}: mean = {m:.3f}, stdev = {s:.3f}, elapsed = {elapsed:.0f} msec')

num_experiments = 1000
num_flips = 1000
biased_coin = make_biased_coin(0.2) # fix bias at 0.2 for measurement
experiment(num_flips=num_flips, num_experiments=num_experiments, coin_type='Fair coin',
           coin_func=lambda: coin_flip())
experiment(num_flips=num_flips, num_experiments=num_experiments,
           coin_type='Probabilistic coin',
           coin_func=lambda: probabilistic(biased_coin))
for N in [2, 50, 100]:
    experiment(num_flips=num_flips, num_experiments=num_experiments,
               coin_type=f'Statistical coin (N={N})',
               coin_func=lambda: statistical(biased_coin, N=N))
for N in [2, 50, 100]:
    experiment(num_flips=num_flips, num_experiments=num_experiments,
               coin_type=f'Permutations coin (N={N})',
               coin_func=lambda: permutations(biased_coin, N=N))
```

Yields something like:

```
Bias = 0.200
Fair coin: mean = 0.501, stdev = 0.016, elapsed = 629 msec
Probabilistic coin: mean = 0.500, stdev = 0.016, elapsed = 1815 msec
Statistical coin (N=2): mean = 0.500, stdev = 0.016, elapsed = 5293 msec
Statistical coin (N=50): mean = 0.500, stdev = 0.016, elapsed = 13198 msec
Statistical coin (N=100): mean = 0.500, stdev = 0.015, elapsed = 24275 msec
Permutations coin (N=2): mean = 0.500, stdev = 0.016, elapsed = 4569 msec
Permutations coin (N=50): mean = 0.499, stdev = 0.016, elapsed = 18403 msec
Permutations coin (N=100): mean = 0.499, stdev = 0.016, elapsed = 35645 msec
```
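As a cross-check (my own arithmetic, not from the original measurements), the stdev of ≈0.016 appearing in every row is exactly what the binomial distribution predicts for the mean of 1000 fair flips:

```
# The mean of num_flips Bernoulli(0.5) draws has standard deviation
# sqrt(p * (1 - p) / num_flips) = sqrt(0.25 / 1000) ≈ 0.0158.
num_flips = 1000
predicted_stdev = (0.25 / num_flips) ** 0.5
print(f'{predicted_stdev:.3f}')  # prints 0.016
```

The fact that all rows match this prediction is what tells us every variant behaves like a genuinely fair coin.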

Given that, we should run them at the minimal $N$, which is $N=2$ coin flips. Then, if we think about it, the statistical and permutations based solutions trivially reduce to the von Neumann solution at $N=2$ coin flips:

- statistical: flip the coin once, it's either 0 or 1, so the first mean $\hat{p_1}$ is either 0 or 1. Flip it a second time to get $\hat{p_2}$. The conditions $\hat{p_1} < \hat{p_2}$ and $\hat{p_1} > \hat{p_2}$ reduce to the `01` and `10` cases (probability $P=pq$), and we flip again in the `00` and `11` cases.
- permutations: there are 4 possible outcomes of 2 flips. `00` and `11` are their own reverse, so we flip again; in the other two cases of `01` and `10` we return 0 or 1 (probability $P=pq$).

## Conclusion

I had great fun thinking through this trivial puzzle.