# Randomness extractors: making fair coins out of biased coins

Marton Trencseni - Sun 26 May 2024 - Data

## Introduction

In a previous post titled Fair coin from biased coin, I looked at the problem of creating a uniform random coin given access to a biased coin. I looked at multiple approaches, and determined that they're actually all the same in some sense. Recently, while reading Turing award winner Avi Widgerson's excellent book Mathematics and Computation, I learned that this is a simple case of a more general problem of **deterministic randomness extraction**: given a stream of random input bits, which are potentially biased and correlated, create a stream of output bits which are uniform random, ie. a series of fair coin tosses. The output bit stream, like in the Von Neumann case, is allowed to have a lower bitrate, ie. we don't always emit output bits on input.

Using the language of entropy, the input bit stream has non-zero, but less than 1-bit of entropy per bit: $ 0 < H_{input} < 1 $. We want to create an output bit stream which does have 1-bit of entropy per bit, or close to it: $ H_{output} \approx 1 $. The price we pay is (potentially) emitting less bits, ie. if the input bit stream has some bitrate $ r_{input} \: bits/sec $, then for the output bit stream $ r_{output} < r_{input} $.

I will skip the exact mathematical definitions and theorems, and instead go step-by-step to build more sophisticated randomness extractors, with Python code samples. The code is up on Github.

## Base classes

First, let's define some Python base classes that capture the core concepts of our problem domain. A `BitSource`

is something that emits bits (represented as `int`

s here), while a `RandomnessExtractor`

is something which internally gets bits from a `BitSource`

, and sometimes emits a bit, sometimes returns `None`

:

```
class BitSource(ABC):
@abstractmethod
def next(self) -> int:
pass
class RandomnessExtractor(ABC):
@abstractmethod
def next(self) -> Optional[int]:
pass
```

Let's define a `BiasedBitSource`

:

```
class BiasedBitSource(BitSource):
def __init__(self, bias=None):
if bias is None:
bias = random()
self.bias = bias
def next(self) -> int:
if random() < self.bias:
return 0
else:
return 1
```

## Trivial case: fair coin from fair coin

Let's define a simple `PasstroughExtractor`

, which just returns the input bit stream:

```
class PassthroughExtractor(RandomnessExtractor):
def __init__(self, bit_source: BitSource):
self.bit_source = bit_source
self.bits = []
def next(self) -> Optional[int]:
bit = self.bit_source.next()
result = bit
print_bits(bit, result)
return result
```

The trivial application then is:

```
bit_source = BiasedBitSource(bias=0.5)
extractor = PassthroughExtractor(bit_source)
for _ in range(100):
result = extractor.next()
```

Let's see it in action. The first column is always the input bit stream, the second column is the output bit stream:

```
1 1
1 1
1 1
0 0
1 1
...
```

## von Neumann: fair coin from biased coin

Let's make the following two assumptions about the input bit stream:

- each bit in the input stream is an independent (potentially biased) coin toss
- each bit in the input stream has identical bias (say $p=0.7$ for
`1`

)

In other words, the input stream is made up of indepedent identical (iid) random variables. In this case, the von Neumann rule works:

- break the input stream into non-overlapping 2 bit pairs
- if the 2 bits are
`00`

or`11`

, don't emit - if the 2 bits are
`01`

, emit 1 - if the 2 bits are
`10`

, emit 0

This works because the probability for both `01`

and `10`

is $p(1-p)$. Interestingly, this is the best possible approach, whether we know the bias of the input bit stream or not. The code for the `VonNeumannExtractor`

is below:

```
class VonNeumannExtractor(RandomnessExtractor):
def __init__(self, bit_source: BitSource):
self.bit_source = bit_source
self.bits = []
def next(self) -> Optional[int]:
bit = self.bit_source.next()
if bit not in [0, 1]:
raise ValueError("Bit must be 0 or 1.")
self.bits.append(bit)
result = None
if len(self.bits) == 2:
if self.bits == [0, 1]:
result = 1
elif self.bits == [1, 0]:
result = 0
self.bits = [] # Reset for the next pair of bits
print_bits(bit, result)
return result
```

Using it on a biased source with unknown bias:

```
bit_source = BiasedBitSource()
extractor = VonNeumannExtractor(bit_source)
for _ in range(100):
result = extractor.next()
```

Prints something like:

```
0 .
1 1
1 .
0 0
0 .
0 .
...
```

## Varying bias and the parity extractor

In the above case, we assumed that the input stream is made up of independent, identically distributed bits. What if we give up the second condition, ie. we allow each bit in the input bit stream to have different bias? In this case, the Von Neumann trick will not work, as subsequent bits no longer have identical probabilities, so `01`

and `10`

no longer have equal probability. In this case, it can be shown that taking N bit blocks of the input stream, and returning the parity bit as the output stream reduces the bias, and as N gets larger, makes the output stream closer to an even coin toss.

Let's define an input stream with varying bias:

```
class VaryingBiasedBitSource(BitSource):
def __init__(self, min_cutoff=0, max_cutoff=1):
self.min_cutoff = min_cutoff
self.max_cutoff = max_cutoff
self.bias = random()
def next_bias(self, draw):
bias = draw**2
if random() < 0.5:
bias = 1 - bias
bias = max(self.min_cutoff, bias)
bias = min(self.max_cutoff, bias)
return bias
def next(self) -> int:
draw = random()
if draw < self.bias:
result = 0
else:
result = 1
self.bias = self.next_bias(draw)
return result
```

Note that there is no one right way to define the way the bits vary. I chose a specific rule to compute the next bias (with some functional and some random elements). Next, let's define the `ParityExtractor`

:

```
class ParityExtractor(RandomnessExtractor):
def __init__(self, bit_source: BitSource, N: int):
self.bit_source = bit_source
if N <= 0:
raise ValueError("N must be a positive integer.")
self.N = N
self.bits = []
def next(self) -> Optional[int]:
bit = self.bit_source.next()
if bit not in [0, 1]:
raise ValueError("Bit must be 0 or 1.")
self.bits.append(bit)
result = None
if len(self.bits) == self.N:
result = sum(self.bits) % 2
self.bits = [] # Reset for the next block of N bits
print_bits(bit, result)
return result
```

Let's try it:

```
bit_source = VaryingBiasedBitSource(min_cutoff=0.1, max_cutoff=0.7)
extractor = ParityExtractor(bit_source=bit_source, N=4)
for _ in range(100):
result = extractor.next()
```

Prints something like:

```
0 .
1 .
0 .
0 1
0 .
1 .
1 .
0 0
0 .
1 .
1 .
1 1
0 .
...
```

## Markov chain input stream and the Blum extractor

Let's now look at a case when the input bits are emitted by a Markov Chain process, ie. there are $k$ states, each state has transitions to two other states, with one transition having $p$ probability and emitting a 0 bit, the other having $1-p$ probability and emiting a 1 bit. Let's define `MarkovChainBitSource`

with an example:

```
class MarkovChainBitSource(BitSource):
def __init__(self, start_state: str):
self.current_state = start_state
@abstractmethod
def get_transitions(self, state: str) -> Dict[int, Tuple[str, float]]:
pass
def next(self) -> int:
transitions = self.get_transitions(self.current_state)
rand = random()
cumulative_probability = 0.0
for bit, (next_state, probability) in transitions.items():
cumulative_probability += probability
if rand < cumulative_probability:
self.current_state = next_state
return bit
raise RuntimeError("Probabilities do not sum to 1.")
class ExampleMarkovChainBitSource(MarkovChainBitSource):
def __init__(self):
super().__init__(start_state='A')
self.transitions = {
'A': {0: ('B', 0.5), 1: ('C', 0.5)},
'B': {0: ('A', 0.7), 1: ('C', 0.3)},
'C': {0: ('A', 0.4), 1: ('B', 0.6)},
}
def get_transitions(self, state: str) -> Dict[int, Tuple[str, float]]:
return self.transitions[state]
```

Further, let's assume that our randomness extractor has access to the internal state of the Markov chain. In this case, the problem can be reduced to the Von Neumann case: for each state, we need to remember the last input bit the Markov chain produced when we were in this state the last time, take the current bit produced exiting this state, and apply the Von Neumann rule. Implementing this in Python is surprisingly simple:

```
class BlumExtractor(RandomnessExtractor):
def __init__(self, bit_source: MarkovChainBitSource):
self.bit_source = bit_source
self.state_bits = defaultdict(list) # Stores the last bits for each state
def next(self) -> Optional[int]:
current_state = self.bit_source.current_state
bit = self.bit_source.next()
self.state_bits[current_state].append(bit)
result = None
if len(self.state_bits[current_state]) == 2:
bits = self.state_bits[current_state]
if bits == [0, 1]:
result = 1
elif bits == [1, 0]:
result = 0
self.state_bits[current_state] = []
print_bits(bit, result)
return result
```

Let's try it:

```
bit_source = ExampleMarkovChainBitSource()
extractor = BlumExtractor(bit_source)
for _ in range(100):
result = extractor.next()
```

Prints something like:

```
0 .
1 .
1 .
0 0
1 1
0 0
0 .
0 .
1 1
...
```

## General case

Can we produce perfectly uniform bits from a biased input source that has *some* randomness (so it's not a deterministic process), but the bits may be correlated, and we don't know how? The answer is no, if we don't know anything about the structure of the input stream, there is no general way to create even a single perfectly uniform bit.

## Conclusion

An interesting test for this code would be to apply statistical tests on the generated output stream to test how random they are. I will do this in a subsequent post.