10 ways to iterate from 0 to 1 with deciles
Marton Trencseni - Fri 14 May 2021 - Data
Suppose we want to iterate from 0 to 1 in steps of 0.1, like 0.0, 0.1, 0.2, ... 0.9. This happens regularly in Data Science notebooks.
Note that iteration like this is usually exclusive at the end, i.e. we expect it to stop at 0.9, the same way range(0, 10) stops at 9.
Approach #1: Use range, divide manually:
for i in range(10):
    print(i/10)
> 0.0
> 0.1
> 0.2
> 0.3
> 0.4
> 0.5
> 0.6
> 0.7
> 0.8
> 0.9
The downside of this basic solution is that the iteration logic leaks into the print() code. Also, what if we want to iterate with steps of 0.42?
for i in range(10):
    print(i*0.42)
In real code, it would be less clear that the multiplication is part of the iteration logic, and not there for some other reason.
Approach #2: Use arange from numpy
The overall best practice is to use arange from numpy:
import numpy as np
for i in np.arange(0, 1, 0.1):
    print(i)
> 0.0
> 0.1
> 0.2
> 0.30000000000000004
> 0.4
> 0.5
> 0.6000000000000001
> 0.7000000000000001
> 0.8
> 0.9
The only downside here is that arange creates the whole array in memory, instead of yielding a Python iterator. This is a non-issue in real life, but let's geek out and see what the options are to avoid creating the whole list/array in memory.
Approach #3: Use count() from itertools
The closest built-in helper function Python has is count() from itertools. You supply start and step arguments, but it counts infinitely, so you have to take care of the stopping condition:
from itertools import count
for i in count(0, 0.1): # yields 11 times, a logical error!
    if i >= 1: break
    print(i)
> 0
> 0.1
> 0.2
> 0.30000000000000004
> 0.4
> 0.5
> 0.6
> 0.7
> 0.7999999999999999
> 0.8999999999999999
> 0.9999999999999999 <- due to floating point precision issues, it also returns ~1
This is very ugly because of the if, but it also has a big logical error: it yields a value 11 times instead of 10 times. Because of floating point precision issues, the accumulated value after ten steps is slightly below 1, so the loop also returns ~1 and doesn't stop at ~0.9. We'll return to floating point issues in a bit.
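To make the cause of the extra value concrete, here is a minimal sketch (the variable names are just for illustration) that mimics what count(0, 0.1) does internally, namely adding the step over and over, and compares it to a single multiplication:

```python
# Repeatedly adding 0.1 accumulates rounding error, which is
# the same thing a running counter with a float step does.
total = 0.0
for _ in range(10):
    total += 0.1

print(total)        # slightly below 1, so a "< 1" check still passes
print(total < 1.0)  # True: this is why an 11th value is yielded
print(10 * 0.1)     # a single multiplication rounds to exactly 1.0
```

This suggests that computing each value as start + i * step, instead of accumulating, keeps the error from building up, although it does not remove it.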
Approach #4: Use count() and islice() from itertools
We can use islice to return the first int((stop-start)/step) elements from the count() iterator. This saves us the ugly manual if.
from itertools import count, islice
def frange(start, stop, step):
    # islice(iterable, n) takes the first n elements; n is a count, not the start value
    return islice(count(start, step), int((stop-start)/step))
for i in frange(0, 1, 0.1):
    print(i)
> 0
> 0.1
> 0.2
> 0.30000000000000004
> 0.4
> 0.5
> 0.6
> 0.7
> 0.7999999999999999
> 0.8999999999999999
In this case the floats work out in our favor and it yields 10 elements.
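The floats do not always work out in our favor, though. As a hedged illustration using the same count-and-islice idea, the sketch below asks for the range from 0 to 0.3: the division lands just below 3, int() truncates it to 2, and one value goes missing.

```python
from itertools import count, islice

def frange(start, stop, step):
    # Take the first int((stop - start) / step) values from an infinite counter.
    return islice(count(start, step), int((stop - start) / step))

# We would expect three values here: 0, 0.1 and roughly 0.2.
values = list(frange(0, 0.3, 0.1))

print(0.3 / 0.1)    # just below 3 because of floating point rounding
print(len(values))  # 2, one value short
```

So the number of elements computed from a float division is sensitive to rounding in both directions, depending on the arguments.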
Approach #5: Use map()
Let's be functional and use map():
for i in map(lambda x: x / 10, range(10)):
    print(i)
> 0.0
> 0.1
> 0.2
> 0.3
> 0.4
> 0.5
> 0.6
> 0.7
> 0.8
> 0.9
Approach #6: Use map() in a cleaner way
We can put the map in a function to make it cleaner:
def frange(start, stop, step):
    return map(lambda x: start + x * step, range(int((stop-start)/step)))
for i in frange(0, 1, 0.1):
    print(i)
> 0.0
> 0.1
> 0.2
> 0.30000000000000004
> 0.4
> 0.5
> 0.6000000000000001
> 0.7000000000000001
> 0.8
> 0.9
Approach #7: Write it yourself
Or we can just implement frange ourselves:
def frange(start, stop, step):
    while start < stop:
        yield start
        start += step
for i in frange(0, 1, 0.1): # yields 11 times, a logical error!
    print(i)
> 0
> 0.1
> 0.2
> 0.30000000000000004
> 0.4
> 0.5
> 0.6
> 0.7
> 0.7999999999999999
> 0.8999999999999999
> 0.9999999999999999 <- due to floating point precision issues, it also returns ~1
Approach #8: Epsilons
Some of the approaches had logical problems, because they yielded 11 times, also yielding ~1, due to floating point issues (the exact case when this is a problem is platform and argument dependent). One way to address this is to assume floating point precision problems and explicitly handle ε differences to some user-specified precision. But this is such a bad idea, I will stop here.
Approach #9: Use decimal
The smart way to get rid of the floating point issues above is to use the Python standard decimal library:
from decimal import Decimal, getcontext
# frange here is the generator from Approach #7
for i in frange(Decimal('0'), Decimal('1'), Decimal('0.1')):
    print(i)
> 0
> 0.1
> 0.2
> 0.3
> 0.4
> 0.5
> 0.6
> 0.7
> 0.8
> 0.9
Notice that the arguments to the Decimal constructor are strings, not numbers. Passing in a float such as 0.1 carries its binary rounding error into the Decimal:
for i in frange(Decimal(0), Decimal(1), Decimal(0.1)):
    print(i)
> 0
> 0.1000000000000000055511151231
> 0.2000000000000000111022302462
> 0.3000000000000000166533453693
> 0.4000000000000000222044604924
> 0.5000000000000000277555756155
> 0.6000000000000000333066907386
> 0.7000000000000000388578058617
> 0.8000000000000000444089209848
> 0.9000000000000000499600361079
Note that you can set the precision for the decimal library to cut off the ugly trailing noise, and then you can avoid using ugly strings in the constructor:
getcontext().prec = 6
for i in frange(Decimal(0), Decimal(1), Decimal(0.1)):
    print(i)
> 0
> 0.100000
> 0.200000
> 0.300000
> 0.400000
> 0.500000
> 0.600000
> 0.700000
> 0.800000
> 0.900000
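One caveat is that assigning to getcontext().prec changes the precision for all later decimal arithmetic in the current thread. A minimal sketch of a more contained variant, assuming the generator-style frange from Approach #7, uses the standard library's localcontext() so the lower precision only applies inside the with block:

```python
from decimal import Decimal, getcontext, localcontext

def frange(start, stop, step):
    # Generator version: keep adding step until we reach stop.
    while start < stop:
        yield start
        start += step

with localcontext() as ctx:
    ctx.prec = 6  # only affects arithmetic inside this block
    # list() consumes the generator here, so the additions run at precision 6
    values = list(frange(Decimal(0), Decimal(1), Decimal(0.1)))

for v in values:
    print(v)

print(getcontext().prec)  # the surrounding precision is unchanged
```

Note that the generator has to be consumed inside the with block, since the additions only run when values are requested.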
Approach #10: Alternative signatures
The floating point troubles suggest that frange(start, stop, step) is maybe not the best signature; frange(start, step, num) is cleaner:
def frange(start, step, num):
    for i in range(num):
        yield start
        start += step
This way, we never have to worry about floating point issues yielding an extra value, even without using decimal:
for i in frange(start=0, step=0.1, num=10):
    print(i)
> 0
> 0.1
> 0.2
> 0.30000000000000004
> 0.4
> 0.5
> 0.6
> 0.7
> 0.7999999999999999
> 0.8999999999999999
Also, here backward iteration works as expected, unlike in some of the solutions given before:
for i in frange(start=0, step=-0.1, num=10):
    print(i)
> 0
> -0.1
> -0.2
> -0.30000000000000004
> -0.4
> -0.5
> -0.6
> -0.7
> -0.7999999999999999
> -0.8999999999999999
Or:
def frange(start, step, num, backward=False):
    if backward:
        step *= -1
    for i in range(num):
        yield start
        start += step
The downside here is that if we want to switch the step size from 0.1 to 0.05, we have to remember to also change num to still get to 1.
There is also a third possible signature, linspace from numpy:
numpy.linspace(start, stop, num=50, endpoint=True, ...): return evenly spaced numbers over a specified interval.
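In the spirit of the iterator-based approaches above, the start, stop, num signature can also be sketched as a lazy generator in plain Python. The name flinspace and its exact behavior are assumptions made for this illustration; it only mimics the start, stop, num and endpoint parameters of numpy.linspace and is not numpy's implementation.

```python
def flinspace(start, stop, num, endpoint=True):
    """Lazily yield num evenly spaced values from start towards stop (hypothetical helper)."""
    if num <= 0:
        return
    # With endpoint=True the last value is stop, so there are num - 1 gaps.
    divisor = num - 1 if endpoint else num
    if divisor == 0:
        yield start
        return
    step = (stop - start) / divisor
    for i in range(num):
        # Multiply instead of accumulating, so rounding errors do not add up.
        yield start + i * step

# Deciles from 0 up to, but not including, 1.
deciles = list(flinspace(0, 1, num=10, endpoint=False))
for d in deciles:
    print(d)
```

Because the number of values is passed in explicitly, rounding can never add or drop an element; it can only perturb the individual values slightly.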
Conclusion
In real life, my recommendation:
- for the start, stop, step signature, use numpy.arange()
- for the start, stop, num signature, use numpy.linspace()
- for the start, step, num signature, use the frange() above.