Python types for Data Scientists - Part I

Marton Trencseni - Fri 08 April 2022 - Python

Introduction

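Python has always had types, and since version 3.5 it also supports type annotations (type hints), which external tools such as mypy can check. Variable annotations such as age: int arrived in 3.6. In this post we will see:

  • how to turn on type checking in ipython notebooks
  • how to write type hints
  • some simple examples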

The official documentation is here. The ipython notebook is up on GitHub.


Types and type expressions

Let's use the built-in type function to get the types of some common Python expressions:

type(42), type(int('42')), type('hello'), type([]), type({}), type({1, 2, 3}), type(None)

Output:

(int, int, str, list, dict, set, NoneType)

Let's get more abstract:

type(int), type(type(42)), type(str), type(42) == type(int), type(int) == type(str)

Output:

(type, type, type, False, True)
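
To make the last two results less mysterious, here is a minimal sketch that spells out the comparisons one at a time. The variable names are only for illustration and are not part of the original notebook.

```python
# The type of a value is its class; the type of a class is the metaclass `type`.
value_type = type(42)    # the class int
class_type = type(int)   # the metaclass type

print(value_type is int)          # True
print(class_type is type)         # True
print(value_type == class_type)   # False: int is not type
print(type(int) is type(str))     # True: both classes are instances of type
print(type(type) is type)         # True: type is its own type
```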

Dynamic vs static typing

Python, by default, is dynamically typed. This means that a variable can start off as an int and later become a str, a list or None:

age = 42
age = 'hello'
age = [42, 'hello']
age = None

However, we can annotate a variable with the type we intend it to have:

age: int

By default, Python treats the : int as a pure annotation and does not enforce it at runtime:

age: int = 42
age = 'hello'       # no problem since type checking is off
age = [42, 'hello'] # no problem since type checking is off
age = None          # no problem since type checking is off
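
The annotation is not thrown away, though: Python records it, it just never checks it. A small sketch of this, using a hypothetical function set_age that is not part of the original notebook:

```python
from typing import get_type_hints


def set_age(age: int) -> int:
    # The annotation says int, but nothing verifies it at runtime.
    return age


result = set_age('hello')        # runs without any error
hints = get_type_hints(set_age)  # the annotations are still recorded

print(result)  # hello
print(hints)   # {'age': <class 'int'>, 'return': <class 'int'>}
```

This is why a separate tool is needed: the information is there, but plain Python does not act on it.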

To have the annotations checked, we can load a static type checker. In ipython notebooks, first pip install nb_mypy, then load the extension and turn it on:

%load_ext nb_mypy
%nb_mypy On

Once we do this, we can get type checking each time a cell is executed:

age: int
age = 42            # okay
age = 'hello'       # not okay
age = [42, 'hello'] # not okay
age = None          # not okay

The output will be:

error: Incompatible types in assignment (expression has type "str", variable has type "int")
error: Incompatible types in assignment (expression has type "List[object]", variable has type "int")
error: Incompatible types in assignment (expression has type "None", variable has type "int")

Container types

Containers can also be typed (the built-in generic syntax list[int] requires Python 3.9 or later):

l: list[int]
l = [1, 2, 3]    # okay
l = 1            # not okay!
l = [1, "hello"] # not okay!

Output:

error: Incompatible types in assignment (expression has type "int", variable has type "List[int]")
error: List item 1 has incompatible type "str"; expected "int"

Or:

l: list[int] = []
l.append(1)       # okay
l.append("hello") # not okay!

Output:

error: Argument 1 to "append" of "list" has incompatible type "str"; expected "int"
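
The same syntax extends to other built-in containers and can be nested. A sketch, assuming Python 3.9 or later and hypothetical variable names; the commented-out lines are ones a checker such as mypy would be expected to reject, even though plain Python would run them:

```python
# A dict mapping column names to lists of floats.
measurements: dict[str, list[float]] = {'height': [1.0, 2.0, 3.0]}
measurements['weight'] = [10.0, 20.0]   # okay
# measurements['age'] = ['old']         # not okay: str items in a list[float]

# A fixed-length tuple and a set of strings.
point: tuple[float, float] = (1.5, 2.5)
labels: set[str] = {'a', 'b'}
# point = (1.5, 2.5, 3.5)               # not okay: three items, two expected

print(sorted(measurements))  # ['height', 'weight']
```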

Functions

Functions can also carry type annotations:

def f(i: int) -> list[str]:
    return ['hello']

f(42)          # okay
f('hello')     # not okay
r: int = f(42) # not okay

Output:

error: Argument 1 to "f" has incompatible type "str"; expected "int"
error: Incompatible types in assignment (expression has type "List[str]", variable has type "int")
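
In data work, annotated functions often take and return containers. A minimal sketch with a hypothetical helper column_means (not from the original notebook), assuming Python 3.9 or later:

```python
def column_means(columns: dict[str, list[float]]) -> dict[str, float]:
    """Return the mean of each non-empty column."""
    return {
        name: sum(values) / len(values)
        for name, values in columns.items()
        if values  # skip empty columns to avoid dividing by zero
    }


means = column_means({'height': [1.0, 2.0, 3.0], 'weight': [10.0, 20.0]})
print(means)  # {'height': 2.0, 'weight': 15.0}
```

With type checking on, a call such as column_means([1, 2, 3]) should be reported before the code ever runs, since a list is not a dict[str, list[float]].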

Classes

Class member variables and methods can be annotated as well:

class A:
    memberVariable: int
    def __init__(self, i: int) -> None:
        self.memberVariable = i
    def increase(self) -> int:
        self.memberVariable += 1
        return self.memberVariable

Then:

a = A(42)           # okay
a.increase()        # okay
a = A('hello')      # not okay
None + a.increase() # not okay

Output:

error: Argument 1 to "A" has incompatible type "str"; expected "int"
error: Unsupported operand types for + ("None" and "int")
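
It is worth seeing what happens to such code without a checker. In the sketch below, a hypothetical class Counter mirrors A above; plain Python accepts the wrong argument silently, and the mistake only surfaces later, as a runtime TypeError inside increase:

```python
class Counter:
    count: int

    def __init__(self, start: int) -> None:
        self.count = start

    def increase(self) -> int:
        self.count += 1
        return self.count


bad = Counter('hello')   # accepted at runtime, despite the int annotation

try:
    bad.increase()       # 'hello' += 1 fails only here
    failed = False
except TypeError:
    failed = True

print(failed)  # True
```

A static checker moves this failure from the moment increase is called to the moment the cell is checked, which is the main practical benefit of the annotations.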

Conclusion

In the next part, I will look at more complicated type expressions, such as Optional.