# Python Standard Library changes in recent years

With each major Python release, all the attention goes to the new language features: the walrus operator, dictionary merging, pattern matching. There is also a lot of writing about the `asyncio` and `typing` modules: they are developing rapidly and are obviously important to the core team.

The rest of the standard library modules receive undeservedly little attention. I want to fix this and tell you about the novelties introduced in versions 3.8–3.10.

This is not an exhaustive list, of course. I write only about those changes that interested me personally. But since I am not too different from the “average” Python backend developer, it is likely that you will also be interested. Let me know if I missed something.

The modules are in alphabetical order, so if you get bored with the first (little-known) ones, do not be discouraged: it gets more exciting further on.

array • base64 • bisect • builtins • dataclasses • datetime • fractions • functools • glob • graphlib • itertools • math • random • shlex • shutil • statistics • zoneinfo

All new features and improvements in the article are accompanied by examples. You can try them in the playground or run locally. If you have an older local Python, run it using Docker:

```
$ docker run -it --rm python:3.10-alpine
```

## array

The `array` module provides compact typed numeric arrays. It is used much less frequently than its famous `list` counterpart.

The `array.index()` method finds a value in the array and returns the index of the first match. It now supports optional `start` and `stop` parameters, which define the search interval (3.10+):

```
from array import array
arr = array("i", [7, 11, 19, 42])
idx = arr.index(11)
# idx == 1
idx = arr.index(11, 2)
# ValueError: array.index(x): x not in array
```

Contributed by: Anders Lorentsen • Zackery Spytz

## base64

The `base64` module encodes binary data into ASCII strings using the Base16, Base32, and Base64 algorithms.

It received a couple of new functions: `b32hexencode()` and `b32hexdecode()`, which use the 32-character Extended Hex alphabet defined in RFC 4648 (3.10+):

```
import base64
# named `data` to avoid shadowing the built-in `bytes` type
data = b"python is awesome"
base64.b32encode(data)
# b'OB4XI2DPNYQGS4ZAMF3WK43PNVSQ===='
base64.b32hexencode(data)
# b'E1SN8Q3FDOG6ISP0C5RMASRFDLIG===='
```

Contributed by: Filipe Laíns

## bisect

The `bisect` module works with sorted lists using binary search. The main functions are:

```
import bisect
lst = [7, 11, 19, 42]
idx = bisect.bisect(lst, 12)
# idx == 2
bisect.insort(lst, 12)
# [7, 11, 12, 19, 42]
```

Since version 3.10, all module functions support the optional `key` parameter. It is a function that extracts the comparison value from a list item. It is convenient when the elements cannot be compared directly:

```
import bisect
import operator
p1 = {"id": 11, "name": "Diane"}
p2 = {"id": 12, "name": "Bob"}
p3 = {"id": 13, "name": "Emma"}
key = operator.itemgetter("name")
people = sorted([p1, p2, p3], key=key)
# Bob, Diane, Emma
idx = bisect.bisect(people, "Dan")
# TypeError: '<' not supported between instances of 'str' and 'dict'
idx = bisect.bisect(people, "Dan", key=key)
# idx == 1
```

Contributed by: Raymond Hettinger

## builtins

The `builtins` module contains all the built-in functions and classes that programmers use without imports: `int`, `list`, `len()`, `open()` and the like.

```
import builtins
list is builtins.list
# True
len is builtins.len
# True
```

The **string** received the `str.removeprefix()` and `str.removesuffix()` methods, which cut off the string head and tail respectively (3.9+):

```
s = "Python is awesome"
s.removeprefix("Python is ")
# 'awesome'
s.removesuffix(" is awesome")
# 'Python'
```
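Before these methods appeared, people often reached for `lstrip()` and got surprising results, because it strips a *set of characters* rather than a prefix. A quick illustration, where the URL is just a sample value:

```python
url = "www.wikipedia.org"

# lstrip() removes any leading 'w' and '.' characters,
# so it also eats the first letter of "wikipedia".
stripped = url.lstrip("www.")

# removeprefix() removes exactly the given prefix, once.
removed = url.removeprefix("www.")

# If the prefix is not there, the string comes back unchanged.
unchanged = url.removeprefix("ftp.")
```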

The **integer** received the `int.bit_count()` method, which returns the number of ones in the binary representation of the integer (3.10+):

```
n = 42
bin(n)
# '0b101010'
n.bit_count()
# 3
```

The **dictionary** methods `dict.keys()`, `dict.values()` and `dict.items()` return view objects that reference dictionary data. Previously, it was impossible to get a reference back to the dictionary from these objects, but now it can be done through the `.mapping` attribute (3.10+):

```
people = {
    "Diane": 70,
    "Bob": 78,
    "Emma": 84,
}
keys = people.keys()
# dict_keys(['Diane', 'Bob', 'Emma'])
keys.mapping["Bob"]
# 78
```

**Collection merging.** The `zip()` function received the `strict` parameter. It ensures that the iterables are of the same length (3.10+):

```
keys = ["Diane", "Bob", "Emma"]
vals = [70, 78, 84, 42]
pairs = zip(keys, vals)
list(pairs)
# [('Diane', 70), ('Bob', 78), ('Emma', 84)]
pairs = zip(keys, vals, strict=True)
list(pairs)
# ValueError: zip() argument 2 is longer than argument 1
```

Contributed by: Dennis Sweeney • Niklas Fiekas • Brandt Bucher

## dataclasses

The `dataclasses` module generates classes from a declarative field specification.

Dataclasses can now use `slots`, which are great for creating compact data objects with a fixed set of properties (3.10+).

Regular dataclass:

```
from dataclasses import dataclass

@dataclass
class Person:
    id: int
    name: str

diane = Person(id=11, name="Diane")
diane.__dict__
# {'id': 11, 'name': 'Diane'}
diane.salary = 70
# ok
```

Dataclass with slots:

```
from dataclasses import dataclass

@dataclass(slots=True)
class SlotPerson:
    id: int
    name: str

bob = SlotPerson(id=12, name="Bob")
bob.__dict__
# AttributeError: 'SlotPerson' object has no attribute '__dict__'
bob.__slots__
# ('id', 'name')
bob.salary = 78
# AttributeError: 'SlotPerson' object has no attribute 'salary'
```

Besides, the dataclass can now be forced to accept keyword-only parameters when creating an object (3.10+):

```
from dataclasses import dataclass

@dataclass(kw_only=True)
class KeywordPerson:
    id: int
    name: str

diane = KeywordPerson(id=11, name="Diane")
# ok
diane = KeywordPerson(11, "Diane")
# TypeError: KeywordPerson.__init__() takes 1 positional argument but 3 were given
```

Contributed by: Yurii Karabas • Eric V. Smith

## datetime

The `datetime` module (unsurprisingly) deals with date and time.

It received new `date.fromisocalendar()` and `datetime.fromisocalendar()` constructors, which create a date from the `(year, week, week_day)` trio (3.8+):

```
import datetime as dt
day = dt.date(2022, 9, 13)
day.isocalendar()
# datetime.IsoCalendarDate(year=2022, week=37, weekday=2)
# unpack into `weekday` so the `day` variable is not overwritten
year, week, weekday = day.isocalendar()
next_day = dt.date.fromisocalendar(year, week, weekday + 1)
# datetime.date(2022, 9, 14)
```

Besides, the `.isocalendar()` method now returns a named tuple, `IsoCalendarDate`, instead of a regular tuple (3.9+). You can see it in the example above.
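Since the result is a named tuple, its fields can be read by name instead of by position. The sketch below does that and feeds them into the `datetime` variant of the constructor, which I expect to return midnight of that day:

```python
import datetime as dt

iso = dt.date(2022, 9, 13).isocalendar()

# Named access instead of positional unpacking (3.9+).
fields = (iso.year, iso.week, iso.weekday)

# The datetime variant of the constructor (3.8+).
moment = dt.datetime.fromisocalendar(iso.year, iso.week, iso.weekday)
```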

Contributed by: Paul Ganssle • Dong-hee Na

## fractions

The `fractions` module works with rational numbers.

It received the `Fraction.as_integer_ratio()` method to return a fraction as a `(numerator, denominator)` pair, thereby fixing the age-old shame of the usual `float` (3.8+):

```
(0.25).as_integer_ratio()
# (1, 4)
(0.5).as_integer_ratio()
# (1, 2)
(0.2).as_integer_ratio()
# (3602879701896397, 18014398509481984)
# oopsie
```

```
from fractions import Fraction
Fraction("0.2").as_integer_ratio()
# (1, 5)
# so much better
```

To be fair, `decimal.Decimal` learned to do this back in 3.6. But it’s still nice.

Contributed by: Lisa Roach • Raymond Hettinger

## functools

The `functools` module is a collection of higher-order auxiliary functions. One of them is `lru_cache()`, which caches expensive calculations:

```
import functools
import time

@functools.lru_cache(maxsize=256)
def find_user(name):
    # imitating slow search
    time.sleep(1)
    user = {"id": 11, "name": "Diane"}
    return user

find_user("Diane")
# kinda slow
find_user("Diane")
# blazingly fast
```

Previously, the decorator had to be called with parentheses, even to use the defaults. Now you can write a bare `@lru_cache` and get the default size of `128` (3.8+).

Besides, you can get the cache parameters (3.9+):

```
find_user.cache_parameters()
# {'maxsize': 256, 'typed': False}
```

If you don’t mind the memory usage, you can use the unlimited `@cache` instead of `@lru_cache` (3.9+).
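A quick sketch of both forms side by side. The `square` and `cube` functions are throwaway examples of mine:

```python
import functools

# Bare decorator, no parentheses (3.8+): default maxsize applies.
@functools.lru_cache
def square(n):
    return n * n

# Unbounded cache (3.9+).
@functools.cache
def cube(n):
    return n * n * n

square(3)  # computed and stored
square(3)  # served from the cache

info = square.cache_info()
square_params = square.cache_parameters()
cube_params = cube.cache_parameters()
```

If I read the docs right, `@cache` is the same thing as `lru_cache(maxsize=None)`, which is why its `maxsize` shows up as `None`.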

The new `@cached_property` decorator caches the computed object property (3.8+):

```
import functools
import statistics

class Dataset:
    def __init__(self, seq):
        self._data = tuple(seq)

    @functools.cached_property
    def stdev(self):
        return statistics.stdev(self._data)

dataset = Dataset(range(1_000_000))
dataset.stdev
# kinda slow
dataset.stdev
# blazingly fast
```

And `@singledispatchmethod` overloads the method depending on the type of its first argument (3.8+):

```
import functools

class Divider:
    @functools.singledispatchmethod
    def divide(self, dividend, divisor):
        raise NotImplementedError("Do not know how to divide those")

    @divide.register
    def _(self, dividend: int, divisor: int):
        return dividend // divisor

    @divide.register
    def _(self, dividend: str, divisor: int):
        # this is really stupid, I know
        newlen = len(dividend) // divisor
        return dividend[:newlen]

divider = Divider()
divider.divide(10, 2)
# 5
divider.divide("hello world", 2)
# 'hello'
```

Smells like Java to me.

Contributed by: Raymond Hettinger • Carl Meyer • Ethan Smith

## glob

The `glob` module searches for files and directories that match a pattern.

Now, thanks to the `root_dir` parameter in the `glob()` and `iglob()` functions, you can specify the root directory of the search (3.10+):

```
import glob
import os
os.getcwd()
# '/'
glob.glob("*", root_dir="/usr")
# ['local', 'share', 'bin', 'lib', 'sbin', 'src']
```

It’s a small thing, but it’s nice.

Contributed by: Serhiy Storchaka

## graphlib

The `graphlib` module works with graphs. And you know what? This is a brand-new module! (3.9+)

So far, it has only one feature: topological sorting of a graph (an ordering of vertices such that for any edge `u → v`, the vertex `u` comes before `v`):

```
from graphlib import TopologicalSorter
graph = {"Diane": {"Bob", "Cindy"}, "Cindy": {"Alice"}, "Bob": {"Alice"}}
# Alice → Bob   → Diane
#       ↳ Cindy ↗
sorter = TopologicalSorter(graph)
list(sorter.static_order())
# ['Alice', 'Cindy', 'Bob', 'Diane']
```

Contributed by: Pablo Galindo • Tim Peters • Larry Hastings

## itertools

The `itertools` module provides a variety of iterators for memory-efficient collection processing.

One of them is the `accumulate()` function, which calculates the rolling aggregate. Now it accepts the `initial` parameter, which sets the initial value (3.8+):

```
import itertools
seq = [7, 11, 19, 42]
accumulator = itertools.accumulate(seq)
list(accumulator)
# [7, 18, 37, 79]
accumulator = itertools.accumulate(seq, initial=100)
list(accumulator)
# [100, 107, 118, 137, 179]
```

And the shiny new `pairwise()` function traverses the collection and yields pairs of consecutive elements (3.10+):

```
import itertools
seq = [7, 11, 19, 42]
pairer = itertools.pairwise(seq)
list(pairer)
# [(7, 11), (11, 19), (19, 42)]
```

Contributed by: Lisa Roach • Raymond Hettinger

## math

The `math` module includes an abundance of mathematical functions.

There is a lot of news here:

- `dist()` calculates the Euclidean distance between points (3.8+);
- `perm()` and `comb()` count the number of permutations and combinations (3.8+);
- `lcm()` computes the least common multiple (3.9+);
- `gcd()` now computes the greatest common divisor for an arbitrary number of arguments (3.9+).

```
import math
math.dist((1, 1), (4, 5))
# 5.0
math.perm(5, 2)
# 20
math.comb(5, 2)
# 10
math.lcm(9, 27, 60)
# 540
math.gcd(9, 27, 60)
# 3
```

And `prod()` multiplies the sequence elements (3.8+):

```
import math
seq = range(3, 9)
math.prod(seq)
# 20160
```

Contributed by: Raymond Hettinger • Yash Aggarwal • Keller Fuchs • Serhiy Storchaka • Mark Dickinson • Ananthakrishnan • Pablo Galindo

## random

The `random` module handles random numbers.

The new `randbytes()` function generates a random byte string (3.9+):

```
import random
random.randbytes(4)
# b'\x8b\xd4\x8f\xc9'
```

Contributed by: Victor Stinner

## shlex

The `shlex` module splits a string into tokens according to the Unix command line rules.

And now it also joins the tokens back into a string, thanks to the `join()` function (3.8+):

```
import shlex
tokens = ["echo", "-n", "Python is awesome"]
shlex.join(tokens)
# "echo -n 'Python is awesome'"
```
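`join()` is meant to be the inverse of `split()`, so a round trip should give back the original tokens. A quick check:

```python
import shlex

tokens = ["echo", "-n", "Python is awesome"]

# Join into a shell-safe command line, then split it back.
command = shlex.join(tokens)
round_trip = shlex.split(command)
```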

Contributed by: Bo Bayles

## shutil

The `shutil` module works with files and directories: copies, moves and deletes them.

Copying directories has now become a little more convenient, kudos to the `dirs_exist_ok` parameter in the `copytree()` function (3.8+). If it is on, the function allows the target directory to exist:

```
from pathlib import Path
import shutil
tmp = Path("/tmp")
src = tmp.joinpath("src")
src.mkdir()
src.joinpath("src.txt").touch()
# /tmp/src
# /tmp/src/src.txt
dst = tmp.joinpath("dst")
dst.mkdir()
# /tmp/dst
shutil.copytree(src, dst)
# FileExistsError: [Errno 17] File exists: '/tmp/dst'
shutil.copytree(src, dst, dirs_exist_ok=True)
# PosixPath('/tmp/dst')
```

Contributed by: Josh Bronson

## statistics

The `statistics` module handles mathematical statistics. Like `math`, it has greatly improved in recent releases. Not `scipy` yet, but it’s not the kindergarten version Python had in 3.4.

See for yourself:

- `fmean()` computes the arithmetic mean (like `mean()`, only faster) (3.8+);
- `geometric_mean()` computes the geometric mean (3.8+);
- `multimode()` returns the modes (the most frequent values in the dataset), even if there are multiple ones (in contrast to `mode()`) (3.8+);
- `quantiles()` splits the dataset into quantiles and returns the cut points (3.8+).

```
import statistics
seq = list(range(1, 10))
statistics.fmean(seq)
# 5.0
statistics.geometric_mean(seq)
# 4.147166274396913
statistics.multimode(seq)
# [1, 2, 3, 4, 5, 6, 7, 8, 9]
statistics.multimode("python is awesome")
# ['o', ' ', 's', 'e']
statistics.quantiles(seq)
# [2.5, 5.0, 7.5]
```

`NormalDist` describes the normal distribution of a random variable (3.8+):

```
from statistics import NormalDist
birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
drug_effects = NormalDist(0.4, 0.15)
combined = birth_weights + drug_effects
round(combined.mean, 1)
# 3.1
round(combined.stdev, 1)
# 0.5
```

The module received Pearson `correlation()` and `covariance()` functions (3.10+):

```
import statistics
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [9, 8, 7, 6, 5, 4, 3, 2, 1]
statistics.correlation(x, x)
# 1.0
statistics.correlation(x, y)
# -1.0
statistics.covariance(x, x)
# 7.5
statistics.covariance(x, y)
# -7.5
```

And even the `linear_regression()` calculator (3.10+):

```
import statistics
movies_by_year = {
    2000: 371,
    2003: 507,
    2006: 608,
    2009: 520,
    2012: 669,
    2015: 708,
    2018: 873,
    2021: 403,
}
x = movies_by_year.keys()
y = movies_by_year.values()
slope, intercept = statistics.linear_regression(x, y)
year_2022 = round(slope * 2022 + intercept)
# 697
```

By the way, the `statistics` module is also famous for its excellent documentation. Check it out.

Contributed by: Raymond Hettinger • Steven D’Aprano • Timothy Wolodzko

## zoneinfo

The `zoneinfo` module provides information about time zones around the world. Another new module! (3.9+)

Before `zoneinfo` appeared, the standard library shipped a single ascetic `timezone.utc` time zone. Well, not anymore:

```
import datetime as dt
from zoneinfo import ZoneInfo
utc = dt.datetime(2022, 9, 13, hour=21, tzinfo=dt.timezone.utc)
# 2022-09-13 21:00:00+00:00
paris = utc.astimezone(ZoneInfo("Europe/Paris"))
# 2022-09-13 23:00:00+02:00
tokyo = utc.astimezone(ZoneInfo("Asia/Tokyo"))
# 2022-09-14 06:00:00+09:00
sydney = utc.astimezone(ZoneInfo("Australia/Sydney"))
# 2022-09-14 07:00:00+10:00
```

Contributed by: Paul Ganssle

## Summary

We have reviewed as many as 17 modules contributed by 27 devs, and this is without taking into account `asyncio`, `typing` and many other lower-level ones. As you can see, the standard library is actively developing. And the new features are quite reasonable. I hope you will find the described novelties useful!

I would also like to specifically thank the contributors for their amazing work:

- Carl Meyer for the `functools.cached_property()` decorator;
- Dennis Sweeney for the `str.removeprefix()` and `str.removesuffix()` methods;
- Ethan Smith for the `functools.singledispatchmethod()` decorator;
- Filipe Laíns for the `base64.b32hexencode()` and `base64.b32hexdecode()` functions;
- Lisa Roach for the `Fraction.as_integer_ratio()` method and `itertools.accumulate()` improvements;
- Niklas Fiekas for the `int.bit_count()` method;
- Pablo Galindo for the whole `graphlib` module and `math.prod()` function;
- Paul Ganssle for the whole `zoneinfo` module;
- Raymond Hettinger for lots of functions in the `statistics` module, `itertools.pairwise()` function, `key` parameter in the `bisect` module and his community work;
- Serhiy Storchaka and Yash Aggarwal for the combinatorics in the `math` module;
- Timothy Wolodzko for the `covariance()`, `correlation()`, and `linear_regression()` functions in the `statistics` module;
- Victor Stinner for the `random.randbytes()` method.
