These are my personal notes from reading Fluent Python. They’re a bit scattered, but I’ve found them useful for understanding some of Python’s more advanced features, like decorators, generators, and defensive programming.
I’m sharing them here in case they’re helpful to others working through similar topics or if you’re considering reading the book.
Unpacking Arguments in Python
Another example of unpacking is prefixing an argument with * when calling a function:
>>> divmod(20, 8)
(2, 4)
>>> t = (20, 8)
>>> divmod(*t)
(2, 4)
>>> quotient, remainder = divmod(*t)
>>> quotient, remainder
(2, 4)
The preceding code shows another use of unpacking: allowing functions to return multiple values in a way that is convenient to the caller. For example, the os.path.split() function builds a tuple (path, last_part) from a filesystem path.
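As a quick illustration (the path here is just one I made up), splitting a path and unpacking the result looks like this:
>>> import os
>>> path, filename = os.path.split('/home/user/notes/fluent.txt')
>>> path
'/home/user/notes'
>>> filename
'fluent.txt'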
Visualization Tools
Online Python Tutor is an excellent tool for visualizing how Python works in detail. For example, it can show the initial and final states of a tuple after operations.
list.sort Versus the sorted Built-In
The list.sort method sorts a list in place, meaning it does not make a copy. It returns None to remind us that it changes the receiver and does not create a new list.
- Similar behavior: random.shuffle(s) shuffles the mutable sequence s in place and returns None.
- Drawback: since these methods return None, you cannot cascade calls to them.
In contrast, the built-in function sorted creates a new list and returns it. It accepts any iterable as an argument, including immutable sequences and generators.
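A short session (with a small list of my own) makes the contrast concrete:
>>> nums = [3, 1, 2]
>>> sorted(nums)       # new list; nums is untouched
[1, 2, 3]
>>> nums
[3, 1, 2]
>>> nums.sort()        # sorts in place and returns None, so the REPL shows nothing
>>> nums
[1, 2, 3]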
Optional Arguments
- reverse: If True, items are returned in descending order. The default is False.
- key: A one-argument function applied to each item to produce its sorting key. For example, key=str.lower for case-insensitive sorting, or key=len to sort strings by length.
The key argument is also available for the min() and max() built-ins, as well as for other functions like itertools.groupby() and heapq.nlargest().
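For instance, with a small word list of my own:
>>> words = ['banana', 'Apple', 'fig']
>>> sorted(words, key=str.lower)
['Apple', 'banana', 'fig']
>>> sorted(words, key=len, reverse=True)
['banana', 'Apple', 'fig']
>>> min(words, key=len)
'fig'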
Efficient Searching with bisect
Once your sequences are sorted, they can be efficiently searched using a binary search algorithm provided by the bisect module in Python’s standard library. The module also includes bisect.insort to ensure that your sorted sequences remain sorted.
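A minimal sketch of both functions, using a made-up list of scores:
>>> import bisect
>>> scores = [10, 20, 30, 40]
>>> bisect.bisect(scores, 25)   # index where 25 would be inserted
2
>>> bisect.insort(scores, 25)   # insert while keeping the list sorted
>>> scores
[10, 20, 25, 30, 40]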
Alternatives to Lists and Tuples
While the list type is flexible and easy to use, there are better options depending on specific requirements:
- array: Saves memory when handling millions of floating-point values.
- deque: A double-ended queue, more efficient than a list for adding and removing items from both ends (for example, when used as a FIFO).
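A brief sketch of both, with small made-up values:
>>> from array import array
>>> floats = array('d', [0.1, 0.2, 0.3])   # typecode 'd': double-precision floats
>>> floats[1]
0.2
>>> from collections import deque
>>> dq = deque([1, 2, 3])
>>> dq.appendleft(0)    # cheap at either end
>>> dq.append(4)
>>> dq
deque([0, 1, 2, 3, 4])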
Dict Comprehensions
Since Python 2.7, the syntax of list comprehensions (listcomps) and generator expressions (genexps) has been adapted to dict comprehensions (dictcomps).
Example of dictcomps used to build two dictionaries from the same list of tuples:
>>> dial_codes = [
... (880, 'Bangladesh'),
... (55, 'Brazil'),
... (86, 'China'),
... (91, 'India'),
... (62, 'Indonesia'),
... (81, 'Japan'),
... (234, 'Nigeria'),
... (92, 'Pakistan'),
... (7, 'Russia'),
... (1, 'United States'),
... ]
>>> country_dial = {country: code for code, country in dial_codes}
>>> country_dial
{'Bangladesh': 880, 'Brazil': 55, 'China': 86, 'India': 91, 'Indonesia': 62,
'Japan': 81, 'Nigeria': 234, 'Pakistan': 92, 'Russia': 7, 'United States': 1}
>>> {code: country.upper()
... for country, code in sorted(country_dial.items())
... if code < 70}
{55: 'BRAZIL', 62: 'INDONESIA', 7: 'RUSSIA', 1: 'UNITED STATES'}
Dictcomps allow for concise and readable dictionary creation, especially if you’re already familiar with list comprehensions.
Hashable Objects
An object is hashable if it has a hash code that never changes during its lifetime (it needs a __hash__() method) and can be compared to other objects (it needs an __eq__() method). Hashable objects that compare equal must have the same hash code.
- Numeric types and flat immutable types like str and bytes are hashable.
- Container types are hashable if they are immutable and all contained objects are also hashable.
- A frozenset is always hashable because every element it contains must be hashable.
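A quick way to see this is to try hash() on a tuple with and without a mutable element:
>>> tt = (1, 2, (30, 40))
>>> hash(tt) == hash((1, 2, (30, 40)))   # equal tuples have equal hash codes
True
>>> tl = (1, 2, [30, 40])
>>> hash(tl)                             # a tuple containing a list is not hashable
Traceback (most recent call last):
  ...
TypeError: unhashable type: 'list'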
Using Sets
A set is a collection of unique objects. A basic use case is removing duplicates:
>>> l = ['spam', 'spam', 'eggs', 'spam', 'bacon', 'eggs']
>>> set(l)
{'eggs', 'spam', 'bacon'}
>>> list(set(l))
['eggs', 'spam', 'bacon']
To remove duplicates while preserving the order of the first occurrence of each item:
>>> list(dict.fromkeys(l).keys())
['spam', 'eggs', 'bacon']
Set elements must be hashable. The set type itself is not hashable, so you can’t build a set with nested set instances. However, frozenset is hashable, so you can have frozenset elements inside a set.
The set types also implement many set operations as infix operators. For example, given two sets a and b: a | b returns their union, a & b computes the intersection, a - b the difference, and a ^ b the symmetric difference.
Smart use of set operations can reduce both the line count and execution time of Python programs, while making the code easier to read and reason about.
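For instance, counting how many items from one collection also appear in another takes a single expression (the variable names here are just illustrative):
>>> haystack = {'apple', 'banana', 'cherry', 'fig'}
>>> needles = {'banana', 'fig', 'kiwi'}
>>> len(needles & haystack)   # how many needles occur in the haystack
2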
Type Annotations
Type annotations are written as described in PEP 526. They make code more readable, and the class syntax used with typing.NamedTuple also makes it easy to override or add methods. For example:
from typing import NamedTuple

class Coordinate(NamedTuple):
    lat: float
    lon: float

    def __str__(self):
        ns = 'N' if self.lat >= 0 else 'S'
        we = 'E' if self.lon >= 0 else 'W'
        return f'{abs(self.lat):.1f}°{ns}, {abs(self.lon):.1f}°{we}'
Note: Python does not enforce type hints at runtime.
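Used in the REPL (the coordinates here are just an example), this behaves like:
>>> trondheim = Coordinate(63.4, 10.4)
>>> print(trondheim)
63.4°N, 10.4°E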
Variable Management in Python
Python variables are more like labels attached to objects than boxes containing values, as in some other languages, because Python variables are references to objects. The term “assign” can therefore be misleading; “bind” is more appropriate: in Python, a variable is bound to an object.
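The label metaphor is easy to see with two variables bound to the same list (a toy example):
>>> a = [1, 2, 3]
>>> b = a            # b is a second label on the same object, not a copy
>>> b.append(4)
>>> a
[1, 2, 3, 4]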
== Versus is
- == compares the values of objects (the data they hold).
- is compares their identities (memory location).
In general, == is used more often because we care more about values. However, when comparing a variable to a singleton (like None), it is recommended to use is:
x is None
x is not None
Caution: other uses of is are often incorrect. If unsure, use ==.
Usually we are more interested in object equality than identity.
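A small session shows the difference:
>>> a = [1, 2, 3]
>>> b = [1, 2, 3]
>>> a == b    # same value
True
>>> a is b    # different objects
False
>>> c = None
>>> c is None
True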
del and Garbage Collection
del is a statement, not a function; it deletes a reference to an object, not the object itself. Python’s garbage collector may discard an object from memory if the deleted variable held the last reference to it.
In CPython, garbage collection is primarily done by reference counting.
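A toy example of the reference-counting behavior:
>>> a = [1, 2]
>>> b = a            # the list now has two references
>>> del a            # removes one reference; the object survives
>>> b
[1, 2]
>>> b = 'something else'   # no references remain, so the list can now be reclaimed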
Functions as First-Class Objects
In Python, functions are first-class objects. Programming language researchers define a “first-class object” as a program entity that can be:
- Created at runtime
- Assigned to a variable or element in a data structure
- Passed as an argument to a function
- Returned as the result of a function
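A quick demonstration with an ordinary function, in the spirit of the book’s factorial example:
>>> def factorial(n):
...     return 1 if n < 2 else n * factorial(n - 1)
...
>>> fact = factorial                  # assigned to a variable
>>> fact(5)
120
>>> list(map(factorial, range(6)))    # passed as an argument to another function
[1, 1, 2, 6, 24, 120]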
Anonymous Functions
The lambda keyword creates an anonymous function within a Python expression. However, Python’s simple syntax limits the body of a lambda to a pure expression: it cannot contain statements such as while, try, etc. Assignment with = is also a statement, so it cannot occur in a lambda. The newer assignment expression syntax := can be used in a lambda, but if you need it, your lambda is probably too complex and should be refactored into a regular function using def.
Best Use of Anonymous Functions
Anonymous functions are most useful in the context of an argument list for a higher-order function. For instance, consider the following example that sorts a list of words by their reversed spelling using lambda:
>>> fruits = ['strawberry', 'fig', 'apple', 'cherry', 'raspberry', 'banana']
>>> sorted(fruits, key=lambda word: word[::-1])
['banana', 'apple', 'fig', 'raspberry', 'strawberry', 'cherry']
Outside the context of arguments to higher-order functions, anonymous functions are rarely useful in Python. The syntactic restrictions tend to make nontrivial lambdas either unreadable or unworkable. If a lambda is hard to read, consider refactoring it using Fredrik Lundh’s advice:
- Write a comment explaining what the lambda does.
- Study the comment and think of a name that captures the essence of it.
- Convert the lambda to a def statement, using that name.
- Remove the comment.
These steps are quoted from the “Functional Programming HOWTO,” which is a recommended read.
The Nine Flavors of Callable Objects
Python offers various callable objects, including:
- User-defined functions: Created with def statements or lambda expressions.
- Built-in functions: Functions implemented in C (for CPython), like len or time.strftime.
- Built-in methods: Methods implemented in C, like dict.get.
- Methods: Functions defined in the body of a class.
- Classes: When invoked, a class runs its __new__ method to create an instance, followed by __init__ to initialize it, and returns the instance to the caller.
- Class instances: If a class defines a __call__ method, its instances may be invoked as functions (see the sketch after this list).
- Generator functions: Functions or methods that use the yield keyword in their body. When called, they return a generator object.
- Native coroutine functions: Functions or methods defined with async def. When called, they return a coroutine object.
- Asynchronous generator functions: Functions or methods defined with async def that use yield in their body. When called, they return an asynchronous generator object.
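As a minimal sketch of a callable instance (simplified from the book’s BingoCage idea; the class name here is my own):
import random

class LotteryPicker:
    """Pick items at random from a pool until it is empty."""
    def __init__(self, items):
        self._items = list(items)     # copy, so we don't mutate the caller's data
        random.shuffle(self._items)

    def __call__(self):
        # calling the instance delegates to this method
        return self._items.pop()

picker = LotteryPicker(range(5))
print(picker())           # e.g. 3
print(callable(picker))   # True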
About Gradual Typing
PEP 484 introduced a gradual type system to Python. Other languages with gradual type systems include Microsoft’s TypeScript, Dart, and Hack. The Mypy type checker began as a gradually typed dialect of Python before becoming a tool for checking annotated Python code. A gradual type system is characterized by the following:
- Optional: Type hints are optional, and the type checker assumes the Any type for code without type hints.
- No Runtime Enforcement: Type hints do not prevent type errors at runtime; they are used by static type checkers, linters, and IDEs to raise warnings.
- No Performance Enhancement: Type annotations do not improve performance in current Python runtimes.
Seeking 100% coverage of type hints might lead to type hinting without proper thought, merely to satisfy the metric.
Duck Typing vs. Nominal Typing
Duck typing: A dynamic approach where objects have types, but variables are untyped. The operations an object supports are more important than its declared type. For instance, if birdie.quack() works, then birdie is considered a duck. This flexibility comes at the cost of allowing more runtime errors.
Nominal typing: Used in languages like C++, Java, and C#, and supported by annotated Python. Objects and variables have types. If Duck is a subclass of Bird, you can assign a Duck instance to a parameter annotated as birdie: Bird, but a call to birdie.quack() would be flagged as illegal by a type checker, because Bird does not provide a .quack() method. Nominal typing catches errors earlier in development but is less flexible.
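A minimal sketch along the lines of the book’s birds example:
class Bird:
    pass

class Duck(Bird):
    def quack(self):
        print('Quack!')

def alert(birdie: Bird) -> None:
    # A static type checker such as Mypy flags this call,
    # because Bird has no .quack() method, even though a Duck works at runtime.
    birdie.quack()

alert(Duck())   # runs fine at runtime; the error is only caught by static analysis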
The Cognitive Effect of Typing
Type hints can benefit API users, but overly strict enforcement might discourage the creation of powerful, concise functions like Python’s max(). Type hints for such functions can be complex, possibly discouraging future developers from writing similarly powerful functions.
Decorators 101
A decorator is a callable that takes another function as an argument and returns it, or replaces it with a different function or callable object. Decorators run immediately after the decorated function is defined, usually at import time.
Example of a basic decorator:
def decorate(func):
    # Perform processing
    return func

@decorate
def target():
    print('running target()')
This is equivalent to:
def target():
    print('running target()')

target = decorate(target)
Three key points about decorators:
- A decorator is a function or another callable.
- A decorator may replace the decorated function with a different one.
- Decorators are executed immediately when a module is loaded.
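To illustrate the second point, here is a sketch of a decorator that actually replaces the decorated function with a wrapper (the names announce and add are my own):
import functools

def announce(func):
    @functools.wraps(func)            # preserve the original name and docstring
    def wrapper(*args, **kwargs):
        print(f'calling {func.__name__}...')
        return func(*args, **kwargs)
    return wrapper                    # the decorated name is now bound to wrapper

@announce
def add(a, b):
    return a + b

print(add(2, 3))   # prints "calling add..." and then 5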
Memoization with functools.cache
The functools.cache decorator implements memoization, optimizing functions by caching the results of previous invocations. For example, using @cache with a recursive Fibonacci function significantly improves performance:
import functools

@functools.cache
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n - 2) + fibonacci(n - 1)
This decorator ensures that each value of n is computed only once, avoiding redundant calculations.
Conformity to patterns is not a measure of goodness.
—Ralph Johnson, coauthor of the Design Patterns classic
Object Representations
Every object-oriented language has at least one standard way of getting a string representation from any object. Python has two:
- repr(): Returns a string representing the object as the developer wants to see it. It’s what you get when the Python console or a debugger shows an object.
- str(): Returns a string representing the object as the user wants to see it. It’s what you get when you print() an object.
The special methods __repr__ and __str__ support repr() and str().
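A small sketch of a class providing both representations (the class is a made-up example):
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # unambiguous, for developers; ideally looks like the constructor call
        return f'Temperature({self.celsius!r})'

    def __str__(self):
        # friendly, for users
        return f'{self.celsius}°C'

t = Temperature(21.5)
print(repr(t))   # Temperature(21.5)
print(t)         # 21.5°C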
Monkey Patching: Implementing a Protocol at Runtime
Monkey patching is dynamically changing a module, class, or function at runtime to add features or fix bugs. For example, the gevent networking library monkey patches parts of Python’s standard library to allow lightweight concurrency without threads or async/await.
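As a toy sketch (the class and function here are made up for illustration), a method can be attached to a class at runtime:
class Greeter:
    def __init__(self, name):
        self.name = name

def shout(self):
    # defined outside the class; 'self' is just a conventional parameter name
    return f'{self.name.upper()}!'

Greeter.shout = shout             # monkey patch: add the method at runtime
print(Greeter('ada').shout())     # ADA!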
Defensive Programming and “Fail Fast”
Defensive programming is like defensive driving: a set of practices to enhance safety even when faced with careless programmers—or drivers. Many bugs cannot be caught except at runtime—even in mainstream statically typed languages. In a dynamically typed language, “fail fast” is excellent advice for safer and easier-to-maintain programs. Failing fast means raising runtime errors as soon as possible.
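One common way to fail fast is to validate or convert arguments as soon as they are received, so a bad argument blows up at the call site instead of much later (a made-up sketch):
def average(numbers):
    values = list(numbers)   # fail fast: a non-iterable argument raises TypeError here
    if not values:
        raise ValueError('average() requires at least one number')
    return sum(values) / len(values)

print(average([1, 2, 3]))   # 2.0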
Iteration is Fundamental to Data Processing
Iteration is fundamental to data processing: programs apply computations to data series, from pixels to nucleotides. If the data doesn’t fit in memory, we need to fetch the items lazily—one at a time and on demand. That’s what an iterator does. This chapter shows how the Iterator design pattern is built into the Python language so you never need to code it by hand.
Every standard collection in Python is iterable. An iterable is an object that provides an iterator, which Python uses to support operations like:
- for loops
- List, dict, and set comprehensions
- Unpacking assignments
- Construction of collection instances
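Under the hood, all of these use iter() and next(); for example:
>>> s = 'abc'
>>> it = iter(s)     # the iterable provides an iterator
>>> next(it)
'a'
>>> next(it)
'b'
>>> next(it)
'c'
>>> next(it)
Traceback (most recent call last):
  ...
StopIteration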
How a Generator Works
Any Python function that has the yield keyword in its body is a generator function: a function which, when called, returns a generator object. In other words, a generator function is a generator factory. The only syntax distinguishing a plain function from a generator function is the fact that the latter has a yield keyword somewhere in its body.
I find it helpful to be rigorous when talking about values obtained from a generator. It’s confusing to say a generator “returns” values. Functions return values. Calling a generator function returns a generator. A generator yields values. A generator doesn’t “return” values in the usual way: the return statement in the body of a generator function causes StopIteration to be raised by the generator object. If you return x in the generator, the caller can retrieve the value of x from the StopIteration exception, but usually that is done automatically using the yield from syntax.
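For example:
>>> def gen_12():
...     yield 1
...     yield 2
...     return 'done'     # becomes the value of StopIteration
...
>>> g = gen_12()
>>> next(g)
1
>>> next(g)
2
>>> next(g)
Traceback (most recent call last):
  ...
StopIteration: done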
Lazy vs. Eager Evaluation in Iterators
The Iterator interface is designed to be lazy: next(my_iterator) yields one item at a time. The opposite of lazy is eager: lazy evaluation and eager evaluation are technical terms in programming language theory.
Our Sentence implementations so far have not been lazy because the __init__ eagerly builds a list of all words in the text, binding it to the self.words attribute. This requires processing the entire text, and the list may use as much memory as the text itself (probably more; it depends on how many nonword characters are in the text). Most of this work will be in vain if the user only iterates over the first couple of words. If you wonder, “Is there a lazy way of doing this in Python?” the answer is often “Yes.”
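A sketch roughly in the spirit of the book’s lazy implementation, using re.finditer to yield words on demand instead of building self.words up front:
import re

RE_WORD = re.compile(r'\w+')

class Sentence:
    def __init__(self, text):
        self.text = text          # no eager word list

    def __iter__(self):
        # re.finditer returns a lazy iterator of matches,
        # so words are produced one at a time, on demand
        for match in RE_WORD.finditer(self.text):
            yield match.group()

for word in Sentence('The time has come'):
    print(word)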
Contrasting Iterators and Generators
In the official Python documentation and codebase, the terminology around iterators and generators is inconsistent and evolving. I’ve adopted the following definitions:
Iterator: General term for any object that implements a __next__ method. Iterators are designed to produce data that is consumed by the client code, i.e., the code that drives the iterator via a for loop or other iterative feature, or by explicitly calling next(it) on the iterator (although this explicit usage is much less common). In practice, most iterators we use in Python are generators.
Generator: An iterator built by the Python compiler. To create a generator, we don’t implement __next__. Instead, we use the yield keyword to make a generator function, which is a factory of generator objects. A generator expression is another way to build a generator object. Generator objects provide __next__, so they are iterators. Since Python 3.6, we also have asynchronous generators declared with async def.
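To make the contrast concrete, here is a hand-written iterator next to an equivalent generator function (a toy countdown of my own, not from the book):
class CountdownIterator:
    """Classic iterator: we implement __iter__ and __next__ by hand."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

def countdown(start):
    """Generator: the compiler builds __next__ for us."""
    while start > 0:
        yield start
        start -= 1

print(list(CountdownIterator(3)))   # [3, 2, 1]
print(list(countdown(3)))           # [3, 2, 1]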
Context Managers
Context manager objects exist to control a with statement, just like iterators exist to control a for statement.
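A minimal sketch of a context manager class (a made-up example; __enter__ and __exit__ are the protocol methods that with uses):
import time

class Timer:
    """Measure how long the with block takes."""
    def __enter__(self):
        self.start = time.perf_counter()
        return self                      # bound to the 'as' target, if any

    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False                     # falsy return value lets exceptions propagate

with Timer() as t:
    total = sum(range(1_000_000))

print(f'{t.elapsed:.4f} seconds')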