Pure Functions: Theory and Implementation in Python

Theoretical Foundation

Mathematical Basis

Pure functions are modeled on mathematical functions. In mathematics, a function f maps each input x to exactly one output y, written y = f(x). This mapping is:

Well-defined: For any given input, there is exactly one output
Deterministic: The same input always produces the same output
Independent: The output depends only on the input parameters
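As a rough illustration of these properties in Python, the sketch below contrasts a function that reads the system clock with one that receives the current time as an argument. The names square, seconds_since_impure and seconds_since are hypothetical, chosen only for this example.

```python
import time

def square(x: int) -> int:
    # Behaves like a mathematical function: the output depends only on x.
    return x * x

def seconds_since_impure(start: float) -> float:
    # Reads the system clock, so the same `start` yields different results over time.
    return time.time() - start

def seconds_since(start: float, now: float) -> float:
    # Pure variant: the current time is passed in explicitly.
    return now - start

assert square(4) == square(4) == 16
assert seconds_since(10.0, 12.5) == 2.5
```

Passing the clock reading in as a parameter moves the non-determinism to the caller and leaves the calculation itself deterministic.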

Formal Definition

In computer science, a function f is pure if it satisfies these properties:

Referential Transparency: The function call can be replaced with its output value without changing the program's behavior
No Side Effects: The function does not modify state outside its local scope or perform I/O
Functional Dependency: The output depends only on the input parameters

Key Properties Explained

1. Referential Transparency

A function is referentially transparent if you can replace any call to the function with its result without changing the program's behavior.

# Referentially Transparent
def add(a: int, b: int) -> int:
    return a + b

result = add(5, 3)  # Can be replaced with 8
next_result = result + 2  # Equivalent to 8 + 2

# Not Referentially Transparent
def add_to_log(a: int, b: int) -> int:
    print(f"Adding {a} and {b}")  # Side effect!
    return a + b
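
One way to check the substitution claim is to capture what each expression prints. This is only a sketch: captured_output is a hypothetical helper introduced for the demonstration, and the two functions are repeated so the snippet runs on its own.

```python
import io
from contextlib import redirect_stdout
from typing import Callable

def add(a: int, b: int) -> int:
    return a + b

def add_to_log(a: int, b: int) -> int:
    print(f"Adding {a} and {b}")  # Side effect
    return a + b

def captured_output(thunk: Callable[[], int]) -> str:
    # Hypothetical helper: runs thunk() and returns whatever it printed.
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        thunk()
    return buffer.getvalue()

# Replacing add(5, 3) with 8 changes nothing observable.
assert captured_output(lambda: add(5, 3) + 2) == captured_output(lambda: 8 + 2)

# Replacing add_to_log(5, 3) with 8 silently drops the log line.
with_call = captured_output(lambda: add_to_log(5, 3) + 2)
with_value = captured_output(lambda: 8 + 2)
assert with_call == "Adding 5 and 3\n"
assert with_value == ""
```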

2. No Side Effects

Side effects include:

Modifying global variables
Modifying input parameters
I/O operations (file, network, database)
Raising exceptions (debatable in some contexts)
Printing to console

# With side effects
total = 0
def impure_add(x: int) -> int:
    global total
    total += x
    return total

# Pure alternative
def pure_add(current_total: int, x: int) -> int:
    return current_total + x
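
To see the difference in behavior, the following sketch calls both versions twice with the same argument. The functions are repeated so the snippet is self-contained; using functools.reduce to accumulate is one possible choice, not the only one.

```python
from functools import reduce

total = 0

def impure_add(x: int) -> int:
    global total
    total += x
    return total

def pure_add(current_total: int, x: int) -> int:
    return current_total + x

# Same argument, different results: the hidden global changes between calls.
first = impure_add(5)
second = impure_add(5)
assert (first, second) == (5, 10)

# Same arguments, same result, every time.
assert pure_add(0, 5) == pure_add(0, 5) == 5

# Accumulation still works by threading the total through explicitly.
running = reduce(pure_add, [5, 5, 2], 0)
assert running == 12
```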

3. Functional Dependency

The function's output should depend only on its inputs, not on any external state:

# Bad: Depends on external state
current_multiplier = 2
def impure_multiply(x: int) -> int:
    return x * current_multiplier

# Good: Only depends on inputs
def pure_multiply(x: int, multiplier: int) -> int:
    return x * multiplier
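
The hidden dependency shows up as soon as the global changes. In this sketch the functions are repeated for self-containment, and functools.partial is used as one way to fix the multiplier once without going back to global state.

```python
from functools import partial

current_multiplier = 2

def impure_multiply(x: int) -> int:
    return x * current_multiplier

def pure_multiply(x: int, multiplier: int) -> int:
    return x * multiplier

before = impure_multiply(10)
current_multiplier = 3  # Some other part of the program changes the global
after = impure_multiply(10)
assert (before, after) == (20, 30)  # Same call, different answers

# partial binds the multiplier up front; the result is still a pure function.
double = partial(pure_multiply, multiplier=2)
assert double(10) == 20
```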

Advanced Concepts

Immutability and Pure Functions

Pure functions often work best with immutable data structures. When handling mutable objects, we should return new objects rather than modify the inputs in place:

# Wrong way - modifies input
def add_tax_impure(prices: list, tax_rate: float) -> list:
    for i in range(len(prices)):
        prices[i] = prices[i] * (1 + tax_rate)
    return prices

# Right way - returns new list
def add_tax_pure(prices: list, tax_rate: float) -> list:
    return [price * (1 + tax_rate) for price in prices]
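
The practical consequence is visible from the caller's side. The sketch below repeats both functions (with slightly more specific type hints, which is my own choice) and checks what happens to the caller's list.

```python
from typing import List

def add_tax_impure(prices: List[float], tax_rate: float) -> List[float]:
    for i in range(len(prices)):
        prices[i] = prices[i] * (1 + tax_rate)
    return prices

def add_tax_pure(prices: List[float], tax_rate: float) -> List[float]:
    return [price * (1 + tax_rate) for price in prices]

original = [100.0, 200.0]

# The pure version leaves the caller's list untouched.
taxed = add_tax_pure(original, 0.5)
assert original == [100.0, 200.0]
assert taxed == [150.0, 300.0]

# The impure version returns the very same list object, now modified.
result = add_tax_impure(original, 0.5)
assert result is original
assert original == [150.0, 300.0]
```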

Higher-Order Pure Functions

Pure functions can take other functions as arguments or return functions:

from typing import Callable

def create_multiplier(factor: int) -> Callable[[int], int]:
    def multiplier(x: int) -> int:
        return x * factor
    return multiplier

double = create_multiplier(2)
triple = create_multiplier(3)

print(double(5))  # 10
print(triple(5))  # 15
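
The example above covers returning a function. For the other direction, taking functions as arguments, here is a minimal sketch. The names compose, increment and square are hypothetical and not part of the original example.

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def compose(f: Callable[[B], C], g: Callable[[A], B]) -> Callable[[A], C]:
    # Returns a function that applies g first, then f.
    def composed(x: A) -> C:
        return f(g(x))
    return composed

def increment(x: int) -> int:
    return x + 1

def square(x: int) -> int:
    return x * x

increment_then_square = compose(square, increment)
assert increment_then_square(5) == 36  # square(increment(5)) == square(6)

# Built-ins such as map also accept pure functions as arguments.
assert list(map(square, [1, 2, 3])) == [1, 4, 9]
```

As long as the functions passed in are pure, the composed function is pure as well.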

Real-World Implementation Patterns

Pattern 1: State Transformation

Instead of modifying state, return new state:

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class GameState:
    score: int
    level: int
    player_position: Tuple[int, int]

def move_player(
    state: GameState,
    movement: Tuple[int, int]
) -> GameState:
    new_position = (
        state.player_position[0] + movement[0],
        state.player_position[1] + movement[1]
    )
    return GameState(
        score=state.score,
        level=state.level,
        player_position=new_position
    )
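
When only one field changes, the standard library's dataclasses.replace can build the new state without listing every field. The add_score function below is a hypothetical addition for illustration; GameState is repeated so the snippet runs on its own.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class GameState:
    score: int
    level: int
    player_position: Tuple[int, int]

def add_score(state: GameState, points: int) -> GameState:
    # replace() copies the instance, overriding only the named fields.
    return replace(state, score=state.score + points)

start = GameState(score=0, level=1, player_position=(0, 0))
after = add_score(start, 10)

assert start.score == 0  # The original state is unchanged
assert after == GameState(score=10, level=1, player_position=(0, 0))

# frozen=True turns accidental mutation into an error.
try:
    start.score = 99
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False
assert mutation_blocked
```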

Pattern 2: Error Handling

Use return values instead of exceptions for expected error cases:

from typing import Optional, Tuple

def divide(
    a: float,
    b: float
) -> Tuple[Optional[float], Optional[str]]:
    if b == 0:
        return None, "Division by zero"
    return a / b, None

# Usage
result, error = divide(10, 2)
if error is not None:
    print(f"Error: {error}")
else:
    print(f"Result: {result}")
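
A tuple of two optionals also permits combinations such as (None, None). One possible variant, sketched here under my own naming (Ok, Err, Result, safe_divide and describe are all hypothetical), uses two small frozen dataclasses so that a result is either a value or an error, never both.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Ok:
    value: float

@dataclass(frozen=True)
class Err:
    message: str

Result = Union[Ok, Err]

def safe_divide(a: float, b: float) -> Result:
    if b == 0:
        return Err("Division by zero")
    return Ok(a / b)

def describe(result: Result) -> str:
    # Branch on the type of the result instead of checking for None.
    if isinstance(result, Ok):
        return f"Result: {result.value}"
    return f"Error: {result.message}"

assert describe(safe_divide(10, 2)) == "Result: 5.0"
assert describe(safe_divide(1, 0)) == "Error: Division by zero"
```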

Testing Pure Functions

Unit Testing Example

import unittest
from typing import List

def process_numbers(numbers: List[int]) -> List[int]:
    return [num * 2 for num in numbers if num > 0]

class TestProcessNumbers(unittest.TestCase):
    def test_process_numbers(self):
        # Test cases are simple and deterministic
        test_cases = [
            ([], []),
            ([1, 2, 3], [2, 4, 6]),
            ([-1, 0, 1], [2]),
            ([5, -3, 10], [10, 20])
        ]
        
        for input_nums, expected in test_cases:
            with self.subTest(input_nums=input_nums):
                result = process_numbers(input_nums)
                self.assertEqual(result, expected)
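
Beyond input/output tables, the purity properties themselves can be tested. The following is a sketch of what such tests might look like; the class name TestProcessNumbersPurity and its test methods are my own additions.

```python
import unittest
from typing import List

def process_numbers(numbers: List[int]) -> List[int]:
    return [num * 2 for num in numbers if num > 0]

class TestProcessNumbersPurity(unittest.TestCase):
    def test_input_is_not_mutated(self):
        numbers = [5, -3, 10]
        snapshot = list(numbers)
        process_numbers(numbers)
        self.assertEqual(numbers, snapshot)

    def test_repeated_calls_agree(self):
        numbers = [1, 2, 3]
        self.assertEqual(process_numbers(numbers), process_numbers(numbers))

    def test_returns_new_list(self):
        numbers = [1, 2, 3]
        self.assertIsNot(process_numbers(numbers), numbers)
```

These can be run with python -m unittest alongside the table-driven test above.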

Best Practices and Guidelines

Guidelines for Writing Pure Functions

Input Validation: Validate inputs at the function boundary
Return New Objects: Never modify input parameters
Single Responsibility: Each function should do one thing well
Type Hints: Use Python's type hints for better documentation
Immutable Default Arguments: Avoid mutable default arguments
Document Assumptions: State preconditions and postconditions clearly

from typing import List, Optional
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class UserProfile:
    name: str
    age: int
    last_login: datetime

def process_user_profiles(
    profiles: List[UserProfile],
    min_age: Optional[int] = None
) -> List[UserProfile]:
    """
    Return a new list of user profiles, filtered by age if specified.
    
    Args:
        profiles: List of user profiles to process
        min_age: Optional minimum age filter; None disables filtering
        
    Returns:
        A new list of the matching profiles; the input list is not modified
        
    Raises:
        ValueError: If min_age is negative
    """
    if min_age is not None and min_age < 0:
        raise ValueError("min_age must be non-negative")
        
    filtered_profiles = (
        [p for p in profiles if p.age >= min_age]
        if min_age is not None
        else profiles.copy()
    )
    
    return filtered_profiles
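
The guideline about mutable default arguments is easy to overlook, so here is a short sketch of the pitfall. Both function names are hypothetical. A default list is created once, when the function is defined, and is then shared by every call that omits the argument.

```python
from typing import List, Optional

def append_tag_impure(tag: str, tags: List[str] = []) -> List[str]:
    # The default list is shared across calls and mutated in place.
    tags.append(tag)
    return tags

def append_tag_pure(tag: str, tags: Optional[List[str]] = None) -> List[str]:
    existing = [] if tags is None else tags
    return existing + [tag]  # Builds a new list; the input is left alone

# State leaks from one call into the next.
assert append_tag_impure("a") == ["a"]
assert append_tag_impure("b") == ["a", "b"]

# Each call starts fresh.
assert append_tag_pure("a") == ["a"]
assert append_tag_pure("b") == ["b"]
```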

Conclusion

Pure functions are fundamental to functional programming and provide numerous benefits:

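Easier testing and debugging
Better code organization
Improved maintainability
Thread safety, because no shared state is mutated
Cacheable results, because equal inputs always give equal outputs
Simpler reasoning about program behavior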
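
Caching is a quick way to see one of these benefits in practice. The sketch below uses functools.lru_cache from the standard library on a hypothetical fib function. Memoizing it is safe only because the function is pure; a cached impure function could return stale results.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    # Pure: the result depends only on n, so it can be stored and reused.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

assert fib(30) == 832040

# Each of fib(0) through fib(30) was computed exactly once.
info = fib.cache_info()
assert info.misses == 31
assert info.currsize == 31
```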

While it's not always possible or practical to make every function pure, striving for purity where possible leads to more robust and maintainable code.

abalmasov's blog

Nov 21, 2024

On benefits of pure functions