Nov 21, 2024

On benefits of pure functions

Pure Functions: Theory and Implementation in Python

Theoretical Foundation

Mathematical Basis

Pure functions are rooted in mathematical concepts, specifically in the idea of mathematical functions. In mathematics, a function f(x) maps each input x to exactly one output y, written as y = f(x). This mapping is:

  • Well-defined: For any given input, there is exactly one output
  • Deterministic: The same input always produces the same output
  • Independent: The output depends only on the input parameters

Formal Definition

In computer science, a function f is pure if it satisfies these properties:

  1. Referential Transparency: The function call can be replaced with its output value without changing the program's behavior
  2. No Side Effects: The function does not modify state outside its local scope or perform I/O
  3. Functional Dependency: The output depends only on the input parameters

Key Properties Explained

1. Referential Transparency

A function is referentially transparent if you can replace any call to the function with its result without changing the program's behavior.

# Referentially Transparent
def add(a: int, b: int) -> int:
    return a + b

result = add(5, 3)  # Can be replaced with 8
next_result = result + 2  # Equivalent to 8 + 2

# Not Referentially Transparent
def add_to_log(a: int, b: int) -> int:
    print(f"Adding {a} and {b}")  # Side effect!
    return a + b
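One way to make the difference observable is to capture standard output and compare a call against its substituted value. The following is a minimal sketch; the helper `captured_output` is a hypothetical name introduced for illustration and is not part of the original examples.

```python
import io
from contextlib import redirect_stdout
from typing import Callable, Tuple

def add(a: int, b: int) -> int:
    return a + b

def add_to_log(a: int, b: int) -> int:
    print(f"Adding {a} and {b}")  # Side effect: writes to stdout
    return a + b

def captured_output(thunk: Callable[[], int]) -> Tuple[int, str]:
    """Hypothetical helper: run thunk, return (result, text it printed)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        value = thunk()
    return value, buffer.getvalue()

# Substituting the literal 8 for add(5, 3) changes nothing observable.
pure_call = captured_output(lambda: add(5, 3) + 2)
pure_literal = captured_output(lambda: 8 + 2)

# Substituting 8 for add_to_log(5, 3) drops the printed line.
impure_call = captured_output(lambda: add_to_log(5, 3) + 2)
impure_literal = captured_output(lambda: 8 + 2)
```

Here `pure_call` and `pure_literal` are identical, while `impure_call` carries printed text that `impure_literal` lacks, so the substitution changed the program's behavior.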

2. No Side Effects

Side effects include:

  • Modifying global variables
  • Modifying input parameters
  • I/O operations (file, network, database)
  • Raising exceptions (debatable in some contexts)
  • Printing to console

# With side effects
total = 0
def impure_add(x: int) -> int:
    global total
    total += x
    return total

# Pure alternative
def pure_add(current_total: int, x: int) -> int:
    return current_total + x
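With the pure version, the caller carries the running total explicitly. As a rough sketch of what that can look like, the standard library's `functools.reduce` and `itertools.accumulate` can thread the state through `pure_add`; the `deposits` list is made-up sample data, and `accumulate(..., initial=...)` assumes Python 3.8 or later.

```python
from functools import reduce
from itertools import accumulate

def pure_add(current_total: int, x: int) -> int:
    return current_total + x

deposits = [5, 10, 20]  # Hypothetical sample data

# Fold the list into one total; no variable outside the call is touched.
final_total = reduce(pure_add, deposits, 0)

# Keep every intermediate total as well, starting from 0.
running_totals = list(accumulate(deposits, pure_add, initial=0))
```

Because `pure_add` never reads or writes hidden state, rerunning either line with the same `deposits` gives the same answer.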

3. Functional Dependency

The function's output should depend only on its inputs, not on any external state:

# Bad: Depends on external state
current_multiplier = 2
def impure_multiply(x: int) -> int:
    return x * current_multiplier

# Good: Only depends on inputs
def pure_multiply(x: int, multiplier: int) -> int:
    return x * multiplier
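Passing the multiplier on every call can feel repetitive. One possible way around that, sketched here with `functools.partial`, is to bind the argument once while keeping the dependency explicit. The name `double_value` is an assumption made for this example.

```python
from functools import partial

def pure_multiply(x: int, multiplier: int) -> int:
    return x * multiplier

# Bind multiplier=2 once; the result is still a function of its inputs only.
double_value = partial(pure_multiply, multiplier=2)

doubled = double_value(21)
doubled_list = [double_value(n) for n in (1, 2, 3)]
```

Unlike `impure_multiply`, nothing that happens later in the program can change what `double_value` returns for a given input.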

Advanced Concepts

Immutability and Pure Functions

Pure functions often work best with immutable data structures. When handling mutable objects, we should return a new object rather than modify the input:

# Wrong way - modifies input
def add_tax_impure(prices: list, tax_rate: float) -> list:
    for i in range(len(prices)):
        prices[i] = prices[i] * (1 + tax_rate)
    return prices

# Right way - returns new list
def add_tax_pure(prices: list, tax_rate: float) -> list:
    return [price * (1 + tax_rate) for price in prices]
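A stricter variant, shown here only as a sketch, accepts any sequence and returns a tuple, so neither the input nor the result can be modified by accident. The function name `add_tax_to_tuple` and the sample prices are assumptions for illustration.

```python
from typing import Sequence, Tuple

def add_tax_to_tuple(
    prices: Sequence[float],
    tax_rate: float
) -> Tuple[float, ...]:
    # Build a brand-new tuple; the caller's sequence is only read.
    return tuple(price * (1 + tax_rate) for price in prices)

base_prices = (100.0, 200.0)  # Hypothetical sample data
taxed = add_tax_to_tuple(base_prices, 0.5)
```

Trying to assign to `base_prices[0]` or `taxed[0]` raises `TypeError`, so in-place modification fails loudly instead of passing unnoticed.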

Higher-Order Pure Functions

Pure functions can take other functions as arguments or return functions:

from typing import Callable

def create_multiplier(factor: int) -> Callable[[int], int]:
    def multiplier(x: int) -> int:
        return x * factor
    return multiplier

double = create_multiplier(2)
triple = create_multiplier(3)

print(double(5))  # 10
print(triple(5))  # 15
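Functions built this way can also be combined. The sketch below adds a hypothetical `compose` helper (not part of the original example) that takes two pure functions and returns a third.

```python
from typing import Callable

IntFunction = Callable[[int], int]

def create_multiplier(factor: int) -> IntFunction:
    def multiplier(x: int) -> int:
        return x * factor
    return multiplier

def compose(f: IntFunction, g: IntFunction) -> IntFunction:
    """Hypothetical helper: return a function that applies g, then f."""
    def composed(x: int) -> int:
        return f(g(x))
    return composed

double = create_multiplier(2)
triple = create_multiplier(3)
times_six = compose(double, triple)

result = times_six(5)  # double(triple(5))
```

Since `double` and `triple` are pure, `times_six` is pure as well: composing pure functions never introduces a side effect.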

Real-World Implementation Patterns

Pattern 1: State Transformation

Instead of modifying state, return new state:

from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class GameState:
    score: int
    level: int
    player_position: Tuple[int, int]

def move_player(
    state: GameState,
    movement: Tuple[int, int]
) -> GameState:
    new_position = (
        state.player_position[0] + movement[0],
        state.player_position[1] + movement[1]
    )
    return GameState(
        score=state.score,
        level=state.level,
        player_position=new_position
    )
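When a state object has many fields, copying each one by hand gets error-prone. As an alternative sketch, the standard library's `dataclasses.replace` copies a frozen instance and overrides only the named fields. The `start` and `moved` values are sample data.

```python
from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class GameState:
    score: int
    level: int
    player_position: Tuple[int, int]

def move_player(state: GameState, movement: Tuple[int, int]) -> GameState:
    x, y = state.player_position
    dx, dy = movement
    # replace() copies every field and overrides only player_position.
    return replace(state, player_position=(x + dx, y + dy))

start = GameState(score=0, level=1, player_position=(0, 0))
moved = move_player(start, (2, -1))
```

`start` is left untouched, and any attempt to assign to one of its fields raises `dataclasses.FrozenInstanceError`.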

Pattern 2: Error Handling

Using return values instead of exceptions for expected error cases:

from typing import Tuple, Optional

def divide(
    a: float,
    b: float
) -> Tuple[Optional[float], Optional[str]]:
    if b == 0:
        return None, "Division by zero"
    return a / b, None

# Usage
result, error = divide(10, 2)
if error is not None:
    print(f"Error: {error}")
else:
    print(f"Result: {result}")
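Error values returned this way have to be passed along by each caller. The sketch below shows one way a second function might propagate them; `average_speed` and the `Result` alias are hypothetical names introduced for this example.

```python
from typing import Optional, Tuple

# Hypothetical alias for the (value, error) convention used above.
Result = Tuple[Optional[float], Optional[str]]

def divide(a: float, b: float) -> Result:
    if b == 0:
        return None, "Division by zero"
    return a / b, None

def average_speed(distance: float, hours: float) -> Result:
    """Hypothetical caller that forwards the error instead of raising."""
    speed, error = divide(distance, hours)
    if error is not None:
        return None, f"Cannot compute speed: {error}"
    return speed, None

ok_result = average_speed(120, 2)
failed_result = average_speed(120, 0)
```

Both outcomes are ordinary return values, so the caller can test them with plain equality checks and no `try`/`except`.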

Testing Pure Functions

Unit Testing Example

import unittest
from typing import List

def process_numbers(numbers: List[int]) -> List[int]:
    return [num * 2 for num in numbers if num > 0]

class TestProcessNumbers(unittest.TestCase):
    def test_process_numbers(self):
        # Test cases are simple and deterministic
        test_cases = [
            ([], []),
            ([1, 2, 3], [2, 4, 6]),
            ([-1, 0, 1], [2]),
            ([5, -3, 10], [10, 20])
        ]
        
        for input_nums, expected in test_cases:
            with self.subTest(input_nums=input_nums):
                result = process_numbers(input_nums)
                self.assertEqual(result, expected)
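Beyond checking specific outputs, a test can check the purity properties themselves. The following sketch (the class name `TestProcessNumbersPurity` is an assumption) asserts that repeated calls agree and that the input list is left unchanged. It runs the suite programmatically so the snippet works when pasted into a script.

```python
import unittest
from typing import List

def process_numbers(numbers: List[int]) -> List[int]:
    return [num * 2 for num in numbers if num > 0]

class TestProcessNumbersPurity(unittest.TestCase):
    def test_same_input_same_output(self):
        data = [3, -1, 4]
        self.assertEqual(process_numbers(data), process_numbers(data))

    def test_input_is_not_mutated(self):
        data = [3, -1, 4]
        snapshot = list(data)  # Copy taken before the call
        process_numbers(data)
        self.assertEqual(data, snapshot)

# Load and run the two tests without calling unittest.main().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(
    TestProcessNumbersPurity
)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Neither check can prove purity in general, but together they catch the most common slips: hidden state and in-place modification.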

Best Practices and Guidelines

Guidelines for Writing Pure Functions

  1. Input Validation: Validate inputs at the function boundary
  2. Return New Objects: Never modify input parameters
  3. Single Responsibility: Each function should do one thing well
  4. Type Hints: Use Python's type hints for better documentation
  5. Immutable Default Arguments: Avoid mutable default arguments
  6. Document Assumptions: Clear documentation of preconditions and postconditions

from typing import List, Optional
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class UserProfile:
    name: str
    age: int
    last_login: datetime

def process_user_profiles(
    profiles: List[UserProfile],
    min_age: Optional[int] = None
) -> List[UserProfile]:
    """
    Process user profiles and filter by age if specified.
    
    Args:
        profiles: List of user profiles to process
        min_age: Optional minimum age filter
        
    Returns:
        A new list of the matching profiles; the input list is not modified
        
    Raises:
        ValueError: If min_age is negative
    """
    if min_age is not None and min_age < 0:
        raise ValueError("min_age must be non-negative")
        
    filtered_profiles = (
        [p for p in profiles if p.age >= min_age]
        if min_age is not None
        else profiles.copy()
    )
    
    return filtered_profiles
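Guideline 5 is worth a short illustration, because a mutable default quietly turns a function into a stateful one. In this sketch, `append_tag_buggy` and `append_tag` are hypothetical functions written for the comparison.

```python
from typing import List, Optional

def append_tag_buggy(tag: str, tags: List[str] = []) -> List[str]:
    # The default list is created once and shared by every call.
    tags.append(tag)
    return tags

def append_tag(tag: str, tags: Optional[List[str]] = None) -> List[str]:
    # Build a new list each time; neither default nor input is mutated.
    return [*(tags or []), tag]

# Copy the buggy results so each snapshot shows the list at that moment.
buggy_first = list(append_tag_buggy("a"))
buggy_second = list(append_tag_buggy("b"))

safe_first = append_tag("a")
safe_second = append_tag("b")
```

The second buggy call returns both tags because the first call's `"a"` is still sitting in the shared default, while `append_tag` returns a one-element list each time.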

Conclusion

Pure functions are fundamental to functional programming and provide numerous benefits:

  • Easier testing and debugging
  • Better code organization
  • Improved maintainability
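  • Thread safety, since there is no shared mutable state to race on
  • Cacheable results, since the same inputs always give the same output
  • Code that is easier to reason about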
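The caching benefit can be sketched with the standard library's `functools.lru_cache`, which is only safe to apply when a function is pure. The `fibonacci` function here is an illustrative example, not taken from the sections above.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fibonacci(n: int) -> int:
    """Pure: the result depends only on n, so cached values never go stale."""
    return n if n < 2 else fibonacci(n - 1) + fibonacci(n - 2)

value = fibonacci(30)

# cache_info() reports how many calls were served from the cache.
stats = fibonacci.cache_info()
```

Caching an impure function this way would return stale results or skip its side effects, which is why purity is the precondition for this optimization.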

While it's not always possible or practical to make every function pure, striving for purity where possible leads to more robust and maintainable code.
