Pure Functions: Theory and Implementation in Python
Theoretical Foundation
Mathematical Basis
Pure functions are rooted in the mathematical idea of a function. In mathematics, a function f maps each input x to exactly one output y, written as y = f(x). This mapping is:
- Well-defined: For any given input, there is exactly one output
- Deterministic: The same input always produces the same output
- Independent: The output depends only on the input parameters
Formal Definition
In computer science, a function f is pure if it satisfies these properties:
- Referential Transparency: The function call can be replaced with its output value without changing the program's behavior
- No Side Effects: The function does not modify state outside its local scope or perform I/O
- Functional Dependency: The output depends only on the input parameters
Key Properties Explained
1. Referential Transparency
A function is referentially transparent if you can replace any call to the function with its result without changing the program's behavior.
# Referentially Transparent
def add(a: int, b: int) -> int:
    return a + b

result = add(5, 3)  # Can be replaced with 8
next_result = result + 2  # Equivalent to 8 + 2

# Not Referentially Transparent
def add_to_log(a: int, b: int) -> int:
    print(f"Adding {a} and {b}")  # Side effect!
    return a + b
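One way to see the difference is to substitute a call with its value and compare what the program does. The sketch below repeats the two functions so it runs on its own and captures standard output with `contextlib.redirect_stdout`. The helper name `run_and_capture` is hypothetical and only serves the illustration.

```python
import io
from contextlib import redirect_stdout
from typing import Callable, Tuple


def add(a: int, b: int) -> int:
    return a + b


def add_to_log(a: int, b: int) -> int:
    print(f"Adding {a} and {b}")  # Side effect: writes to stdout
    return a + b


def run_and_capture(thunk: Callable[[], int]) -> Tuple[int, str]:
    """Hypothetical helper: run thunk, return (value, captured stdout)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        value = thunk()
    return value, buffer.getvalue()


# Replacing add(5, 3) with 8 changes nothing observable.
pure_call = run_and_capture(lambda: add(5, 3) + 2)
pure_substituted = run_and_capture(lambda: 8 + 2)

# Replacing add_to_log(5, 3) with 8 keeps the value but loses the printed line.
impure_call = run_and_capture(lambda: add_to_log(5, 3) + 2)
impure_substituted = run_and_capture(lambda: 8 + 2)
```

The returned values agree in both cases. Only the captured output differs, which is exactly what makes `add_to_log` fail the substitution test.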
2. No Side Effects
Side effects include:
- Modifying global variables
- Modifying input parameters
- I/O operations (file, network, database)
- Raising exceptions (debatable in some contexts)
- Printing to console
# With side effects
total = 0

def impure_add(x: int) -> int:
    global total
    total += x
    return total

# Pure alternative
def pure_add(current_total: int, x: int) -> int:
    return current_total + x
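A practical consequence shows up when each function is called twice with the same argument. The following sketch repeats the two definitions so it is self-contained. The variable names are illustrative only.

```python
total = 0


def impure_add(x: int) -> int:
    global total
    total += x
    return total


def pure_add(current_total: int, x: int) -> int:
    return current_total + x


# Same argument, different results: the hidden global changes between calls.
first_impure = impure_add(5)
second_impure = impure_add(5)

# Same arguments, same result, every time.
first_pure = pure_add(0, 5)
second_pure = pure_add(0, 5)

# To accumulate, the caller threads the running total through explicitly.
running = pure_add(pure_add(0, 5), 5)
```

With the pure version, the accumulated state lives in the caller, so it is visible at every call site instead of being hidden in a module-level variable.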
3. Functional Dependency
The function's output should depend only on its inputs, not on any external state:
# Bad: Depends on external state
current_multiplier = 2

def impure_multiply(x: int) -> int:
    return x * current_multiplier

# Good: Only depends on inputs
def pure_multiply(x: int, multiplier: int) -> int:
    return x * multiplier
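External state is not limited to global variables. The system clock and random number generators are hidden inputs too. A minimal sketch of the same fix, assuming a hypothetical expiry check: the current time is passed in as a parameter instead of being read inside the function. The names `impure_is_expired` and `pure_is_expired` are made up for this example.

```python
from datetime import datetime, timezone


def impure_is_expired(expires_at: datetime) -> bool:
    # Hidden input: the result depends on when the function is called
    return datetime.now(timezone.utc) >= expires_at


def pure_is_expired(expires_at: datetime, now: datetime) -> bool:
    # The clock reading is an explicit input
    return now >= expires_at


deadline = datetime(2024, 1, 1, tzinfo=timezone.utc)
before = datetime(2023, 12, 31, tzinfo=timezone.utc)
after = datetime(2024, 1, 2, tzinfo=timezone.utc)

expired_before = pure_is_expired(deadline, before)
expired_after = pure_is_expired(deadline, after)
```

The caller at the edge of the program reads the clock once and passes the value down, which also makes the function straightforward to test with fixed timestamps.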
Advanced Concepts
Immutability and Pure Functions
Pure functions often work best with immutable data structures. When handling mutable objects, we should return a new object rather than modify the input in place:
# Wrong way - modifies input
def add_tax_impure(prices: list, tax_rate: float) -> list:
    for i in range(len(prices)):
        prices[i] = prices[i] * (1 + tax_rate)
    return prices

# Right way - returns new list
def add_tax_pure(prices: list, tax_rate: float) -> list:
    return [price * (1 + tax_rate) for price in prices]
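The effect on the caller can be checked directly. The sketch below repeats the impure version and, as an assumption beyond the original, uses a variant of the pure version that accepts any `Sequence` and returns a tuple, so both the input and the output can be immutable.

```python
from typing import List, Sequence, Tuple


def add_tax_impure(prices: List[float], tax_rate: float) -> List[float]:
    for i in range(len(prices)):
        prices[i] = prices[i] * (1 + tax_rate)
    return prices


def add_tax_pure(prices: Sequence[float], tax_rate: float) -> Tuple[float, ...]:
    # Assumed variant: returns a tuple instead of a list
    return tuple(price * (1 + tax_rate) for price in prices)


# The impure version changes the list the caller passed in.
mutated_input = [100.0, 200.0]
impure_result = add_tax_impure(mutated_input, 0.25)

# The pure version leaves its input alone and works with a tuple as well.
untouched_input = (100.0, 200.0)
pure_result = add_tax_pure(untouched_input, 0.25)
```

After these calls, `impure_result` and `mutated_input` are the same list object, so any other code holding a reference to that list sees the taxed prices.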
Higher-Order Pure Functions
Pure functions can take other functions as arguments or return functions:
from typing import Callable

def create_multiplier(factor: int) -> Callable[[int], int]:
    def multiplier(x: int) -> int:
        return x * factor
    return multiplier

double = create_multiplier(2)
triple = create_multiplier(3)
print(double(5))  # 10
print(triple(5))  # 15
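The example above covers returning a function. Taking functions as arguments works the same way. A small sketch using the built-in `map` and `functools.reduce`; the `compose` helper and the `IntFn` alias are hypothetical names introduced here.

```python
from functools import reduce
from typing import Callable

IntFn = Callable[[int], int]


def create_multiplier(factor: int) -> IntFn:
    def multiplier(x: int) -> int:
        return x * factor
    return multiplier


def compose(*functions: IntFn) -> IntFn:
    """Hypothetical helper: apply the given functions left to right."""
    def composed(x: int) -> int:
        return reduce(lambda acc, fn: fn(acc), functions, x)
    return composed


double = create_multiplier(2)
triple = create_multiplier(3)

# Composing two pure functions yields another pure function.
times_six = compose(double, triple)

# map applies a pure function to each element without touching the input.
doubled_all = list(map(double, [1, 2, 3]))
```

A higher-order function like `compose` is only as pure as the functions passed to it. If one of them prints or mutates state, the composition does too.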
Real-World Implementation Patterns
Pattern 1: State Transformation
Instead of modifying state, return new state:
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class GameState:
    score: int
    level: int
    player_position: Tuple[int, int]

def move_player(
    state: GameState,
    movement: Tuple[int, int]
) -> GameState:
    # Build the new position; the original state is left untouched
    new_position = (
        state.player_position[0] + movement[0],
        state.player_position[1] + movement[1]
    )
    return GameState(
        score=state.score,
        level=state.level,
        player_position=new_position
    )
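Copying every field by hand gets tedious as a state class grows. The standard library's `dataclasses.replace` builds a new instance with only the named fields changed. The sketch below repeats `GameState` so it runs on its own. The `add_score` function is a hypothetical addition, not part of the original example.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class GameState:
    score: int
    level: int
    player_position: Tuple[int, int]


def add_score(state: GameState, points: int) -> GameState:
    # replace() copies every field and overrides only the ones named here
    return replace(state, score=state.score + points)


start = GameState(score=0, level=1, player_position=(0, 0))
after_points = add_score(start, 50)

# Frozen dataclasses reject in-place assignment.
try:
    start.score = 999  # type: ignore[misc]
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True
```

Because `frozen=True` blocks attribute assignment, an accidental in-place update fails loudly instead of silently breaking purity.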
Pattern 2: Error Handling
Use return values instead of exceptions for expected error cases:
from typing import Tuple, Optional

def divide(
    a: float,
    b: float
) -> Tuple[Optional[float], Optional[str]]:
    if b == 0:
        return None, "Division by zero"
    return a / b, None

# Usage
result, error = divide(10, 2)
if error is not None:
    print(f"Error: {error}")
else:
    print(f"Result: {result}")
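The tuple form permits combinations the function never produces, such as both slots being `None`. One possible alternative, sketched here under the assumption that two small result classes are acceptable, makes each outcome its own type. `Ok`, `Err`, `DivisionResult`, `safe_divide`, and `describe` are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    value: float


@dataclass(frozen=True)
class Err:
    message: str


DivisionResult = Union[Ok, Err]


def safe_divide(a: float, b: float) -> DivisionResult:
    if b == 0:
        return Err("Division by zero")
    return Ok(a / b)


def describe(result: DivisionResult) -> str:
    # The caller branches on the type instead of checking for None
    if isinstance(result, Err):
        return f"Error: {result.message}"
    return f"Result: {result.value}"
```

Both classes are frozen, so a result cannot be altered after it is returned, and a type checker can flag code that reads `value` without first ruling out `Err`.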
Testing Pure Functions
Unit Testing Example
import unittest
from typing import List

def process_numbers(numbers: List[int]) -> List[int]:
    return [num * 2 for num in numbers if num > 0]

class TestProcessNumbers(unittest.TestCase):
    def test_process_numbers(self):
        # Test cases are simple and deterministic
        test_cases = [
            ([], []),
            ([1, 2, 3], [2, 4, 6]),
            ([-1, 0, 1], [2]),
            ([5, -3, 10], [10, 20])
        ]
        for input_nums, expected in test_cases:
            with self.subTest(input_nums=input_nums):
                result = process_numbers(input_nums)
                self.assertEqual(result, expected)

if __name__ == "__main__":
    unittest.main()
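Beyond fixed input/output pairs, purity itself can be tested: the same input should give the same output, and the input should be unchanged afterwards. A minimal sketch using only the standard library, with randomly generated lists from a seeded `random.Random` so the run is repeatable. The class name and the sample sizes are arbitrary choices for illustration.

```python
import random
import unittest
from typing import List


def process_numbers(numbers: List[int]) -> List[int]:
    return [num * 2 for num in numbers if num > 0]


class TestProcessNumbersProperties(unittest.TestCase):
    def setUp(self) -> None:
        rng = random.Random(0)  # fixed seed keeps the samples repeatable
        self.samples = [
            [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
            for _ in range(50)
        ]

    def test_same_input_same_output(self) -> None:
        for sample in self.samples:
            self.assertEqual(process_numbers(sample), process_numbers(sample))

    def test_input_is_not_mutated(self) -> None:
        for sample in self.samples:
            snapshot = list(sample)
            process_numbers(sample)
            self.assertEqual(sample, snapshot)

    def test_output_is_positive_and_even(self) -> None:
        for sample in self.samples:
            for value in process_numbers(sample):
                self.assertGreater(value, 0)
                self.assertEqual(value % 2, 0)
```

These checks need no mocks or setup of external resources, which is the main reason pure functions are cheap to test.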
Best Practices and Guidelines
Guidelines for Writing Pure Functions
- Input Validation: Validate inputs at the function boundary
- Return New Objects: Never modify input parameters
- Single Responsibility: Each function should do one thing well
- Type Hints: Use Python's type hints for better documentation
- Immutable Default Arguments: Avoid mutable default arguments such as [] or {}; use None or an immutable value instead
- Document Assumptions: Document preconditions and postconditions clearly
from typing import List, Optional
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class UserProfile:
    name: str
    age: int
    last_login: datetime

def process_user_profiles(
    profiles: List[UserProfile],
    min_age: Optional[int] = None
) -> List[UserProfile]:
    """
    Process user profiles and filter by age if specified.

    Args:
        profiles: List of user profiles to process
        min_age: Optional minimum age filter

    Returns:
        List of processed user profiles

    Raises:
        ValueError: If min_age is negative
    """
    if min_age is not None and min_age < 0:
        raise ValueError("min_age must be non-negative")
    filtered_profiles = (
        [p for p in profiles if p.age >= min_age]
        if min_age is not None
        else profiles.copy()
    )
    return filtered_profiles
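The guideline on default arguments deserves its own illustration, because the failure is easy to miss. Python evaluates a default value once, when the function is defined, so a mutable default becomes shared state across calls. The function names below are hypothetical.

```python
from typing import List, Tuple


def append_tag_shared(tag: str, tags: List[str] = []) -> List[str]:
    # Pitfall: the default list is created once and reused by every call
    tags.append(tag)
    return tags


def append_tag_pure(tag: str, tags: Tuple[str, ...] = ()) -> Tuple[str, ...]:
    # Immutable default; a new tuple is built on every call
    return tags + (tag,)


# Copies are taken so each snapshot reflects the state at that moment.
shared_first = list(append_tag_shared("a"))
shared_second = list(append_tag_shared("b"))

pure_first = append_tag_pure("a")
pure_second = append_tag_pure("b")
```

The second call to `append_tag_shared` still sees the tag left behind by the first, while the two calls to `append_tag_pure` are independent of each other.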
Conclusion
Pure functions are fundamental to functional programming and provide numerous benefits:
- Easier testing and debugging
- Better code organization
- Improved maintainability
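- Thread safety, since no shared state is mutated
- Cacheable results, since the same input always yields the same output
- Simpler reasoning about program behavior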
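As an illustration of the caching benefit, the standard library's `functools.lru_cache` can memoize a pure function safely, because a stored result is always valid for the same arguments. The Fibonacci function here is just a convenient example, not something taken from the sections above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fibonacci(n: int) -> int:
    # Pure: the result depends only on n, so cached values never go stale
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


value = fibonacci(30)
info = fibonacci.cache_info()  # hit/miss counters kept by lru_cache
```

Applying the same decorator to an impure function would return stale results or skip its side effects on repeated calls, so caching is only safe where purity holds.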
While it's not always possible or practical to make every function pure, striving for purity where possible leads to more robust and maintainable code.