Effective Python¶

book website: https://effectivepython.com/

Pythonic Thinking¶

Item 1: Know Which Version of Python You're Using¶

  • Python 3 is the most up-to-date and well-supported version of Python, and you should use it for your projects.
  • Be sure that the command-line executable for running Python on your system is the version you expect it to be.
  • Avoid Python 2: it reached end of life on January 1, 2020, and is no longer maintained.
In [3]:
!python --version
Python 3.10.4
In [5]:
import sys

print(sys.version_info)
print(sys.version)
sys.version_info(major=3, minor=10, micro=4, releaselevel='final', serial=0)
3.10.4 (main, Jul  7 2022, 20:56:54) [Clang 13.1.6 (clang-1316.0.21.2.5)]
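
One way to act on the second bullet is to check the version from inside a program instead of relying on the shell. The following is a minimal sketch; the helper name `check_python_version` and the minimum of 3.8 (the first release with assignment expressions, used in Item 10) are assumptions chosen for illustration, not something the book prescribes.

```python
import sys

MIN_VERSION = (3, 8)  # assumed minimum; adjust to what your project needs


def check_python_version(minimum=MIN_VERSION):
    """Raise RuntimeError if the running interpreter is older than `minimum`."""
    # sys.version_info compares like a tuple, so (3, 10, 4, ...) >= (3, 8).
    if sys.version_info < minimum:
        found = '.'.join(str(part) for part in sys.version_info[:3])
        wanted = '.'.join(str(part) for part in minimum)
        raise RuntimeError(f'Python {wanted}+ required, found {found}')
    return True


check_python_version()
```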

Item 2: Follow the PEP 8 Style Guide¶

  • Always follow the Python Enhancement Proposal #8 (PEP 8) style guide when writing Python code.
  • Sharing a common style with the larger Python community facilitates collaboration with others.
  • Using a consistent style makes it easier to modify your own code later.

Item 3: Know the Differences Between bytes and str¶

  • bytes contains sequences of 8-bit values, and str contains sequences of Unicode code points.
  • Use helper functions to ensure that the inputs you operate on are the type of character sequence that you expect (8-bit values, UTF-8-encoded strings, Unicode code points, etc.).
  • bytes and str instances can’t be used together with operators: > and + raise TypeError, == always evaluates to False, and % either raises TypeError or silently embeds the bytes repr (such as b'foo') in the result.
  • If you want to read or write binary data to/from a file, always open the file using a binary mode (like 'rb' or 'wb').
  • If you want to read or write Unicode data to/from a file, be careful about your system’s default text encoding. Explicitly pass the encoding parameter to open if you want to avoid surprises.
In [2]:
def to_str(bytes_or_str):
    if isinstance(bytes_or_str, bytes):
        value = bytes_or_str.decode('utf-8')
    else:
        value = bytes_or_str
    return value

print(repr(to_str(b'foo')))
print(repr(to_str('bar')))
'foo'
'bar'
In [3]:
def to_bytes(bytes_or_str):
    if isinstance(bytes_or_str, str):
        value = bytes_or_str.encode('utf-8')
    else:
        value = bytes_or_str
    return value

print(repr(to_bytes(b'foo')))
print(repr(to_bytes('bar')))
b'foo'
b'bar'
In [4]:
!python -c 'import locale; print(locale.getpreferredencoding())'
UTF-8
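
The operator and file-mode bullets above can be illustrated with a short sketch. The file names `data.bin` and `data.txt` are hypothetical, and a temporary directory is used so that nothing is left on disk.

```python
import os
import tempfile

# Mixing bytes and str with + raises TypeError.
try:
    b'one' + 'two'
except TypeError as exc:
    mix_error = type(exc).__name__

# == does not raise; it evaluates to False even when the characters match.
same = (b'foo' == 'foo')

payload = b'\xf1\xf2\xf3\xf4\xf5'
with tempfile.TemporaryDirectory() as tmp_dir:
    # Binary data round-trips through binary modes ('wb' and 'rb').
    bin_path = os.path.join(tmp_dir, 'data.bin')  # hypothetical file name
    with open(bin_path, 'wb') as f:
        f.write(payload)
    with open(bin_path, 'rb') as f:
        round_trip = f.read()

    # Text data: pass encoding explicitly instead of trusting the locale default.
    text_path = os.path.join(tmp_dir, 'data.txt')  # hypothetical file name
    with open(text_path, 'w', encoding='utf-8') as f:
        f.write('café')
    with open(text_path, 'r', encoding='utf-8') as f:
        text = f.read()

print(mix_error)
print(same)
print(round_trip == payload)
print(text)
```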

Item 4: Prefer Interpolated F-Strings Over C-style Format Strings and str.format¶

  • C-style format strings that use the % operator suffer from a variety of gotchas and verbosity problems.
  • The str.format method introduces some useful concepts in its formatting specifiers mini language, but it otherwise repeats the mistakes of C-style format strings and should be avoided.
  • F-strings are a new syntax for formatting values into strings that solves the biggest problems with C-style format strings.
  • F-strings are succinct yet powerful because they allow for arbitrary Python expressions to be directly embedded within format specifiers.
In [25]:
places = 3
number = 1.23456
print(f'My number is {number:.{places}f}')
My number is 1.235
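
For comparison, here is the same output produced with all three approaches. This is only an illustrative sketch; the variable names and values are assumptions, not taken from the notebook.

```python
key = 'my_var'
value = 1.234

# C-style: values live in a tuple, far from the placeholders they fill.
c_style = '%-10s = %.2f' % (key, value)

# str.format: clearer specifiers, but still positional and verbose.
format_method = '{:<10} = {:.2f}'.format(key, value)

# F-string: the expression sits right next to its format specifier.
f_string = f'{key:<10} = {value:.2f}'

print(f_string)
assert c_style == format_method == f_string
```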

Item 5: Write Helper Functions Instead of Complex Expressions¶

  • Python’s syntax makes it easy to write single-line expressions that are overly complicated and difficult to read.
  • Move complex expressions into helper functions, especially if you need to use the same logic repeatedly.
  • An if/else expression provides a more readable alternative to using the Boolean operators or and and in expressions.
In [5]:
from urllib.parse import parse_qs

my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)
print(repr(my_values))
{'red': ['5'], 'blue': ['0'], 'green': ['']}
In [9]:
red = my_values.get('red', [''])[0] or 0
red
Out[9]:
'5'
In [6]:
def get_first_int(values, key, default=0):
    found = values.get(key, [''])

    if found[0]:
        return int(found[0])
    return default
In [7]:
print(get_first_int(my_values, 'red'))
print(get_first_int(my_values, 'blue'))
print(get_first_int(my_values, 'green'))
print(get_first_int(my_values, 'yellow'))
5
0
0
0
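
The third bullet, about if/else expressions, can be sketched as follows. The snippet rebuilds `my_values` so that it runs on its own; the names `green_or`, `green_str`, and `green_if` are chosen here for illustration.

```python
from urllib.parse import parse_qs

my_values = parse_qs('red=5&blue=0&green=', keep_blank_values=True)

# Boolean-operator version: compact, but the fallback logic is easy to misread.
green_or = int(my_values.get('green', [''])[0] or 0)

# if/else expression: the condition and the fallback are spelled out.
green_str = my_values.get('green', [''])
green_if = int(green_str[0]) if green_str[0] else 0

print(green_or, green_if)
```

Once the same logic is needed for several keys, a helper such as `get_first_int` above is the clearer choice.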

Item 6: Prefer Multiple Assignment Unpacking Over Indexing¶

  • Python has special syntax called unpacking for assigning multiple values in a single statement.
  • Unpacking is generalized in Python and can be applied to any iterable, including many levels of iterables within iterables.
  • Reduce visual noise and increase code clarity by using unpacking to avoid explicitly indexing into sequences.
In [11]:
snacks = [('bacon', 350), ('donut', 240), ('muffin', 190)]

for rank, (name, calories) in enumerate(snacks, 1):
    print(f'#{rank}: {name} has {calories} calories')
#1: bacon has 350 calories
#2: donut has 240 calories
#3: muffin has 190 calories
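
Two further uses of unpacking are swapping values without a temporary variable and unpacking nested structures. The sketch below is illustrative; the function `bubble_sort` and the sample data are assumptions, not cells from the notebook.

```python
def bubble_sort(items):
    """Sort a list in place, swapping neighbors via unpacking."""
    for _ in range(len(items)):
        for i in range(1, len(items)):
            if items[i] < items[i - 1]:
                # Swap without a temporary variable.
                items[i - 1], items[i] = items[i], items[i - 1]


names = ['pretzels', 'carrots', 'arugula', 'bacon']
bubble_sort(names)
print(names)

favorite_snacks = {
    'salty': ('pretzels', 100),
    'sweet': ('cookies', 180),
}

# Unpacking works through several levels of nesting at once.
((type1, (name1, cals1)),
 (type2, (name2, cals2))) = favorite_snacks.items()
print(f'Favorite {type1} is {name1} with {cals1} calories')
print(f'Favorite {type2} is {name2} with {cals2} calories')
```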

Item 7: Prefer enumerate Over range¶

  • enumerate provides concise syntax for looping over an iterator and getting the index of each item from the iterator as you go.
  • Prefer enumerate instead of looping over a range and indexing into a sequence.
  • You can supply a second parameter to enumerate to specify the number from which to begin counting (zero is the default).
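
A small side-by-side sketch of the two looping styles; the list `flavor_list` and the variable names are assumptions made for illustration.

```python
flavor_list = ['vanilla', 'chocolate', 'pecan', 'strawberry']

# range + indexing: the index must be adjusted and used to look up each item.
range_lines = []
for i in range(len(flavor_list)):
    range_lines.append(f'{i + 1}: {flavor_list[i]}')

# enumerate: index and item arrive together, and counting starts at 1.
enumerate_lines = []
for i, flavor in enumerate(flavor_list, 1):
    enumerate_lines.append(f'{i}: {flavor}')

print('\n'.join(enumerate_lines))
assert range_lines == enumerate_lines
```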

Item 8: Use zip to Process Iterators in Parallel¶

  • The zip built-in function can be used to iterate over multiple iterators in parallel.
  • zip creates a lazy generator that produces tuples, so it can be used on infinitely long inputs.
  • zip truncates its output silently to the shortest iterator if you supply it with iterators of different lengths.
  • Use the zip_longest function from the itertools built-in module if you want to use zip on iterators of unequal lengths without truncation.
In [12]:
names = ['Cecilia', 'Lise', 'Marie']
counts = [len(n) for n in names]
longest_name = None
max_count = 0

for name, count in zip(names, counts):
    if count > max_count:
        longest_name = name
        max_count = count

print(longest_name)
print(max_count)
Cecilia
7
In [13]:
names.append('Rosalind')
for name, count in zip(names, counts):
    print(name)
Cecilia
Lise
Marie
In [16]:
import itertools

for name, count in itertools.zip_longest(names, counts):
    print(f'{name}: {count}')

for name, count in itertools.zip_longest(names, counts, fillvalue=0):
    print(f'{name}: {count}')
Cecilia: 7
Lise: 4
Marie: 5
Rosalind: None
Cecilia: 7
Lise: 4
Marie: 5
Rosalind: 0

Item 9: Avoid else Blocks After for and while Loops¶

  • Python has special syntax that allows else blocks to immediately follow for and while loop interior blocks.
  • The else block after a loop runs only if the loop body did not encounter a break statement.
  • Avoid using else blocks after loops because their behavior isn’t intuitive and can be confusing.
In [17]:
a = 4
b = 9

for i in range(2, min(a, b) + 1):
    print('Testing', i)
    if a % i == 0 and b % i == 0:
        print('Not coprime')
        break
else:
    print('Coprime')
Testing 2
Testing 3
Testing 4
Coprime
In [18]:
def coprime(a, b):
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            return False

    return True

assert coprime(4, 9)
assert not coprime(3, 6)
In [19]:
def coprime_alternate(a, b):
    is_coprime = True
    for i in range(2, min(a, b) + 1):
        if a % i == 0 and b % i == 0:
            is_coprime = False
            break

    return is_coprime

assert coprime_alternate(4, 9)
assert not coprime_alternate(3, 6)

Item 10: Prevent Repetition with Assignment Expressions¶

  • Assignment expressions use the walrus operator (:=) to both assign and evaluate variable names in a single expression, thus reducing repetition.
  • When an assignment expression is a subexpression of a larger expression, it must be surrounded with parentheses.
  • Although switch/case statements and do/while loops are not available in Python, their functionality can be emulated much more clearly by using assignment expressions.
In [23]:
fresh_fruit = {
    'apple': 10,
    'banana': 8,
    'lemon': 5,
}

def make_lemonade(count):
    pass

def out_of_stock():
    pass

def slice_bananas(count):
    pass

def make_cider(count):
    pass

def make_smoothies(pieces):
    pass

if count := fresh_fruit.get('lemon', 0):
    make_lemonade(count)
else:
    out_of_stock()
In [24]:
if (count := fresh_fruit.get('banana', 0)) >= 2:
    pieces = slice_bananas(count)
    to_enjoy = make_smoothies(pieces)
elif (count := fresh_fruit.get('apple', 0)) >= 4:
    to_enjoy = make_cider(count)
elif count := fresh_fruit.get('lemon', 0):
    to_enjoy = make_lemonade(count)
else:
    to_enjoy = 'Nothing'
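
The cells above cover the switch/case pattern. The do/while pattern from the third bullet can be sketched in a similar way. Here `io.BytesIO` stands in for any data source, and the chunk size of 4 is an arbitrary assumption.

```python
import io

stream = io.BytesIO(b'abcdefghij')

chunks = []
# Read and test in one expression; the loop ends when read() returns b''.
while chunk := stream.read(4):
    chunks.append(chunk)

print(chunks)
```

Without the assignment expression, this loop would need either a duplicated `read` call before the loop or a `while True` with an explicit `break`.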

Lists and Dictionaries¶

Item 11: Know How to Slice Sequences¶

Item 12: Avoid Striding and Slicing in a Single Expression¶

Item 13: Prefer Catch-All Unpacking¶

Item 14: Sort by Complex Criteria Using the key Parameter¶

Item 15: Be Cautious When Relying on dict Insertion Ordering¶

Item 16: Prefer get Over in and KeyError to Handle Missing Dictionary Keys¶

Item 17: Prefer defaultdict Over setdefault to Handle Missing Items in Internal State¶

Item 18: Know How to Construct Key-Dependent Default Values with __missing__¶

Functions¶

Item 19: Never Unpack More Than Three Variables When Functions Return Multiple Values¶

Item 20: Prefer Raising Exceptions to Returning None¶

Item 21: Know How Closures Interact with Variable Scope¶

Item 22: Reduce Visual Noise with Variable Positional Arguments¶

Item 23: Provide Optional Behavior with Keyword Arguments¶

Item 24: Use None and Docstrings to Specify Dynamic Default Arguments¶

Item 25: Enforce Clarity with Keyword-Only and Positional-Only Arguments¶

Item 26: Define Function Decorators with functools.wraps¶

Comprehensions and Generators¶

Item 27: Use Comprehensions Instead of map and filter¶

Item 28: Avoid More Than Two Control Subexpressions in Comprehensions¶

Item 29: Avoid Repeated Work in Comprehensions by Using Assignment Expressions¶

Item 30: Consider Generators Instead of Returning Lists¶

Item 31: Be Defensive When Iterating Over Arguments¶

Item 32: Consider Generator Expressions for Large List Comprehensions¶

Item 33: Compose Multiple Generators with yield from¶

Item 34: Avoid Injecting Data into Generators with send¶

Item 35: Avoid Causing State Transitions in Generators with throw¶

Item 36: Consider itertools for Working with Iterators and Generators¶

Classes and Interfaces¶

Item 37: Compose Classes Instead of Nesting Many Levels of Built-in Types¶

Item 38: Accept Functions Instead of Classes for Simple Interfaces¶

Item 39: Use @classmethod Polymorphism to Construct Objects Generically¶

Item 40: Initialize Parent Classes with super¶

Item 41: Consider Composing Functionality with Mix-in Classes¶

Item 42: Prefer Public Attributes Over Private Ones¶

Item 43: Inherit from collections.abc for Custom Container Types¶

Metaclasses and Attributes¶

Item 44: Use Plain Attributes Instead of Setter and Getter Methods¶

Item 45: Consider @property Instead of Refactoring Attributes¶

Item 46: Use Descriptors for Reusable @property Methods¶

Item 47: Use __getattr__, __getattribute__, and __setattr__ for Lazy Attributes¶

Item 48: Validate Subclasses with __init_subclass__¶

Item 49: Register Class Existence with __init_subclass__¶

Item 50: Annotate Class Attributes with __set_name__¶

Item 51: Prefer Class Decorators Over Metaclasses for Composable Class Extensions¶

Concurrency and Parallelism¶

Item 52: Use subprocess to Manage Child Processes¶

Item 53: Use Threads for Blocking I/O, Avoid for Parallelism¶

Item 54: Use Lock to Prevent Data Races in Threads¶

Item 55: Use Queue to Coordinate Work Between Threads¶

Item 56: Know How to Recognize When Concurrency Is Necessary¶

Item 57: Avoid Creating New Thread Instances for On-demand Fan-out¶

Item 58: Understand How Using Queue for Concurrency Requires Refactoring¶

Item 59: Consider ThreadPoolExecutor When Threads Are Necessary for Concurrency¶

Item 60: Achieve High Concurrent I/O with Coroutines¶

Item 61: Know How to Port Threaded I/O to asyncio¶

Item 62: Mix Threads and Coroutines to Ease the Transition to asyncio¶

Item 63: Avoid Blocking the asyncio Event Loop to Maximize Responsiveness¶

Item 64: Consider concurrent.futures for True Parallelism¶

Robustness and Performance¶

Item 65: Take Advantage of Each Block in try/except/else/finally¶

Item 66: Consider contextlib and with Statements for Reusable try/finally Behavior¶

Item 67: Use datetime Instead of time for Local Clocks¶

Item 68: Make pickle Reliable with copyreg¶

Item 69: Use decimal When Precision Is Paramount¶

Item 70: Profile Before Optimizing¶

Item 71: Prefer deque for Producer-Consumer Queues¶

Item 72: Consider Searching Sorted Sequences with bisect¶

Item 73: Know How to Use heapq for Priority Queues¶

Item 74: Consider memoryview and bytearray for Zero-Copy Interactions with bytes¶

Testing and Debugging¶

Item 75: Use repr Strings for Debugging Output¶

Item 76: Verify Related Behaviors in TestCase Subclasses¶

Item 77: Isolate Tests from Each Other with setUp, tearDown, setUpModule, and tearDownModule¶

Item 78: Use Mocks to Test Code with Complex Dependencies¶

Item 79: Encapsulate Dependencies to Facilitate Mocking and Testing¶

Item 80: Consider Interactive Debugging with pdb¶

Item 81: Use tracemalloc to Understand Memory Usage and Leaks¶

Collaboration¶

Item 82: Know Where to Find Community-Built Modules¶

Item 83: Use Virtual Environments for Isolated and Reproducible Dependencies¶

Item 84: Write Docstrings for Every Function, Class, and Module¶

Item 85: Use Packages to Organize Modules and Provide Stable APIs¶

Item 86: Consider Module-Scoped Code to Configure Deployment Environments¶

Item 87: Define a Root Exception to Insulate Callers from APIs¶

Item 88: Know How to Break Circular Dependencies¶

Item 89: Consider warnings to Refactor and Migrate Usage¶

Item 90: Consider Static Analysis via typing to Obviate Bugs¶