How do you write and organize tests with pytest (fixtures, parametrize, marks)?

pytest discovers tests as plain functions/methods named `test_*` (no boilerplate base class needed), uses plain `assert` statements (rewriting them to show rich failure diffs), and provides **fixtures** (`@pytest.fixture`) for reusable setup/teardown injected by parameter name, **`@pytest.mark.parametrize`** to run the same test logic against many inputs, and **marks** (`@pytest.mark.skip`, `.xfail`, custom marks) to control test execution.

How do you mock dependencies in tests using `unittest.mock`?

`unittest.mock.Mock`/`MagicMock` create fake objects that record how they were called and can be configured to return specific values or raise exceptions; `patch()` (as a decorator or context manager) temporarily replaces a real object/function at a given import path with a mock for the duration of a test, then restores the original automatically. This isolates the code under test from slow, flaky, or unavailable real dependencies (network calls, databases, the current time).

What's the difference between `venv`, `virtualenv`, `pipenv`, `poetry`, and `conda`?

`venv` (built into the standard library) creates an isolated Python environment with its own `site-packages`, so project dependencies don't conflict across projects. `virtualenv` is a more feature-rich third-party predecessor to `venv`. `pipenv` and `poetry` add dependency *management* on top (a lockfile for reproducible installs, a single `pyproject.toml`/`Pipfile` combining dependency declaration and environment management) — `poetry` additionally handles packaging/publishing. `conda` is a separate ecosystem that manages both Python *and* non-Python system dependencies (C libraries, compilers), popular in data science.

How do type hints and mypy/pyright improve code quality, and what are their limitations?

Type hints (`def f(x: int) -> str:`) document expected types and let static type checkers (`mypy`, `pyright`) catch type mismatches, missing arguments, and `None`-related bugs **before running the code**, without any runtime cost since hints are (mostly) ignored at runtime. The main limitation: type hints are **not enforced at runtime** by default (Python doesn't stop you from calling `f('wrong type')`), so they only catch what the type checker sees — dynamically constructed calls, `Any`-typed values, and unchecked third-party code can silently bypass them.

What's the role of `pyproject.toml`, and how has Python packaging evolved?

`pyproject.toml` (PEP 518/621) is the modern, standardized single file for declaring a project's build system, dependencies, and metadata — replacing the older pattern of a `setup.py` (executable Python code, historically a security/reproducibility concern) plus `setup.cfg`/`requirements.txt` scattered across multiple files. Nearly all modern tools (`pip`, `poetry`, `hatch`, `build`) now read project configuration from `pyproject.toml` as the single source of truth.

How do linters and formatters like ruff, flake8, and black fit into a Python workflow?

**Formatters** (`black`, and `ruff format`) auto-rewrite code to a consistent style, eliminating style debates and manual formatting effort entirely. **Linters** (`flake8`, `pylint`, `ruff check`) statically analyze code for likely bugs, unused imports/variables, and style violations without changing the code themselves. `ruff` is a newer Rust-based tool that combines both linting and formatting (compatible with `black`'s style) at dramatically higher speed than the older Python-based tools it's replacing.

How does Python's `logging` module work, and how should you configure it in an application?

`logging` organizes output through **loggers** (named, hierarchical, e.g. `myapp.db`), **handlers** (where log records go — console, file, network), **formatters** (how a record is rendered as text), and **levels** (DEBUG/INFO/WARNING/ERROR/CRITICAL, filtering what actually gets emitted). Best practice: use `logging.getLogger(__name__)` per module (never the root logger directly, never `print()`), and configure handlers/levels **once**, centrally, at the application's entry point.

What's the difference between unit tests, integration tests, and how do you measure coverage?

A **unit test** verifies a single function/class in isolation, typically mocking its dependencies, and should run fast (milliseconds) and deterministically. An **integration test** verifies multiple components working together (a real database, a real HTTP call to another service), catching issues unit tests with mocks can miss (wrong SQL, a misunderstood API contract) at the cost of being slower and sometimes flakier. **Coverage** (via `coverage.py`/`pytest-cov`) measures what percentage of code lines/branches actually executed during tests — a useful signal for finding untested code, but not a proxy for test quality by itself.

How do you structure a Python project for a distributable package?

The modern recommended layout is the **`src` layout**: application code lives under `src/mypackage/`, separate from tests, docs, and config — this forces tests to import the package as it would actually be *installed*, rather than accidentally picking up the working directory's uninstalled source (a common bug with the older "flat" layout where the package sits directly next to `setup.py`). Combine it with `pyproject.toml` for metadata and `[project.scripts]` for CLI entry points.

What role do pre-commit hooks and CI checks play in a Python project?

**Pre-commit hooks** run fast checks (formatting, linting, basic syntax) automatically on `git commit`, catching and often auto-fixing problems **before** they're even committed, giving the fastest possible feedback loop. **CI checks** re-run those same checks (as a safety net for anyone who skipped/bypassed the local hook) plus slower checks unsuitable for every commit (the full test suite, type checking, security scanning), gating whether a PR can merge.

Testing, Tooling & Packaging

pytest, mocking, virtual environments, type checking, linting, logging, and modern packaging.


Plain functions, plain `assert`

# test_math.py
def add(a, b):
    return a + b

def test_add_positive_numbers():
    assert add(2, 3) == 5

def test_add_negative_numbers():
    assert add(-1, -1) == -2

No self.assertEqual(...) boilerplate (as in unittest) — pytest rewrites plain assert statements at import time to produce detailed failure output (showing both sides of a failed comparison) without any special assertion methods.

Fixtures: reusable, composable setup/teardown

import pytest

@pytest.fixture
def db_connection():
    conn = create_connection()
    yield conn          # provided to the test
    conn.close()          # teardown, runs after the test (even if it failed)

def test_query(db_connection):    # requested by parameter name
    result = db_connection.execute("SELECT 1")
    assert result == 1

A fixture requested by a test function's parameter name is automatically resolved, run, and injected by pytest — yield splits it into setup (before) and teardown (after), with teardown guaranteed to run even if the test fails. Fixtures can depend on other fixtures, be scoped (scope="module", "session") to control how often they're recreated, and be shared across a whole directory via a conftest.py.
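As a rough sketch of fixtures depending on other fixtures at different scopes, the example below swaps the hypothetical create_connection() for the standard library's sqlite3 so it can run as-is. The fixture names db_path and db_connection are illustrative choices, not a pytest convention.

```python
import sqlite3

import pytest


@pytest.fixture(scope="session")
def db_path():
    # Session scope: evaluated once for the whole test run.
    return ":memory:"


@pytest.fixture
def db_connection(db_path):
    # Default (function) scope: every test gets a fresh connection.
    # This fixture depends on db_path simply by naming it as a parameter.
    conn = sqlite3.connect(db_path)
    yield conn      # setup ends here, the test body runs
    conn.close()    # teardown, runs even if the test failed


def test_select_one(db_connection):
    row = db_connection.execute("SELECT 1").fetchone()
    assert row == (1,)
```

Moving both fixtures into a conftest.py would make them available to every test module in that directory tree without an import.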

`parametrize`: one test, many inputs

import pytest

@pytest.mark.parametrize("a, b, expected", [
    (2, 3, 5),
    (-1, -1, -2),
    (0, 0, 0),
])
def test_add(a, b, expected):
    assert add(a, b) == expected

This runs test_add three times with three different argument sets, reported as three separate test results — far more maintainable than copy-pasting near-identical test functions for each input case, and each case's failure is reported independently.

Marks: controlling test execution

import sys

import pytest

@pytest.mark.skip(reason="not implemented yet")
def test_future_feature():
    ...

@pytest.mark.skipif(sys.platform == "win32", reason="POSIX-only")
def test_unix_permissions():
    ...

@pytest.mark.xfail(reason="known bug, see #123")
def test_known_broken():
    assert broken_function() == expected   # broken_function and expected are placeholders

skip/skipif prevent a test from running and report it as skipped; xfail runs the test but doesn't fail the suite if it fails as expected. If an xfail test unexpectedly passes, pytest reports it as XPASS by default, and with strict=True that unexpected pass fails the suite. This is useful for tracking known issues without either deleting the test or leaving the suite red.
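The summary at the top also mentions custom marks. A minimal sketch, assuming a mark named slow (the name is arbitrary and chosen here only for illustration): tag expensive tests, then select or deselect them with pytest's -m option.

```python
import pytest


@pytest.mark.slow   # hypothetical custom mark; register it to avoid unknown-mark warnings
def test_full_reindex():
    assert sum(range(1_000)) == 499_500


def test_quick_check():
    assert 2 + 2 == 4


# Run everything except the slow tests:   pytest -m "not slow"
# Run only the slow tests:                pytest -m slow
```

Custom marks are usually registered under the markers option of [tool.pytest.ini_options] in pyproject.toml; running with --strict-markers then turns a misspelled mark into an error instead of a warning.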

Organizing a test suite

tests/
    conftest.py       # shared fixtures, available to every test in this directory tree
    test_models.py
    test_views.py
    integration/
        test_api.py

conftest.py files are auto-discovered by pytest and their fixtures are available to every test in the same directory and subdirectories, without any import — the standard way to share setup logic across a test suite.

Interview-ready summary: pytest tests are plain assert-based functions with no required base class; fixtures provide composable, scoped setup/teardown injected by parameter name; parametrize runs one test body against many input sets as separate reported cases; and marks (skip/skipif/xfail) control which tests run and how failures are interpreted.

Related Resources

pytest documentation


Basic `Mock`: recording calls, configuring return values

from unittest.mock import Mock

mock_client = Mock()
mock_client.get_user.return_value = {"id": 1, "name": "Ada"}

result = mock_client.get_user(user_id=1)
result                                # {'id': 1, 'name': 'Ada'}

mock_client.get_user.assert_called_once_with(user_id=1)   # verify how it was called
mock_client.get_user.call_count        # 1

Mock (and MagicMock, which additionally supports dunder methods like __len__/__iter__) auto-creates attributes/methods on access and records every call made to them — assert_called_with, assert_called_once, and .call_args/.call_args_list let you verify the code under test interacted with the dependency correctly, not just that it produced the right final output.
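To make the call-recording attributes concrete, here is a small sketch with a made-up notifier dependency (the send method and its arguments are assumptions for illustration).

```python
from unittest.mock import Mock, call

notifier = Mock()
notifier.send("ada@example.com", subject="hi")
notifier.send("bob@example.com", subject="hi")

# call_args holds only the most recent call...
last_call = notifier.send.call_args
# ...while call_args_list keeps the full history, in order.
history = notifier.send.call_args_list

assert notifier.send.call_count == 2
assert last_call == call("bob@example.com", subject="hi")
assert history == [
    call("ada@example.com", subject="hi"),
    call("bob@example.com", subject="hi"),
]
```

Note that assert_called_once_with would fail here, because the mock was called twice; assert_any_call is the variant that checks whether a given call happened at some point.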

`patch()`: swapping out a real dependency temporarily

# module: app/weather.py
import requests

def get_temperature(city):
    resp = requests.get(f"https://api.weather.com/{city}")
    return resp.json()["temp"]

# test module
from unittest.mock import patch

from app.weather import get_temperature

@patch("app.weather.requests.get")   # target is resolved through app.weather's own `requests` name
def test_get_temperature(mock_get):
    mock_get.return_value.json.return_value = {"temp": 72}
    assert get_temperature("boston") == 72
    mock_get.assert_called_once_with("https://api.weather.com/boston")

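The critical rule: patch the name where it's looked up, not where it's originally defined. Here app.weather does import requests and calls requests.get(...), so the target app.weather.requests.get is resolved through app.weather's reference to the requests module. Because that is the same module object every other importer sees, this patch replaces the same attribute as patching requests.get directly; the longer path mainly documents which module the test is about. The rule changes the outcome when the code under test copies the name into its own namespace: had app.weather written from requests import get, patching requests.get would leave app.weather.get pointing at the original function, and the correct target would be app.weather.get.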
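The lookup rule matters most when a module copies a name with from ... import .... The sketch below builds a throwaway in-memory module (demo_consumer, a name invented for this example) that does from os import getcwd, then patches in two different places to show which one takes effect.

```python
import sys
import types
from unittest.mock import patch

SOURCE = '''
from os import getcwd

def where():
    return getcwd()
'''

# Create the module in memory and register it so patch() can resolve its name.
consumer = types.ModuleType("demo_consumer")
exec(SOURCE, consumer.__dict__)
sys.modules["demo_consumer"] = consumer

# Patching where the function is DEFINED: demo_consumer still holds its own
# reference to the original getcwd, so the patch is not seen.
with patch("os.getcwd", return_value="/patched"):
    missed = consumer.where()

# Patching where the name is LOOKED UP: this is the reference where() uses.
with patch("demo_consumer.getcwd", return_value="/patched"):
    hit = consumer.where()

del sys.modules["demo_consumer"]
```

After this runs, hit is "/patched" while missed is still the real working directory.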

Context-manager form (for patching only part of a test)

def test_something():
    with patch("app.weather.requests.get") as mock_get:
        mock_get.return_value.json.return_value = {"temp": 72}
        assert get_temperature("boston") == 72
    # requests.get is back to normal here, outside the `with` block

Mocking a raised exception

from unittest.mock import patch

import pytest

@patch("app.weather.requests.get")
def test_get_temperature_handles_failure(mock_get):
    mock_get.side_effect = ConnectionError("network down")
    with pytest.raises(ConnectionError):
        get_temperature("boston")

side_effect set to an exception class/instance makes the mock raise it when called — the standard way to test error-handling paths without needing to actually trigger a real failure (a downed network, a real database outage).
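side_effect also accepts an iterable, in which case each call consumes the next item, and any item that is an exception is raised instead of returned. A minimal sketch, assuming a hypothetical fetch_with_retry helper written only for this example:

```python
from unittest.mock import Mock


def fetch_with_retry(fetch, attempts=3):
    """Call fetch() until it succeeds or the attempts run out (illustrative helper)."""
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except ConnectionError as exc:
            last_error = exc
    raise last_error


# Two failures followed by a success, one item per call.
flaky_fetch = Mock(
    side_effect=[ConnectionError("down"), ConnectionError("down"), {"temp": 72}]
)

result = fetch_with_retry(flaky_fetch)
```

Here result is the final dictionary and flaky_fetch.call_count is 3, which verifies the retry path without any real network failure.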

`spec`/`autospec`: catching typos in mocked interfaces

from unittest.mock import create_autospec

mock_client = create_autospec(RealClient)
mock_client.get_uesr(1)   # AttributeError -- typo caught immediately, unlike a bare Mock()

A bare Mock() accepts any attribute/method name silently, which can hide a typo in test code (calling a method that doesn't actually exist on the real object) until it breaks in production. create_autospec (or passing spec=RealClient to Mock) constrains the mock to the real object's actual interface, catching such mismatches at test time; create_autospec additionally checks that calls match the real method signatures.
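The snippet above leaves RealClient undefined. A self-contained sketch, with a stand-in RealClient class invented for the example, shows both checks that an autospecced mock performs:

```python
from unittest.mock import create_autospec


class RealClient:
    """Stand-in for a real dependency (hypothetical)."""

    def get_user(self, user_id):
        raise NotImplementedError("would hit the network")


# instance=True makes the mock behave like a RealClient instance, not the class.
mock_client = create_autospec(RealClient, instance=True)
mock_client.get_user.return_value = {"id": 1}
user = mock_client.get_user(1)

try:
    mock_client.get_uesr(1)          # typo: no such method on RealClient
    typo_caught = False
except AttributeError:
    typo_caught = True

try:
    mock_client.get_user()           # missing the required user_id argument
    signature_checked = False
except TypeError:
    signature_checked = True
```

Both typo_caught and signature_checked end up True, whereas a bare Mock() would have accepted either call.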

Interview-ready summary: Mock/MagicMock create call-recording fake objects; patch() swaps a real dependency for a mock at the import path where it's used, for the duration of a test. side_effect simulates exceptions/varying return values across calls, and autospec/spec constrain a mock to the real object's actual interface to catch typos that a bare Mock() would silently accept.

Related Resources

unittest.mock — Python docs


The core problem all of these solve: dependency isolation

# Without isolation: installing project A's dependencies could break project B
pip install requests==2.0   # for project A
pip install requests==3.0   # for project B -- now A is broken!

Every project needs its own independent set of installed packages, so version requirements from unrelated projects never collide.

`venv`: the built-in baseline

python -m venv .venv
source .venv/bin/activate      # on Windows: .venv\Scripts\activate
pip install requests

venv creates a directory with its own Python interpreter symlink and site-packages, isolated from the system Python — no extra installation needed since it ships with Python 3.3+. It only manages the environment itself; you still track dependencies manually (typically in a requirements.txt you maintain by hand or via pip freeze).
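The same thing can be done from Python through the standard library's venv module, which makes the resulting directory layout easy to inspect. This is only a sketch: it builds the environment in a temporary directory and skips pip to stay fast and offline, unlike the command-line default.

```python
import sys
import tempfile
import venv
from pathlib import Path


def in_virtualenv() -> bool:
    # Inside a venv, sys.prefix points at the environment and
    # sys.base_prefix at the interpreter it was created from.
    return sys.prefix != sys.base_prefix


with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / ".venv"
    venv.create(env_dir, with_pip=False)   # `python -m venv` installs pip by default
    top_level = sorted(p.name for p in env_dir.iterdir())
    has_config = (env_dir / "pyvenv.cfg").is_file()
```

top_level lists the environment's own directories (for example bin or Scripts, and lib) plus pyvenv.cfg, the marker file that tells the interpreter it is running inside a virtual environment.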

`virtualenv`: the third-party predecessor

Functionally similar to venv but predates it, supports older Python 2 environments, and historically offered a few extra features/faster environment creation — largely superseded by the built-in venv for pure Python 3 projects, but still used in some legacy toolchains.

`pipenv`: environment + dependency management combined

pipenv install requests
pipenv install --dev pytest
pipenv shell

Combines environment creation with a Pipfile/Pipfile.lock that pins exact resolved versions (including transitive dependencies) for reproducible installs across machines — addressing venv's gap of "you manage the dependency list yourself."

`poetry`: dependency management + packaging + publishing

poetry init
poetry add requests
poetry add --group dev pytest
poetry install
poetry build      # builds a wheel/sdist
poetry publish     # publishes to PyPI

poetry centralizes dependency declaration, environment management, a lockfile (poetry.lock) for reproducibility, and the packaging/publishing workflow (building wheels, publishing to PyPI) in one tool built around a single pyproject.toml. It is a common modern choice for library/application projects that need all of this together.

`conda`: a different scope entirely

conda create -n myenv python=3.11 numpy scipy
conda activate myenv

conda manages environments that can include non-Python dependencies too (compiled C/Fortran libraries, CUDA toolkits, compilers) — this is its key differentiator, and why it dominates in data science/scientific computing where packages like NumPy/SciPy historically needed complex native builds that pip alone couldn't easily manage across platforms.

Choosing one

Need	Tool
Just isolate a Python environment, manage deps manually	`venv` + `requirements.txt`
Reproducible installs with a lockfile, simple workflow	`pipenv`
Full library/app lifecycle: deps, lockfile, packaging, publishing	`poetry` (or modern `pip` + `pyproject.toml` + `pip-tools`)
Non-Python dependencies (native libs, data science stack)	`conda`

Interview-ready summary: venv is the built-in, minimal environment isolator; virtualenv is its older third-party equivalent; pipenv and poetry add dependency locking and (for poetry) packaging on top; conda solves a broader problem — managing non-Python system dependencies alongside Python packages — which is why it's the default in data science despite overlapping with the others for pure-Python use cases.

Related Resources

venv — Python docs

Poetry documentation


What type hints look like, and what they don't do at runtime

def greet(name: str) -> str:
    return f"hello, {name}"

greet(42)   # runs FINE at runtime -- Python doesn't check the hint!
            # f"hello, {42}" -> 'hello, 42' -- no error, just probably not intended

Type hints are not enforced by the Python interpreter itself — they're metadata, stored on the function (greet.__annotations__), that tools can optionally read and check. Calling greet(42) doesn't raise TypeError on its own; catching this mismatch requires running a static type checker separately.
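A quick way to see that hints are just stored metadata is to inspect them at runtime:

```python
from typing import get_type_hints


def greet(name: str) -> str:
    return f"hello, {name}"


raw = greet.__annotations__        # the dict stored on the function object
resolved = get_type_hints(greet)   # same information, with string annotations resolved

wrong_type_result = greet(42)      # no TypeError: the interpreter never checks the hint
```

Both raw and resolved map name and return to str, and wrong_type_result is simply 'hello, 42'.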

Catching errors before running the code

def get_user(user_id: int) -> dict | None:
    ...

user = get_user("123")     # mypy: error: Argument 1 to "get_user" has incompatible type "str"; expected "int"

user = get_user(123)
print(user["name"])          # mypy: error: Value of type "dict[Any, Any] | None" is not indexable
                               # (get_user's return type says it might be None!)
                               # exact message wording varies between mypy versions

mypy/pyright statically analyze the code (no execution needed) and flag both a wrong-type argument and a missed None check. The second example is a genuinely common real-world bug class (forgetting a function can return None) that static typing surfaces at review/CI time instead of as a production TypeError ('NoneType' object is not subscriptable).
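The usual fix for the second error is an explicit None check, which type checkers understand as narrowing the type. A minimal sketch, with an in-memory dictionary standing in for whatever get_user really queries, and Optional[dict] used as the older spelling of dict | None:

```python
from typing import Optional

_USERS = {1: {"name": "Ada"}}   # stand-in data source for the example


def get_user(user_id: int) -> Optional[dict]:
    return _USERS.get(user_id)


def user_name(user_id: int) -> str:
    user = get_user(user_id)
    if user is None:
        # After this check, the checker treats `user` as a plain dict below.
        return "<unknown>"
    return user["name"]
```

With the check in place, both the type checker and the runtime behavior agree: a missing user yields a placeholder instead of an exception.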

Real benefits beyond bug-catching

IDE autocomplete/navigation improves dramatically — the editor knows a variable's type and can suggest its actual methods.
Self-documenting signatures — def process(items: list[Order]) -> Summary: communicates intent far better than an untyped signature plus a docstring that can drift out of sync.
Safer refactoring — renaming a field or changing a function's signature immediately surfaces every call site the type checker disagrees with.

The limitations

from typing import Any

def process(data: Any) -> Any:      # Any opts OUT of checking entirely
    return data.whatever_method()    # never flagged, regardless of what `data` actually is

import third_party_untyped_lib       # placeholder name; if a library ships no type stubs, calls into it are unchecked
result = third_party_untyped_lib.do_thing()   # typed as Any by default

Any disables checking for anything it touches — a common escape hatch that, if overused, silently reduces how much of the codebase is actually protected.
Untyped third-party code (no type stubs, no py.typed marker) is treated as Any by default, creating blind spots at every boundary with such a library.
No runtime enforcement — a caller that ignores type errors (or code paths the type checker can't see, like getattr-based dynamic dispatch, or unchecked deserialized JSON) can still pass the wrong type through at runtime; for that, use runtime validation libraries (pydantic) at actual system boundaries.
Gradual, not all-or-nothing — a codebase can be partially typed, which is often the pragmatic starting point, but means coverage (and therefore protection) varies file by file until fully adopted.
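To illustrate the runtime-enforcement point, here is a hand-rolled sketch of validation at a system boundary using only the standard library. It is a simplified stand-in for what a library such as pydantic automates, not a reproduction of pydantic's API; the User shape and parse_user name are assumptions.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


def parse_user(raw: str) -> User:
    """Turn untrusted JSON text into a User, checking types at runtime."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise TypeError("expected a JSON object")
    user_id, name = data.get("id"), data.get("name")
    # bool is a subclass of int, so exclude it explicitly.
    if not isinstance(user_id, int) or isinstance(user_id, bool):
        raise TypeError("id must be an int")
    if not isinstance(name, str):
        raise TypeError("name must be a str")
    return User(id=user_id, name=name)


good = parse_user('{"id": 1, "name": "Ada"}')

try:
    parse_user('{"id": "1", "name": "Ada"}')   # id arrives as a string
    rejected = False
except TypeError:
    rejected = True
```

Once data has passed through a boundary function like this, the static annotations on User are trustworthy for the rest of the code.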

Interview-ready summary: Type hints let mypy/pyright catch type mismatches and missed-None bugs statically, before running the code, with zero runtime cost and better IDE support as a side benefit — but they're not enforced at runtime, so Any, untyped dependencies, and unchecked dynamic code remain blind spots; use runtime validation (pydantic) at actual data-entry boundaries where static checking alone isn't sufficient.

Related Resources

typing — Python docs

mypy documentation


The old way: `setup.py` as executable code

# setup.py (legacy)
from setuptools import setup

setup(
    name="myproject",
    version="1.0.0",
    install_requires=["requests>=2.0"],
)

Because setup.py is a Python script, building or installing a package required actually executing arbitrary code just to read its metadata — a real reproducibility and security concern (a malicious or broken setup.py could do anything at install time, and different environments could produce different results running the "same" setup.py).

The modern way: declarative `pyproject.toml`

# pyproject.toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "myproject"
version = "1.0.0"
dependencies = ["requests>=2.0"]
requires-python = ">=3.9"

[project.optional-dependencies]
dev = ["pytest", "mypy", "ruff"]

[project.scripts]
mycli = "myproject.cli:main"

This is plain, static TOML data — no code execution needed to read project metadata, dependencies, or entry points. [build-system] (PEP 518) declares what's needed to build the project before even importing setuptools; [project] (PEP 621) standardizes metadata that used to be scattered across setup.py/setup.cfg/Pipfile in tool-specific formats.

Why this matters: one format, many tools

[tool.poetry.dependencies]
python = "^3.9"
requests = "^2.0"

[tool.pytest.ini_options]
testpaths = ["tests"]

[tool.ruff]
line-length = 100

Beyond the standardized [project] table, tools can add their own [tool.*] sections in the same file — poetry, pytest, ruff, black, mypy all support configuration directly in pyproject.toml, consolidating what used to be setup.cfg, pytest.ini, .flake8, and various other tool-specific config files into one place.

The evolution, in short

distutils/setup.py (original, Python 2 era) — code-based, minimal metadata standardization.
setuptools + setup.cfg — moved some metadata to a declarative INI-style file, but setup.py was often still required as a shim.
pyproject.toml (current standard, PEP 518/621) — fully declarative project metadata and build-system requirements; setup.py is no longer required at all for most modern projects (though setuptools can still use one for complex custom build logic).

Building and publishing today

python -m build          # builds a wheel (.whl) and sdist (.tar.gz) from pyproject.toml
python -m twine upload dist/*   # publishes to PyPI

The build package is the modern, backend-agnostic way to build a distributable package purely from pyproject.toml, regardless of which build backend (setuptools, hatchling, poetry-core) the project uses.

Interview-ready summary: pyproject.toml replaced the historical mix of executable setup.py and scattered config files with one declarative, standardized file for build requirements, project metadata, and dependencies — read by pip, build, and virtually every modern Python tool, eliminating the need to execute arbitrary code just to discover a package's metadata.

Related Resources

Packaging Python Projects — Python Packaging Authority

PEP 621 – Storing project metadata in pyproject.toml


Formatters: no more style debates

# before black
def   f(x,y ,z):
    return{"a":x,'b':y,"c" :z}

# after `black .`
def f(x, y, z):
    return {"a": x, "b": y, "c": z}

black (and ruff format, which implements a compatible style) rewrite code automatically to one consistent format — quote style, spacing, line length, trailing commas. The point isn't that this particular style is objectively best; it's that a team stops spending review time and mental energy on formatting bikeshedding entirely, since the tool decides and everyone's code converges to the same look.

Linters: catching likely bugs and code smells

import os          # F401: 'os' imported but unused
import sys         # F401: 'sys' imported but unused (same problem)

def process(items):
    result = []
    for item in items:
        result.append(itme.value)   # F821: undefined name 'itme' (a typo!)
    return reuslt                     # F821: undefined name 'reuslt' (another typo!)

flake8/pylint/ruff check statically scan code for problems that wouldn't necessarily crash immediately (an unused import) or would crash only when that code path actually runs (a typo in a rarely-exercised branch), catching these at commit/CI time instead of in production. Rule sets typically cover unused imports/variables, undefined names, common bug patterns (bare except: clauses, mutable default arguments), and complexity/style conventions (PEP 8 line length, naming).
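Mutable default arguments are worth seeing once, since the code looks harmless and runs without error. The function names below are made up for the illustration.

```python
def append_bad(item, bucket=[]):
    # The default list is created ONCE, when the function is defined,
    # and then shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def append_good(item, bucket=None):
    # Use None as the sentinel and build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_bad(1)
second = append_bad(2)    # same list object as `first`, now holding both items

fresh_a = append_good(1)
fresh_b = append_good(2)  # independent lists
```

Linters that include this rule flag the bucket=[] signature itself, before the surprising shared state ever shows up in a test.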

`ruff`: the modern consolidation

ruff check .      # lint (replaces flake8, isort, pyupgrade, and more, via one tool)
ruff format .      # format (black-compatible)

ruff is written in Rust and reimplements the rules of dozens of previously-separate Python linting/formatting plugins (flake8 plus its common plugin ecosystem, isort for import sorting, pyupgrade for modernizing syntax) in a single binary that runs 10-100x faster than the pure-Python tools it replaces — this speed matters enough in practice that ruff has become the default choice for new projects, often alongside or instead of both flake8 and black.

Fitting into a workflow: pre-commit and CI

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0
    hooks:
      - id: ruff
      - id: ruff-format

The standard setup runs linting/formatting automatically at commit time (via pre-commit) so problems are caught before they even reach a PR, and again in CI as a required check — so no code merges without passing both the formatter's consistency check and the linter's static analysis.

Static analysis vs type checking: complementary, not overlapping

Linters catch things like unused variables and style; mypy/pyright (covered separately) catch type mismatches — both typically run in the same CI pipeline, each covering a different class of problem that the other doesn't.

Interview-ready summary: Formatters (black/ruff format) eliminate style debates by auto-rewriting code to one consistent look; linters (flake8/ruff check) statically catch likely bugs and code smells without changing code. ruff consolidates most of this Python tooling ecosystem into one fast Rust-based tool, and both are typically wired into pre-commit hooks and CI so no code merges without passing them.

Related Resources

Ruff documentation

Black documentation


The core building blocks

import logging

logger = logging.getLogger(__name__)   # named per-module, e.g. "myapp.services.db"

logger.debug("detailed diagnostic info")
logger.info("normal operational event")
logger.warning("something unexpected, but handled")
logger.error("a real failure")
logger.critical("the application may be unable to continue")

Logger: the named entry point application code calls — hierarchical by dotted name (myapp.services.db is a child of myapp.services, which is a child of myapp), so configuration can be applied at any level and inherited by everything below it.
Level: filters what actually gets processed — a logger/handler set to WARNING silently drops DEBUG/INFO calls entirely (cheaply, with minimal overhead, since the level check happens before the message is even formatted).
Handler: decides where a passed-level record goes (console, rotating file, syslog, a remote log aggregator) — one logger can feed multiple handlers simultaneously.
Formatter: decides how a record renders as text (timestamp, level, message, module name).
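The four pieces can be wired together by hand to see how they interact. This sketch uses an in-memory stream and a logger name (demoapp) invented for the example:

```python
import io
import logging

stream = io.StringIO()

handler = logging.StreamHandler(stream)          # Handler: where records go
handler.setLevel(logging.WARNING)                # Level on the handler
handler.setFormatter(                            # Formatter: how records render
    logging.Formatter("%(levelname)s %(name)s: %(message)s")
)

parent = logging.getLogger("demoapp")            # Logger: named, hierarchical
parent.setLevel(logging.DEBUG)
parent.addHandler(handler)
parent.propagate = False                         # keep the demo out of the root logger

child = logging.getLogger("demoapp.db")          # inherits level and handlers from parent
child.info("connected")                          # passes the logger, dropped by the handler
child.warning("slow query")                      # passes both

output = stream.getvalue()
```

Only the warning reaches the stream, rendered as WARNING demoapp.db: slow query, even though the handler was attached to the parent logger rather than the child.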

Why `getLogger(name)` per module, not the root logger

# app/db.py
import logging
logger = logging.getLogger(__name__)   # 'app.db'

# app/api.py
import logging
logger = logging.getLogger(__name__)   # 'app.api'

Using __name__ automatically names each logger after its module, giving free hierarchical structure — you can later configure app.db to log at DEBUG while everything else stays at INFO, without touching any application code, just the central configuration.

Centralized configuration at the entry point

import logging

def configure_logging():
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
        handlers=[logging.StreamHandler(), logging.FileHandler("app.log")],
    )

# main.py
configure_logging()   # called ONCE, at startup

Configuration (levels, handlers, formatters) should happen exactly once, centrally, typically in the application's entry point — individual modules should only ever call getLogger(__name__) and log messages, never configure handlers themselves (which would risk duplicate handlers or conflicting configuration if the module is imported multiple times or from different contexts).
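For anything beyond basicConfig, the standard library also offers logging.config.dictConfig, which keeps the whole setup in one data structure. A sketch under assumed names (the app.db logger and the plain formatter are illustrative), showing one module at DEBUG while everything else stays at INFO:

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,   # keep loggers created at import time working
    "formatters": {
        "plain": {"format": "%(asctime)s %(levelname)s %(name)s: %(message)s"},
    },
    "handlers": {
        "console": {"class": "logging.StreamHandler", "formatter": "plain"},
    },
    "root": {"level": "INFO", "handlers": ["console"]},
    "loggers": {
        "app.db": {"level": "DEBUG"},    # more verbose for this one module only
    },
}


def configure_logging():
    logging.config.dictConfig(LOGGING)


configure_logging()   # called once, at startup

db_level = logging.getLogger("app.db").getEffectiveLevel()
api_level = logging.getLogger("app.api").getEffectiveLevel()
```

No application module changes: app.api inherits INFO from the root logger, while app.db resolves to DEBUG purely through configuration.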

Why lazy `%`-style formatting matters here specifically

logger.debug("processing item %s with data %s", item_id, huge_data_structure)
# NOT: logger.debug(f"processing item {item_id} with data {huge_data_structure}")

Passing %s-style placeholders and arguments separately means the string is only actually formatted if the log record passes the level filter and reaches a handler — an f-string, by contrast, is always fully evaluated immediately, even if the log call is filtered out entirely (e.g., a DEBUG call in a production system configured for INFO), wasting the cost of formatting (potentially an expensive huge_data_structure repr) every single time regardless of whether it's ever emitted.
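The difference can be measured directly with an object that counts how often it is rendered. The Expensive class and the lazy_demo logger name are made up for this sketch.

```python
import logging


class Expensive:
    """Counts how many times it gets converted to a string."""

    def __init__(self):
        self.renders = 0

    def __str__(self):
        self.renders += 1
        return "<big payload>"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)            # DEBUG calls are filtered out
logger.addHandler(logging.NullHandler())
logger.propagate = False

lazy = Expensive()
logger.debug("payload: %s", lazy)        # filtered before any formatting happens

eager = Expensive()
logger.debug(f"payload: {eager}")        # f-string is built before debug() is even called
```

After both calls, lazy.renders is 0 and eager.renders is 1, although neither message was emitted.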

Never use `print()` for anything beyond a quick throwaway script

print() has no levels, no filtering, no structured output, no easy way to redirect/rotate/aggregate, and can't be selectively silenced per module — logging (or a structured logging library like structlog for more advanced needs) is the correct default for anything beyond a one-off script.

Interview-ready summary: logging structures output through named, hierarchical loggers, levels that filter cheaply, handlers that route output, and formatters that render it — configure handlers/levels once centrally at startup, call getLogger(__name__) per module everywhere else, and prefer lazy %-style formatting over f-strings in log calls so filtered-out messages cost nothing to (not) format.

Related Resources

Logging HOWTO — Python docs


Unit tests: isolated, fast, deterministic

from unittest.mock import Mock

def calculate_total(cart, tax_service):
    subtotal = sum(item.price for item in cart)
    return subtotal + tax_service.calculate_tax(subtotal)

def test_calculate_total():
    tax_service = Mock()
    tax_service.calculate_tax.return_value = 8.0
    cart = [Mock(price=50), Mock(price=42)]

    assert calculate_total(cart, tax_service) == 100.0   # 92 + 8, using the mocked tax

tax_service is mocked, so this test verifies calculate_total's own logic in complete isolation — it runs in milliseconds, never touches a network or real tax-calculation service, and fails only if calculate_total itself has a bug (not if the real tax service is down).

Integration tests: real components, real contracts

def test_tax_service_integration(real_tax_service):
    # uses the ACTUAL tax service (or a realistic test double, e.g. a test DB)
    result = real_tax_service.calculate_tax(100.0)
    assert result == 8.0    # verifies the REAL contract, not an assumed mock behavior

If the unit test's assumption about tax_service.calculate_tax's behavior is wrong (e.g., it actually returns a Decimal, not a float, or takes different arguments), the unit test won't catch that — only an integration test exercising the real dependency will. Integration tests are slower and sometimes flakier (network, timing, external state), so they're typically run less frequently (e.g., in CI, not on every local save) and in smaller numbers than unit tests.
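The same contrast can be sketched with a database query, using an in-memory sqlite3 database as the "real" component. The count_active_users function and the users table are assumptions made for this example.

```python
import sqlite3
from unittest.mock import Mock


def count_active_users(conn):
    row = conn.execute("SELECT COUNT(*) FROM users WHERE active = 1").fetchone()
    return row[0]


# Unit-style: the connection is mocked, so the SQL text is never executed.
# This would still pass if the query had a typo in a column name.
mock_conn = Mock()
mock_conn.execute.return_value.fetchone.return_value = (2,)
unit_result = count_active_users(mock_conn)

# Integration-style: a real database engine parses and runs the SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, active INTEGER)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("ada", 1), ("bob", 0), ("cy", 1)],
)
integration_result = count_active_users(conn)
conn.close()
```

Both results are 2 here, but only the second one proves that the query is valid SQL against the actual schema.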

The testing pyramid: why the mix matters

        /\
       /  \      <- few end-to-end tests (slow, high confidence)
      /----\
     /      \    <- some integration tests
    /--------\
   /          \  <- many unit tests (fast, cheap, run constantly)
  /____________\

A healthy suite has many fast unit tests giving quick feedback on logic correctness, and a smaller number of integration/end-to-end tests verifying that the pieces actually fit together correctly — relying on only one type leaves a real gap: all-unit-tests-with-mocks can pass while the real integration is broken; all-integration-tests is prohibitively slow and hard to debug when something fails.

Measuring coverage

pytest --cov=myapp --cov-report=term-missing

Name                 Stmts   Miss  Cover   Missing
--------------------------------------------------
myapp/services.py       42      3    93%   57-59
myapp/models.py          18      0   100%

Coverage tools instrument the code being tested and report which lines (and, with branch coverage, which conditional branches) actually executed during the test run — the "Missing" column pinpoints exactly which lines have zero test coverage, a good starting point for finding untested code paths.
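A small sketch of the difference between line and branch coverage, using a hypothetical apply_discount function (not part of myapp above). With pytest-cov, branch measurement is enabled by adding the --cov-branch flag.

```python
def apply_discount(price, is_member):
    """Hypothetical example: members get 10% off."""
    if is_member:
        price = round(price * 0.9, 2)
    return price


def test_member_discount():
    # Executes every line of apply_discount, so LINE coverage reports 100%.
    # The is_member=False path (skipping the if body) is never taken, which
    # only BRANCH coverage would flag as a partially covered branch.
    assert apply_discount(100, True) == 90.0
```

Adding a second test with is_member=False would cover the remaining branch.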

Why coverage percentage isn't a quality proxy

def divide(a, b):
    return a / b

def test_divide():
    divide(10, 2)   # covers the line, asserts NOTHING -- 100% coverage, useless test

This test achieves 100% line coverage for divide while verifying absolutely nothing about correctness (no assert, and it doesn't even test the b == 0 case). High coverage tells you code ran during tests, not that its behavior was actually verified — it's a useful signal for finding completely untested code, not a target to chase for its own sake.
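For contrast, a sketch of tests that cover the same line but actually verify behavior. It uses only the standard library so it runs anywhere. In a pytest suite the second test would more idiomatically use pytest.raises(ZeroDivisionError).

```python
def divide(a, b):
    return a / b


def test_divide_returns_quotient():
    # Same coverage as before, but now a wrong result fails the test.
    assert divide(10, 2) == 5


def test_divide_by_zero_raises():
    # Pins down the b == 0 behavior instead of leaving it unexamined.
    try:
        divide(1, 0)
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("expected ZeroDivisionError")
```

The coverage number is identical to the assert-free version, which is exactly why the percentage alone says nothing about test quality.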

Interview-ready summary: Unit tests isolate and fast-check individual units' logic (usually with mocks); integration tests verify real components' actual contracts together, at higher cost but catching what mocked assumptions can miss. Coverage measures what code executed during tests, which is useful for finding gaps but is not itself evidence that the executed code was meaningfully verified.

Related Resources

coverage.py documentation


The `src` layout

myproject/
├── pyproject.toml
├── README.md
├── src/
│   └── mypackage/
│       ├── __init__.py
│       ├── core.py
│       └── cli.py
└── tests/
    ├── test_core.py
    └── test_cli.py

[project]
name = "mypackage"
version = "0.1.0"    # example value; [project] needs a version unless it is declared dynamic

[project.scripts]
mycli = "mypackage.cli:main"

mypackage lives under src/, not at the project root next to pyproject.toml — this is deliberate, not just tidiness.

Why `src` layout beats the flat layout

# Flat layout (older convention) -- package sits at the project root
myproject/
├── setup.py
├── mypackage/          <- right next to setup.py
│   └── __init__.py
└── tests/

With a flat layout, running tests from the project root can import the uninstalled source directly, because the current directory is often on sys.path, even if the package was never pip install-ed. Tests can then pass locally while the built and installed package is broken (e.g., missing a data file that wasn't included in the package manifest). The src layout guards against this: src/ is not on sys.path by default, so mypackage isn't importable unless it has been installed (a regular install, or pip install -e . for local development), and tests reach it through the normal import mechanism rather than through a working-directory accident. Packaging mistakes such as a missing data file surface when the tests run against a regular, non-editable install, which is what end users actually get.
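One way to see which copy of a package the tests would pick up is to ask the import system where it would load it from. This is a small diagnostic sketch. The helper name import_origin is made up for illustration.

```python
import importlib.util


def import_origin(package_name):
    """Return the file a top-level package would be imported from, or None.

    None means the package is not importable from the current environment
    and working directory.
    """
    spec = importlib.util.find_spec(package_name)
    return None if spec is None else spec.origin
```

In a src-layout project with nothing installed, import_origin("mypackage") run from the project root would be expected to return None. After an install it should point at the installed location (or into src/ for an editable install). In a flat layout it would typically point at the working-directory copy, installed or not.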

Editable installs for local development

pip install -e .        # "editable install" -- installs the package, but pointing at src/

An editable install lets you edit src/mypackage/*.py and immediately see changes reflected without reinstalling, while still going through the proper import mechanism (not a sys.path accident) — the standard way to develop against the src layout locally.

Entry points for CLI tools

[project.scripts]
mycli = "mypackage.cli:main"

After installation, this makes mycli available as a real shell command that calls mypackage.cli.main() — the standard way to ship a command-line tool as part of a package, rather than telling users to run python -m mypackage.cli manually.
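A minimal sketch of what mypackage/cli.py might contain for that entry point to work. The greeting behavior and the name argument are invented for illustration. The only requirement from [project.scripts] is a callable named main in mypackage.cli.

```python
import argparse


def main(argv=None):
    """Entry point referenced as "mypackage.cli:main" (hypothetical behavior)."""
    parser = argparse.ArgumentParser(prog="mycli", description="Say hello.")
    parser.add_argument("name", nargs="?", default="world")
    # argv=None makes argparse read sys.argv[1:], which is what happens when
    # the installed mycli command calls main() with no arguments.
    args = parser.parse_args(argv)
    print(f"Hello, {args.name}!")
    return 0  # the generated wrapper uses the return value as the exit code
```

Accepting an optional argv keeps the function easy to test: main(["Ada"]) can be called directly from a test without touching sys.argv.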

Keeping tests, docs, and config out of the shipped package

tests/          <- not shipped to end users, only needed for development
docs/           <- likewise
pyproject.toml  <- build/dev config, not shipped as importable code

Separating these from src/mypackage/ keeps the actual distributed package (the wheel end users pip install) minimal — containing only what's needed to run the library/application, not the development tooling around it.
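Because a wheel is a zip archive, its contents can be checked directly. The sketch below flags development-only directories that leaked into a built wheel. The function names and the default forbidden list are assumptions for illustration, not part of any packaging tool.

```python
import zipfile


def leaked_dirs(names, forbidden=("tests", "docs")):
    """Return forbidden top-level directories present in a list of archive paths."""
    top_levels = {name.split("/", 1)[0] for name in names}
    return sorted(top_levels & set(forbidden))


def leaked_dirs_in_wheel(wheel, forbidden=("tests", "docs")):
    """Same check for a wheel, given as a path or a binary file-like object."""
    with zipfile.ZipFile(wheel) as archive:
        return leaked_dirs(archive.namelist(), forbidden)
```

An empty list means none of the listed directories ended up at the top level of the wheel.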

Interview-ready summary: The src layout puts the actual package under src/mypackage/, separate from tests/docs/config, specifically to prevent tests from accidentally importing uninstalled source instead of the real installed package — combined with pyproject.toml for metadata and [project.scripts] for CLI entry points, it's the standard modern structure for a distributable Python project.

Related Resources

src layout vs flat layout — Python Packaging Authority


Pre-commit: catch problems before they're even committed

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.6.0
    hooks:
      - id: ruff
      - id: ruff-format
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: check-merge-conflict
      - id: check-added-large-files

pre-commit install     # one-time, wires hooks into .git/hooks/pre-commit
git commit -m "..."      # hooks run automatically; a failing hook blocks the commit

Once installed, every git commit automatically runs the configured checks against the changed files — a formatting violation gets auto-fixed by ruff-format right then (and the commit is blocked until you re-stage the fixed files), and a lint error is reported immediately, long before a reviewer or CI would otherwise see it. This is the fastest possible feedback loop: seconds, on your own machine, before the change even leaves your laptop.
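To make the mechanism concrete: a hook is essentially a program that receives the staged file names as arguments and signals failure with a non-zero exit code. The sketch below is a simplified, report-only imitation of a trailing-whitespace check, not the actual implementation of the hook configured above (which also fixes the files).

```python
import sys


def lines_with_trailing_whitespace(path):
    """Return 1-based numbers of lines in `path` that end in spaces or tabs."""
    offenders = []
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            body = line.rstrip("\n")
            if body != body.rstrip(" \t"):
                offenders.append(number)
    return offenders


def main(argv=None):
    """Check each file named on the command line; return 1 if any line fails."""
    paths = sys.argv[1:] if argv is None else argv
    failed = False
    for path in paths:
        for number in lines_with_trailing_whitespace(path):
            print(f"{path}:{number}: trailing whitespace")
            failed = True
    return 1 if failed else 0  # non-zero exit code is what blocks the commit
```

When run as a script, the return value of main() would be passed to sys.exit() so the exit code reaches the caller.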

CI: the safety net and the home for slower checks

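# .github/workflows/ci.yml (simplified)
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"    # example version
      - run: pip install -e ".[dev]"
      - run: ruff check .
      - run: mypy src/
      - run: pytest --cov=mypackage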

CI re-runs the same fast checks (in case someone used git commit --no-verify to skip hooks, or hasn't installed pre-commit locally at all) and also runs checks too slow or heavyweight for every single commit: the full test suite (which might take minutes), static type checking across the whole codebase, security scanning, and building the actual package to confirm it installs cleanly.

Why both layers, not just one

Aspect	Pre-commit	CI
Runs on	changed files, at commit time	the full PR, on every push
Speed	seconds	can be minutes
Enforced for everyone	only if hooks are installed locally	always (it's a required check gating merge)
Good for	fast, auto-fixable checks (format, lint)	anything, especially slow/expensive checks (full test suite, type checking)

Pre-commit alone isn't sufficient because it's opt-in per developer machine (someone can skip installing it, or bypass it with --no-verify) — CI is the actual enforced gate. Pre-commit exists purely to shorten the feedback loop so most problems never reach CI (and therefore never cost a reviewer's attention or a slow CI run) in the first place.

A well-configured pipeline layers both

Pre-commit (local, instant): formatting, basic linting, trailing whitespace, merge-conflict markers, large-file checks.
CI (per-PR, required to merge): everything pre-commit checks again (as a backstop) + the full test suite + type checking + build verification + (often) security/dependency scanning.

Interview-ready summary: Pre-commit hooks give instant, local, often-auto-fixing feedback on fast checks (formatting, linting) before a commit is even made; CI is the actually-enforced gate that re-runs those checks as a safety net and additionally runs everything too slow for every commit (full test suite, type checking, security scans) before a PR can merge. Neither replaces the other — pre-commit shortens the feedback loop, CI guarantees the standard is actually met.

Related Resources

pre-commit documentation

Testing, Tooling & Packaging

How do you write and organize tests with pytest (fixtures, parametrize, marks)?

Plain functions, plain assert

Fixtures: reusable, composable setup/teardown

parametrize: one test, many inputs

Marks: controlling test execution

Organizing a test suite

Related Resources

How do you mock dependencies in tests using `unittest.mock`?

Basic Mock: recording calls, configuring return values

patch(): swapping out a real dependency temporarily

Context-manager form (for patching only part of a test)

Mocking a raised exception

spec/autospec: catching typos in mocked interfaces

Related Resources

What's the difference between `venv`, `virtualenv`, `pipenv`, `poetry`, and `conda`?

The core problem all of these solve: dependency isolation

venv: the built-in baseline

virtualenv: the third-party predecessor

pipenv: environment + dependency management combined

poetry: dependency management + packaging + publishing

conda: a different scope entirely

Choosing one

Related Resources

How do type hints and mypy/pyright improve code quality, and what are their limitations?

What type hints look like, and what they don't do at runtime

Catching errors before running the code

Real benefits beyond bug-catching

The limitations

Related Resources

What's the role of `pyproject.toml`, and how has Python packaging evolved?

The old way: setup.py as executable code

The modern way: declarative pyproject.toml

Why this matters: one format, many tools

The evolution, in short

Building and publishing today

Related Resources

How do linters and formatters like ruff, flake8, and black fit into a Python workflow?

Formatters: no more style debates

Linters: catching likely bugs and code smells

ruff: the modern consolidation

Fitting into a workflow: pre-commit and CI

Static analysis vs type checking: complementary, not overlapping

Related Resources

How does Python's `logging` module work, and how should you configure it in an application?

The core building blocks

Why getLogger(__name__) per module, not the root logger

Centralized configuration at the entry point

Why lazy %-style formatting matters here specifically

Never use print() for anything beyond a quick throwaway script

Related Resources

What's the difference between unit tests, integration tests, and how do you measure coverage?

Unit tests: isolated, fast, deterministic

Integration tests: real components, real contracts

The testing pyramid: why the mix matters

Measuring coverage

Why coverage percentage isn't a quality proxy

Related Resources

How do you structure a Python project for a distributable package?

The src layout

Why src layout beats the flat layout

Editable installs for local development

Entry points for CLI tools

Keeping tests, docs, and config out of the shipped package

Related Resources

What role do pre-commit hooks and CI checks play in a Python project?

Pre-commit: catch problems before they're even committed

CI: the safety net and the home for slower checks

Why both layers, not just one

A well-configured pipeline layers both

Related Resources
