Python Fundamentals & the Data Model

Everything is an object

In Python, 1, "hello", def f(): pass, a class, a module, and even type itself are all objects: each has an identity (id(x)), a type (type(x)), and a set of attributes. There is no distinction between "primitive types" and "reference types" the way there is in Java or C#.

def greet():
    return "hi"

print(type(greet))          # <class 'function'>
print(greet.__name__)       # 'greet'
greet.calls = 0             # you can attach arbitrary attributes to a function
greet.calls += 1

print(type(int))            # <class 'type'>
print(type(type))           # <class 'type'>  -- type is an instance of itself

Why it matters

1. Functions are first-class values. You can store them in variables, put them in lists, pass them as arguments, and return them from other functions — this is what makes decorators, callbacks, and higher-order functions like map/sorted(key=...) work.

2. Classes and types are runtime objects. class Foo: ... executes a statement that creates a type object and binds it to Foo. That's why you can build classes dynamically with type(name, bases, namespace), and why metaclasses (which customize how type builds a class) are possible.

3. Introspection is cheap and universal. Every object has a __class__ and works with type() and dir(), and most also expose a __dict__ (instances of built-ins such as int, and classes that define __slots__, do not). Generic tooling (debuggers, serializers, ORMs, test frameworks) can therefore inspect any object the same way, whether it's a number, a function, or a user-defined class.

4. It underlies duck typing. Since behavior is just "does this object respond to this attribute/method," Python doesn't need a common base type to treat unrelated objects polymorphically — it just checks capabilities at the point of use.
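
The points above can be seen in a few lines. This is only an illustrative sketch; the names shout, whisper, Point, Duck, and Robot are hypothetical and chosen for the example.

```python
# First-class functions: store them in a dict and pick one at runtime.
def shout(text):
    return text.upper() + "!"

def whisper(text):
    return text.lower() + "..."

styles = {"loud": shout, "quiet": whisper}
loud_result = styles["loud"]("hello")        # 'HELLO!'

# Classes are runtime objects: build one with type(name, bases, namespace).
def describe(self):
    return f"Point({self.x}, {self.y})"

Point = type("Point", (object,), {"x": 0, "y": 0, "describe": describe})
p = Point()
p.x = 3                                      # instance attribute shadows the class default
point_text = p.describe()                    # 'Point(3, 0)'

# Duck typing: no shared base class, only a shared method name.
class Duck:
    def speak(self):
        return "Quack"

class Robot:
    def speak(self):
        return "Beep"

sounds = [thing.speak() for thing in (Duck(), Robot())]   # ['Quack', 'Beep']
```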

Interview-ready summary: Python has a single, uniform object model — numbers, functions, classes, and modules are all first-class objects with identity, type, and attributes. That uniformity is why closures, decorators, metaclasses, and duck typing all work through the same mechanism: attribute access and the type system, not special-cased primitive rules.

== vs is

== invokes __eq__ and answers "are these values equal?", while is answers "are these literally the same object?" (identical id()).

a = [1, 2, 3]
b = [1, 2, 3]
c = a

a == b   # True  -- same contents
a is b   # False -- two distinct list objects
a is c   # True  -- c is a name bound to the same object as a

Why a == b can be True while a is b is False

Two separate lists (or dicts, or custom objects with a custom __eq__) can be equal in value without being the same object. The default __eq__ inherited from object actually falls back to identity, but built-in containers and most user classes override it to compare contents.
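
A minimal sketch of that difference, using two hypothetical classes (PlainPoint and ValuePoint are made up for illustration): one inherits the default identity-based __eq__, the other overrides it to compare contents.

```python
class PlainPoint:
    """No __eq__ defined, so == falls back to identity."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

class ValuePoint:
    """Defines __eq__, so == compares coordinates."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, ValuePoint):
            return NotImplemented   # let Python try the other operand
        return (self.x, self.y) == (other.x, other.y)

plain_equal = PlainPoint(1, 2) == PlainPoint(1, 2)   # False: different objects
value_equal = ValuePoint(1, 2) == ValuePoint(1, 2)   # True: same coordinates
value_same = ValuePoint(1, 2) is ValuePoint(1, 2)    # False: still two objects
```

One side effect worth knowing: a class that defines __eq__ without also defining __hash__ becomes unhashable, so its instances cannot be used as dict keys or set members.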

When to use is

  • None checks: always x is None, never x == None. None is a singleton, and is avoids accidentally invoking a custom __eq__ that might behave unexpectedly.
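  • Singletons/sentinels: x is True when you must distinguish the bool True from other truthy values (a plain if x: is usually enough otherwise), or a private sentinel object (_MISSING = object()) used to distinguish "not provided" from "explicitly None".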
  • Identity-sensitive logic: e.g., checking whether a cache returned the exact cached instance rather than an equal copy.
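
The sentinel pattern from the list above might look like the sketch below. The function get_setting and the config dict are hypothetical; only the _MISSING = object() idiom comes from the text.

```python
_MISSING = object()   # private sentinel; unique, so only `is` can match it

def get_setting(settings, key, default=_MISSING):
    """Return settings[key]; raise KeyError if absent and no default was given."""
    if key in settings:
        return settings[key]
    if default is _MISSING:          # caller did not pass a default at all
        raise KeyError(key)
    return default                   # caller passed one, possibly None

config = {"timeout": None}
explicit_none = get_setting(config, "timeout")          # None (the stored value)
fallback_none = get_setting(config, "retries", None)    # None (the explicit default)
```

Using None itself as the "not provided" marker would make the last two cases indistinguishable from a missing default, which is the reason for the separate sentinel.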

The small-integer/string trap

CPython caches small integers (-5 to 256) and some string literals, so is can appear to work for equality by coincidence:

x = 256
y = 256
x is y   # True (cached)

x = 257
y = 257
x is y   # typically False when typed line by line at the REPL; can be True inside
         # a single script or function, where the compiler reuses the constant

This is a CPython implementation detail, not part of the language spec — relying on it for anything beyond None/True/False is a bug waiting to happen.

Interview-ready summary: == compares values via __eq__; is compares identity via id(). Always use is for None/singleton checks and == for everything else — never rely on integer/string caching as a substitute for ==.

The mutable default argument trap

def add_item(item, bucket=[]):
    bucket.append(item)
    return bucket

add_item("a")   # ['a']
add_item("b")   # ['a', 'b']  -- surprise! same list as before

The default [] is created once, at function-definition time, and stored on the function object (add_item.__defaults__). Every call that omits bucket reuses that exact same list, so mutations accumulate across unrelated calls.
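
The shared list can be observed directly on the function object. The sketch below repeats add_item so it runs on its own; the variable names are illustrative.

```python
def add_item(item, bucket=[]):
    bucket.append(item)
    return bucket

before = add_item.__defaults__[0] == []     # True: the default starts empty
first = add_item("a")
second = add_item("b")

same_list = first is second                             # True: one list, returned twice
stored_on_function = first is add_item.__defaults__[0]  # True: it lives on the function
```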

Why immutable defaults don't have this problem

def greet(name, suffix="!"):
    return name + suffix

"!" is immutable — nothing inside greet can mutate the string object itself, so there's no shared, mutable state to leak. The bug is specific to mutable default values (list, dict, set, or any mutable custom object).

The fix

def add_item(item, bucket=None):
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

Now a fresh list is created on every call that doesn't supply bucket, while callers who do want to accumulate into a shared list can still pass one explicitly.

The general lesson: mutable vs immutable

  • Immutable (int, float, str, tuple, frozenset, bytes): any "modification" creates a new object; the original is never changed. Safe to share across function calls and default arguments, and usable as dict keys (a tuple only if every element is itself hashable).
  • Mutable (list, dict, set, most custom classes): the object can be changed in place; sharing a reference means all holders see the mutation. Never use a mutable object as a default argument, and be careful when a mutable object is a class attribute (shared across all instances) instead of an instance attribute (set in __init__).

class Bad:
    items = []          # class attribute — shared by every instance!
    def __init__(self):
        pass

class Good:
    def __init__(self):
        self.items = []  # instance attribute — one per object
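
A quick check of the two classes, restated in shortened form so the snippet runs on its own:

```python
class Bad:
    items = []               # one list, owned by the class

class Good:
    def __init__(self):
        self.items = []      # a new list for each instance

a, b = Bad(), Bad()
a.items.append("x")
bad_shared = b.items         # ['x']: b sees a's append

c, d = Good(), Good()
c.items.append("x")
good_separate = d.items      # []: d has its own list
```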

Interview-ready summary: Default arguments are evaluated once at def-time and stored on the function object, so a mutable default is shared across every call that uses it. Always default mutable arguments to None and construct the real object inside the function body — and apply the same caution to mutable class attributes.

The four scopes, in lookup order

x = "global"

def outer():
    x = "enclosing"
    def inner():
        x = "local"
        print(x)          # 'local'   -- found in Local scope
    inner()
    print(x)               # 'enclosing'

outer()
print(x)                    # 'global'
print(len)                  # built-in, found in Built-in scope

When Python looks up a bare name, it checks Local, then Enclosing function scopes (innermost to outermost), then Global (module), then Built-in. This order is usually abbreviated LEGB, and the first scope where the name is bound wins.

The gotcha: assignment makes a name local for the whole function

Python decides whether a name is local to a function at compile time, by scanning the function body for anything that binds the name (assignments, augmented assignments, for-loop targets, imports, and so on). It does not check whether the assignment has "already happened" at runtime.

count = 0

def increment():
    print(count)     # UnboundLocalError!
    count = count + 1

Because count = ... appears anywhere in increment, Python treats count as local for the entire function body — including the print(count) line before the assignment. It never falls back to the global count.
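
In CPython the compile-time decision can be inspected before the function is ever called. The sketch below assumes CPython's code-object attribute co_varnames, which lists a function's local variable names.

```python
count = 0

def increment():
    print(count)        # count is local here because of the assignment below
    count = count + 1

# The compiler has already recorded `count` as a local variable of increment.
is_local = "count" in increment.__code__.co_varnames   # True

try:
    increment()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__                     # 'UnboundLocalError'
```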

Fixing it: global and nonlocal

count = 0

def increment():
    global count
    count += 1          # now refers to the module-level count

def make_counter():
    total = 0
    def add(n):
        nonlocal total   # refers to make_counter's `total`, not a new local
        total += n
        return total
    return add

  • global makes a name refer to the module-level binding, for both reads and writes.
  • nonlocal makes a name refer to the binding in the nearest enclosing function scope that defines it (never the global scope); this is what makes stateful closures possible.

Why this matters for closures

The "E" in LEGB is exactly what lets a nested function remember variables from its enclosing function after that function has returned — the classic closure pattern (make_counter above). Without nonlocal, a nested function can read an enclosing variable freely, but assigning to it creates a new local instead of updating the enclosing one.
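
The sketch below illustrates both halves of that paragraph: each closure keeps its own captured variable, and an assignment without nonlocal leaves the enclosing variable untouched. make_broken_counter is a hypothetical counter-example, and __closure__ / cell_contents are the attributes CPython uses to expose captured variables.

```python
def make_counter():
    total = 0
    def add(n):
        nonlocal total
        total += n
        return total
    return add

counter_a = make_counter()
counter_b = make_counter()
counter_a(5)
a_total = counter_a(2)      # 7: counter_a keeps its own `total`
b_total = counter_b(1)      # 1: counter_b has a separate `total`

# The captured variable lives in a cell attached to the function object.
captured = counter_a.__closure__[0].cell_contents   # 7

def make_broken_counter():
    total = 0
    def add(n):
        total = n           # no nonlocal: this binds a new local `total`
        return total
    add(10)
    return total            # still 0; the enclosing variable was never touched

broken_total = make_broken_counter()   # 0
```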

Interview-ready summary: Name resolution follows Local → Enclosing → Global → Built-in, and whether a name is "local" is decided statically by scanning for assignments in the function body — which is why referencing a name before assigning it in the same function raises UnboundLocalError instead of falling back to an outer scope. global and nonlocal are the explicit escape hatches for writing to an outer scope.

Collecting variable arguments

def summarize(*args, **kwargs):
    print(args)     # tuple of positional args
    print(kwargs)   # dict of keyword args

summarize(1, 2, 3, name="Ada", active=True)
# (1, 2, 3)
# {'name': 'Ada', 'active': True}

*args gathers any positional arguments beyond the named parameters into a tuple; **kwargs gathers any keyword arguments not matched by name into a dict.

Forwarding arguments (the most common real use)

def logged(func):
    def wrapper(*args, **kwargs):
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)   # forward everything, unchanged
    return wrapper

This is why almost every decorator's wrapper signature is (*args, **kwargs) — it makes the wrapper work for any wrapped function signature without needing to know it in advance.
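
One caveat with a bare wrapper is that it hides the wrapped function's name and docstring. The standard library's functools.wraps is the usual way to copy that metadata across. The sketch below compares the two; logged_plain, logged_wrapped, area, and volume are hypothetical names.

```python
import functools

def logged_plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

def logged_wrapped(func):
    @functools.wraps(func)          # copy __name__, __doc__, etc. onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@logged_plain
def area(width, height=1):
    """Return width * height."""
    return width * height

@logged_wrapped
def volume(width, height=1, *, depth=1):
    """Return width * height * depth."""
    return width * height * depth

plain_name = area.__name__          # 'wrapper'
wrapped_name = volume.__name__      # 'volume'
result = volume(2, 3, depth=4)      # 24: positional and keyword args both forwarded
```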

Combining with named and keyword-only parameters

def request(url, *args, timeout=30, **kwargs):
    ...

Parameter order must be: positional params → *args → keyword-only params (anything after *args must be passed by name) → **kwargs. This lets you mix a required positional API with an "escape hatch" for extra options.
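
To see where each argument lands, the sketch below gives request a body that simply returns what it received. The body and the placeholder URL are assumptions for illustration; the signature is the one shown above.

```python
def request(url, *args, timeout=30, **kwargs):
    # Return what each parameter received so the binding is visible.
    return {"url": url, "args": args, "timeout": timeout, "kwargs": kwargs}

bound = request("https://example.com", "a", "b", timeout=5, retries=2)
# {'url': 'https://example.com', 'args': ('a', 'b'),
#  'timeout': 5, 'kwargs': {'retries': 2}}

# A positional 5 does not reach timeout; it is collected by *args instead.
positional = request("https://example.com", 5)
```

The second call is the practical consequence of keyword-only parameters: timeout keeps its default of 30 unless the caller names it.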

Unpacking at the call site

The same */** syntax unpacks a sequence or mapping into a call:

values = (1, 2, 3)
options = {"name": "Ada"}
summarize(*values, **options)   # same as summarize(1, 2, 3, name="Ada")
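
Several unpackings can be combined in a single call, and a keyword supplied twice is rejected at call time. A small sketch, with summarize changed to return its arguments so the result can be inspected (the extra names are illustrative):

```python
def summarize(*args, **kwargs):
    return args, kwargs

values = [1, 2]
more = (3,)
options = {"name": "Ada"}

# Any iterable works with *, and plain keywords can follow a ** unpacking.
args, kwargs = summarize(*values, *more, **options, active=True)
# args == (1, 2, 3); kwargs == {'name': 'Ada', 'active': True}

# Passing the same keyword twice raises TypeError.
try:
    summarize(**options, name="Grace")
    duplicate_error = None
except TypeError as exc:
    duplicate_error = type(exc).__name__   # 'TypeError'
```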

Interview-ready summary: *args/**kwargs are Python's mechanism for variadic functions — *args as a tuple of extra positional arguments, **kwargs as a dict of extra keyword arguments. They're essential for writing generic wrappers (decorators, proxies) that forward calls without caring about the wrapped function's exact signature.