How does Python's import system work (modules, packages, `sys.path`)?
Quick Answer
`import x` first checks `sys.modules` (the cache of already-imported modules). On a miss, **finders** locate the module, by default by searching the directories in `sys.path` (the script's directory, `PYTHONPATH` entries, the standard library, and site-packages), and a **loader** executes it. A directory with an `__init__.py` is a regular package; since Python 3.3 a directory without one can still be imported as a namespace package. Each module is executed once and cached in `sys.modules`, so re-importing just returns the cached module object.
Detailed Answer
The lookup sequence
When you run `import foo`, Python:
- Checks `sys.modules["foo"]`. If the module is already imported, the cached module object is returned immediately and its top-level code does not run again.
- Otherwise, asks the finders to locate the module. For ordinary modules this means walking `sys.path` (a list of directories: the script's own directory, `PYTHONPATH` entries, and the standard library and site-packages paths) until a finder returns a spec describing where the module lives and how to load it.
- Creates a new module object from the spec, stores it in `sys.modules`, and then has the loader execute the module's code in that module's namespace. Registering the module before running it is why circular imports can observe a partially initialized module.
import sys
print(sys.path) # search directories, in order
print(sys.modules.keys()) # every module imported so far, cached
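The caching and spec lookup described above can be observed directly. This is a minimal sketch using only the standard library; `json` and `colorsys` are arbitrary stdlib modules chosen for illustration.

```python
import importlib.util
import sys

# The first import executes the module; the second is only a cache lookup.
import json
import json as json_again

same_object = json is json_again        # both names refer to one module object
cached = sys.modules["json"] is json    # and that object is the sys.modules entry

# find_spec asks the finders where a module lives and which loader would run it.
spec = importlib.util.find_spec("colorsys")

print(same_object, cached)
print(spec.name, type(spec.loader).__name__, spec.origin)
```

The spec's `origin` is the file the finder located, and `loader` is the object that would execute it.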
Packages vs modules
- A module is a single `.py` file.
- A package is a directory containing modules (and possibly subpackages). Historically it required an `__init__.py` (even if empty) to mark the directory as a package. Since PEP 420 (Python 3.3), directories without `__init__.py` can act as namespace packages and still be importable, though most real projects still use `__init__.py` for explicit control over what a package exports.
myapp/
    __init__.py
    models.py
    utils/
        __init__.py
        strings.py
from myapp.utils.strings import slugify
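To try this layout without touching a real project, the sketch below builds it in a temporary directory and imports from it. The file contents (the body of `slugify`, the `User` class) and the `plugins_ns` directory are invented for illustration; only the directory structure comes from the example above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build the layout in a throwaway directory (file contents are hypothetical).
root = Path(tempfile.mkdtemp())
utils = root / "myapp" / "utils"
utils.mkdir(parents=True)
(root / "myapp" / "__init__.py").write_text("")
(root / "myapp" / "models.py").write_text("class User: ...\n")
(utils / "__init__.py").write_text("")
(utils / "strings.py").write_text(
    "def slugify(text):\n"
    "    return '-'.join(text.lower().split())\n"
)

# A directory with no __init__.py: importable as a namespace package.
(root / "plugins_ns").mkdir()
(root / "plugins_ns" / "hello.py").write_text("GREETING = 'hi'\n")

sys.path.insert(0, str(root))   # make the temp directory searchable
importlib.invalidate_caches()   # the files were created after startup

import myapp
import plugins_ns
from myapp.utils.strings import slugify
from plugins_ns.hello import GREETING

print(slugify("Hello Import System"))        # hello-import-system
print(myapp.__file__)                        # path ending in myapp/__init__.py
print(getattr(plugins_ns, "__file__", None)) # None: no __init__.py behind it
```

The regular package has an `__init__.py` backing it, while the namespace package has no single file, which is one quick way to tell the two apart at runtime.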
Absolute vs relative imports
# absolute — resolved from sys.path, preferred for clarity
from myapp.utils import strings
# relative — resolved from the current package's position
from .strings import slugify # same package
from ..models import User # parent package
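Relative imports only work inside a package. A module run directly as a
script (`python strings.py`) has no package context, which is a common
source of `ImportError: attempted relative import with no known parent package`.
Running the same file as a module from the project root
(`python -m myapp.utils.strings`) gives it that context.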
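The difference between running a file as a script and as a module can be reproduced with a small experiment. This sketch assumes a hypothetical package `pkg` with `main.py` and `helper.py`, created in a temporary directory and run in child interpreters.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical package: pkg/main.py does a relative import of pkg/helper.py.
root = Path(tempfile.mkdtemp())
pkg = root / "pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helper.py").write_text("VALUE = 42\n")
(pkg / "main.py").write_text("from .helper import VALUE\nprint(VALUE)\n")

# Run the file directly: it has no parent package, so the import fails.
as_script = subprocess.run(
    [sys.executable, str(pkg / "main.py")],
    capture_output=True, text=True,
)

# Run it as a module from the directory containing pkg/: the import works.
as_module = subprocess.run(
    [sys.executable, "-m", "pkg.main"],
    cwd=root, capture_output=True, text=True,
)

print(as_script.returncode, as_script.stderr.strip().splitlines()[-1])
print(as_module.returncode, as_module.stdout.strip())
```

The file on disk is identical in both runs; only the way it is launched changes whether Python knows which package it belongs to.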
Circular imports
If `a.py` imports `b.py` and `b.py` imports `a.py`, the module that starts
running first is still mid-execution when the other one imports it back,
so the second module sees a partially initialized version of the first
(only the names defined before the circular import line exist). Common
fixes: move the import inside the function that needs it (deferred import),
restructure shared code into a third module, or import the module object
itself (`import a`) and look up its attributes at call time instead of
pulling names out of it at import time.
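A small reproduction may make the partial-initialization behavior concrete. The module names `circ_a`, `circ_b`, `bad_a`, and `bad_b` are hypothetical and written to a temporary directory: the first pair uses `import module` and survives the cycle, the second pair uses `from module import name` and fails.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Pair 1: importing the module object tolerates the cycle.
(root / "circ_a.py").write_text(
    "import circ_b\n"              # circ_b starts running before FLAG exists
    "FLAG = 'a is ready'\n"
)
(root / "circ_b.py").write_text(
    "import circ_a\n"              # gets the half-built circ_a from sys.modules
    "SEEN_AT_IMPORT = hasattr(circ_a, 'FLAG')\n"
    "def read_flag():\n"
    "    return circ_a.FLAG\n"     # looked up at call time, after both finished
)

# Pair 2: pulling a name out at import time fails.
(root / "bad_a.py").write_text("from bad_b import helper\nFLAG = 1\n")
(root / "bad_b.py").write_text(
    "from bad_a import FLAG\n"     # FLAG is not defined yet in bad_a
    "def helper():\n"
    "    return FLAG\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import circ_a
import circ_b

print(circ_b.SEEN_AT_IMPORT)   # False: circ_a had not reached FLAG yet
print(circ_b.read_flag())      # a is ready

error_message = None
try:
    import bad_a
except ImportError as exc:
    error_message = str(exc)
print(error_message)
```

In the first pair the attribute lookup is deferred until `read_flag()` is called, by which point both modules have finished executing, so the cycle is harmless.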
Interview-ready summary: Imports are cached in `sys.modules` and only
execute a module's top-level code once; the search path is `sys.path`,
walked by finders/loaders. Packages are directories of modules (optionally
marked by `__init__.py`); relative imports resolve against the current
package, and circular imports break when one module is only
partially initialized by the time the other needs it.