ML Project Package Structure
Installable Python package with proper packaging, CLI tools, and tests. Ideal for reusable ML libraries and team-shared code.
Project Directory
ml-package/
    src/                      # src-layout for clean imports
        ml_package/           # Your package name
            __init__.py       # Version, public API
            data/
                __init__.py
                loader.py
                transforms.py
            models/
                __init__.py
                base.py       # Abstract base class
                classifier.py
            training/
                __init__.py
                trainer.py
                metrics.py
            cli.py            # Click/Typer commands
            config.py         # Pydantic settings
    tests/
        conftest.py           # Fixtures, test data
        test_data.py
        test_models.py
    notebooks/                # Examples and demos only
        demo.ipynb
    pyproject.toml            # Modern Python packaging
    README.md
    .gitignore
Why This Structure?
This template is structured as a proper Python package using the src-layout. An editable install (pip install -e .) makes your code importable anywhere: in notebooks, scripts, or other projects. The CLI, built with Click or Typer, lets you run training with ml-package train --config config.yaml.
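To make the ml-package train --config config.yaml command concrete, here is a minimal sketch of what cli.py could look like. It uses the standard library's argparse as a stand-in for Click or Typer so that it runs without extra dependencies; the build_parser and main names are assumptions, and the training step is only a placeholder.

```python
# src/ml_package/cli.py (illustrative sketch using argparse instead of Click/Typer)
import argparse
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single `train` subcommand."""
    parser = argparse.ArgumentParser(prog="ml-package")
    subparsers = parser.add_subparsers(dest="command", required=True)
    train = subparsers.add_parser("train", help="Run a training job")
    train.add_argument("--config", required=True, help="Path to a YAML config file")
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Entry point; returns a process exit code."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        # Placeholder: a real implementation would load the config
        # and hand it to the trainer in training/trainer.py.
        print(f"Training with config: {args.config}")
    return 0

# Assumed wiring in pyproject.toml so the command exists after install:
#   [project.scripts]
#   ml-package = "ml_package.cli:main"
```

With Click or Typer the shape would be similar: one function per command, registered as a console script in pyproject.toml.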
Key Directories
- src/ml_package/: Package code, renamed to your project
- models/: Model classes inheriting from BaseModel
- training/: Training loops, metrics, callbacks
- cli.py: Entry points for pyproject.toml scripts
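The models/ entry above can be sketched as an abstract base class plus one concrete subclass. This is a minimal illustration only: the fit/predict interface and the toy MajorityClassifier are assumptions, since the template does not specify what BaseModel requires.

```python
# Illustrative sketch of models/base.py and models/classifier.py in one block.
from abc import ABC, abstractmethod
from collections import Counter
from typing import List, Optional, Sequence


class BaseModel(ABC):
    """Interface that every model is assumed to implement (hypothetical)."""

    @abstractmethod
    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "BaseModel":
        """Train on features X and labels y, then return self."""

    @abstractmethod
    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        """Return one predicted label per row of X."""


class MajorityClassifier(BaseModel):
    """Toy classifier that always predicts the most common training label."""

    def __init__(self) -> None:
        self.majority_: Optional[int] = None

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[int]) -> "MajorityClassifier":
        # most_common(1) returns a list with one (label, count) pair.
        self.majority_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[int]:
        if self.majority_ is None:
            raise RuntimeError("call fit() before predict()")
        return [self.majority_ for _ in X]
```

Because the base class is abstract, instantiating BaseModel directly raises a TypeError, so a subclass that forgets to implement fit or predict fails early rather than during training.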
Getting Started
- python -m venv venv && source venv/bin/activate (create and activate a virtual environment)
- pip install -e '.[dev]' (editable install with dev deps)
- pytest (run test suite)
- ml-package --help (CLI available after install)
When To Use This
- Building reusable ML components
- Sharing models across multiple projects
- Creating internal ML tools for your team
- Publishing to PyPI or private registry
- Projects requiring proper versioning
When To Upgrade
- Need CI/CD with model registry (MLflow, W&B)
- Multiple model versions in production
- Data versioning requirements (DVC)
- Kubernetes/cloud deployment pipelines
Trade-offs
- More boilerplate: Setup overhead vs. notebook simplicity
- Learning curve: pyproject.toml, src-layout concepts
- No MLOps built-in: Add DVC, MLflow separately if needed