FolderStructure.dev

ML Project Package Structure

Installable Python package with proper packaging, CLI tools, and tests. Ideal for reusable ML libraries and team-shared code.

#ml #python #package #pip #library

Project Directory

ml-package/
  src/                      # src-layout for clean imports
    ml_package/             # Your package name
      __init__.py           # Version, public API
      data/
        __init__.py
        loader.py
        transforms.py
      models/
        __init__.py
        base.py             # Abstract base class
        classifier.py
      training/
        __init__.py
        trainer.py
        metrics.py
      cli.py                # Click/Typer commands
      config.py             # Pydantic settings
  tests/
    conftest.py             # Fixtures, test data
    test_data.py
    test_models.py
  notebooks/                # Examples and demos only
    demo.ipynb
  pyproject.toml            # Modern Python packaging
  README.md
  .gitignore

Why This Structure?

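The project is structured as an installable Python package using the src-layout. Running pip install -e . installs it in editable mode, so the code is importable from notebooks, scripts, or other projects that share the same environment, and source changes take effect without reinstalling. A CLI built with Click or Typer lets you run training with ml-package train --config config.yaml.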
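
To make the ml-package train --config config.yaml command concrete, here is a minimal sketch of what cli.py might look like. It uses the standard library's argparse as a stand-in for Click or Typer so that it runs without extra dependencies. The function names build_parser and main are assumptions for illustration, and the training step is only a placeholder.

```python
import argparse
from pathlib import Path
from typing import Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single 'train' subcommand (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="ml-package")
    subparsers = parser.add_subparsers(dest="command", required=True)

    train = subparsers.add_parser("train", help="Run a training job")
    train.add_argument(
        "--config",
        type=Path,
        required=True,
        help="Path to a config file, e.g. config.yaml",
    )
    return parser


def main(argv: Optional[Sequence[str]] = None) -> int:
    """Parse arguments and dispatch; returns a process exit code."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        # Placeholder: a real CLI would load the config and call the trainer here.
        print(f"Training with config: {args.config}")
        return 0
    return 1
```

For the command to be available after install, pyproject.toml would need a scripts entry pointing at this function, for example ml-package = "ml_package.cli:main" under [project.scripts].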

Key Directories

  • src/ml_package/: Package code; rename it to match your project
  • models/: Model classes inheriting from the BaseModel abstract class in base.py
  • training/: Training loops, metrics, callbacks
  • cli.py: Entry points referenced by the scripts table in pyproject.toml
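
The relationship between models/base.py and models/classifier.py can be sketched as follows. This is an illustration under assumptions, not the template's actual code: the fit/predict interface and the toy MajorityClassifier are hypothetical, chosen only to show how an abstract base class constrains its subclasses.

```python
from abc import ABC, abstractmethod
from collections import Counter
from typing import List, Optional, Sequence


class BaseModel(ABC):
    """Interface that every model in models/ is assumed to implement."""

    @abstractmethod
    def fit(self, features: Sequence[Sequence[float]], labels: Sequence[int]) -> "BaseModel":
        """Train on the given data and return self."""

    @abstractmethod
    def predict(self, features: Sequence[Sequence[float]]) -> List[int]:
        """Return one predicted label per input row."""


class MajorityClassifier(BaseModel):
    """Toy classifier: always predicts the most frequent training label."""

    def __init__(self) -> None:
        self.majority_label: Optional[int] = None

    def fit(self, features: Sequence[Sequence[float]], labels: Sequence[int]) -> "MajorityClassifier":
        # Features are ignored; only the label distribution matters here.
        self.majority_label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, features: Sequence[Sequence[float]]) -> List[int]:
        if self.majority_label is None:
            raise RuntimeError("Call fit() before predict().")
        return [self.majority_label for _ in features]
```

Because fit and predict are abstract, instantiating BaseModel directly raises TypeError, which catches incomplete subclasses early. Note that Pydantic also exports a class named BaseModel, so a project using Pydantic in config.py may want to import one of the two under an alias.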

Getting Started

  1. python -m venv venv && source venv/bin/activate
  2. pip install -e '.[dev]' (editable install plus the dev extra, which must be defined in pyproject.toml)
  3. pytest (run test suite)
  4. ml-package --help (CLI available after install)

When To Use This

  • Building reusable ML components
  • Sharing models across multiple projects
  • Creating internal ML tools for your team
  • Publishing to PyPI or private registry
  • Projects requiring proper versioning

When To Upgrade

  • Need CI/CD with model registry (MLflow, W&B)
  • Multiple model versions in production
  • Data versioning requirements (DVC)
  • Kubernetes/cloud deployment pipelines

Trade-offs

  • More boilerplate: Setup overhead vs. notebook simplicity
  • Learning curve: pyproject.toml, src-layout concepts
  • No MLOps built-in: Add DVC, MLflow separately if needed