FolderStructure.dev

Kaggle Competition Project Structure

Competition-optimized structure with experiment tracking, ensembles, and submission workflow.

#ml #python #kaggle #competition #data-science

Project Directory

competition/
├── notebooks/                   # Exploration and experiments
│   ├── eda.ipynb                # Exploratory data analysis
│   ├── baseline.ipynb           # First submission
│   └── experiments/
│       └── exp_001_xgb.ipynb
├── src/                         # Reusable code
│   ├── __init__.py
│   ├── data.py                  # Data loading and CV splits
│   ├── features.py              # Feature engineering
│   ├── models.py                # Model definitions
│   ├── train.py                 # Training loop
│   ├── inference.py             # Test predictions
│   └── ensemble.py              # Ensemble methods
├── input/                       # Competition data (gitignored)
│   ├── train.csv
│   ├── test.csv
│   └── sample_submission.csv
├── output/                      # Predictions and submissions
│   ├── models/
│   ├── oof/                     # Out-of-fold predictions
│   └── submissions/
├── configs/                     # Experiment configs
│   └── exp_001.yaml
├── requirements.txt
├── .gitignore
└── README.md                    # Competition notes
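
If you want to create this layout from scratch, a small script can do it. The following is only a sketch: the `scaffold` helper and the `DIRECTORIES` list are hypothetical names chosen for illustration, and the single `.gitignore` rule reflects only the note that input/ is gitignored.

```python
from pathlib import Path

# Directories taken from the layout above; individual files are left to you.
DIRECTORIES = [
    "notebooks/experiments",
    "src",
    "input",
    "output/models",
    "output/oof",
    "output/submissions",
    "configs",
]


def scaffold(root: str = "competition") -> Path:
    """Create the directory skeleton under `root` (hypothetical helper)."""
    base = Path(root)
    for rel in DIRECTORIES:
        (base / rel).mkdir(parents=True, exist_ok=True)

    # src/ is meant to be importable, so give it an __init__.py.
    (base / "src" / "__init__.py").touch()

    # Keep competition data out of version control (assumed ignore rule).
    gitignore = base / ".gitignore"
    if not gitignore.exists():
        gitignore.write_text("input/\n")
    return base
```

Calling `scaffold()` from an empty working directory would create `competition/` with the folders shown; running it again is harmless because existing directories and an existing `.gitignore` are left alone.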

Why This Structure?

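Optimized for the competition workflow: fast iteration, out-of-fold (OOF) predictions for stacking, and organized submissions. The input/ folder mirrors the input/ directory that Kaggle notebooks read data from. Experiments are numbered (exp_001, exp_002, and so on) so that a notebook, its config, and its submission can be matched by ID. ensemble.py combines the predictions of several models.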
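
To make the OOF idea concrete: each training row gets a prediction from a model that never saw that row during fitting, so the predictions can safely be used as features for a stacking model. The sketch below uses only the standard library and a toy `mean_model` in place of a real learner; the function names and the interleaved fold assignment are assumptions for illustration, not something prescribed by this structure.

```python
from typing import Callable, List, Sequence


def kfold_indices(n_rows: int, n_folds: int) -> List[List[int]]:
    """Assign rows to folds in an interleaved pattern (row i goes to fold i % n_folds)."""
    return [list(range(fold, n_rows, n_folds)) for fold in range(n_folds)]


def oof_predictions(
    X: Sequence,
    y: Sequence[float],
    fit_predict: Callable[[list, list, list], List[float]],
    n_folds: int = 5,
) -> List[float]:
    """Return one prediction per training row, each made by a model
    that was fitted without that row."""
    oof = [0.0] * len(X)
    for val_idx in kfold_indices(len(X), n_folds):
        held_out = set(val_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in val_idx],
        )
        # Write each held-out prediction back to its original row position.
        for i, p in zip(val_idx, preds):
            oof[i] = p
    return oof


def mean_model(X_train: list, y_train: list, X_val: list) -> List[float]:
    """Toy stand-in for a real model: predicts the mean target of the training folds."""
    mean = sum(y_train) / len(y_train)
    return [mean] * len(X_val)
```

In a real project `fit_predict` would wrap an actual model, and the resulting list would be saved under output/oof/ alongside the matching test predictions.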

Key Directories

  • notebooks/experiments/: Numbered experiment notebooks
  • src/ensemble.py: Blending, stacking, averaging
  • output/oof/: Out-of-fold predictions for stacking
  • output/submissions/: Dated submission files
  • configs/: YAML configs per experiment
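
As an illustration of the simplest thing src/ensemble.py might contain, here is a weighted-average blend of per-model predictions. This is a sketch under assumptions: the `blend` function, its signature, and the model names in the usage note are hypothetical, and it covers only averaging, not stacking.

```python
from typing import Dict, List, Optional


def blend(
    preds: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Weighted average of row-aligned predictions from several models.

    `preds` maps a model name to its predictions. With no weights given,
    every model counts equally.
    """
    names = list(preds)
    if not names:
        raise ValueError("no predictions to blend")
    if weights is None:
        weights = {name: 1.0 for name in names}

    n_rows = len(preds[names[0]])
    if any(len(preds[name]) != n_rows for name in names):
        raise ValueError("all prediction lists must have the same length")

    # Normalize by the weight total so the weights need not sum to 1.
    total = sum(weights[name] for name in names)
    return [
        sum(weights[name] * preds[name][i] for name in names) / total
        for i in range(n_rows)
    ]
```

For example, `blend({"xgb": [0.2, 0.8], "lgbm": [0.4, 0.6]})` averages the two models row by row. Blend weights are usually chosen by scoring the blended OOF predictions, then applied unchanged to the test predictions.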

Experiment Naming

# Naming convention
exp_001_baseline_xgb.ipynb
exp_002_lgbm_tuned.ipynb
exp_003_nn_tabular.ipynb
exp_004_ensemble_blend.ipynb

# Submissions
sub_exp001_0.812_20241215.csv
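
The naming convention is easy to automate. The helpers below are a hypothetical sketch: they assume the submission pattern is `sub_exp<ID>_<score>_<YYYYMMDD>.csv` with a three-digit ID and a score rounded to three decimals, as the example above suggests. The source does not say whether the score is a CV or leaderboard score.

```python
import re
from datetime import date
from typing import Iterable


def submission_name(exp_id: int, score: float, on: date) -> str:
    """Build a submission file name such as sub_exp001_0.812_20241215.csv."""
    return f"sub_exp{exp_id:03d}_{score:.3f}_{on:%Y%m%d}.csv"


def next_experiment_id(existing: Iterable[str]) -> int:
    """Return the next free experiment number given existing notebook names.

    Names that do not start with exp_<digits>_ are ignored.
    """
    ids = []
    for name in existing:
        match = re.match(r"exp_(\d+)_", name)
        if match:
            ids.append(int(match.group(1)))
    return max(ids, default=0) + 1
```

Here `submission_name(1, 0.812, date(2024, 12, 15))` reproduces the example file name, and `next_experiment_id` can be fed the file names found in notebooks/experiments/.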

When To Use This

  • Kaggle and similar ML competitions
  • Hackathons with submission deadlines
  • Ensemble and stacking workflows
  • Rapid experimentation cycles

Trade-offs

  • Not production-ready: Optimized for leaderboard score, not deployment
  • Messy history: Favors fast iteration over clean commits
  • Local focus: Assumes local development; GPU training may need extra setup on Kaggle