Kaggle Competition Project Structure
A project structure optimized for competitions, with experiment tracking, ensembling, and a submission workflow.
Project Directory
competition/
├── notebooks/                  # Exploration and experiments
│   ├── eda.ipynb               # Exploratory data analysis
│   ├── baseline.ipynb          # First submission
│   └── experiments/
│       └── exp_001_xgb.ipynb
├── src/                        # Reusable code
│   ├── __init__.py
│   ├── data.py                 # Data loading and CV splits
│   ├── features.py             # Feature engineering
│   ├── models.py               # Model definitions
│   ├── train.py                # Training loop
│   ├── inference.py            # Test predictions
│   └── ensemble.py             # Ensemble methods
├── input/                      # Competition data (gitignored)
│   ├── train.csv
│   ├── test.csv
│   └── sample_submission.csv
├── output/                     # Predictions and submissions
│   ├── models/
│   ├── oof/                    # Out-of-fold predictions
│   └── submissions/
├── configs/                    # Experiment configs
│   └── exp_001.yaml
├── requirements.txt
├── .gitignore
└── README.md                   # Competition notes
Why This Structure?
This layout is optimized for the competition workflow: fast iteration, out-of-fold (OOF) predictions saved for stacking, and organized submissions. The input/ folder mirrors Kaggle's own input directory. Experiments are numbered (exp_001, exp_002, and so on) for easy tracking, and ensemble.py combines the predictions of individual models.
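To make the OOF idea concrete, here is a minimal sketch using only the standard library. The helper names (kfold_indices, oof_predictions) and the mean-predicting toy model are illustrative assumptions, not part of the template; in a real project the fit and predict callables would wrap whatever lives in src/models.py, and the resulting list would be saved under output/oof/.

```python
import random


def kfold_indices(n_rows, n_splits=5, seed=42):
    """Shuffle row indices once and deal them into n_splits validation folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_splits] for i in range(n_splits)]


def oof_predictions(X, y, fit, predict, n_splits=5, seed=42):
    """Return one prediction per training row, each produced by a model
    that did not see that row during fitting."""
    oof = [None] * len(X)
    for valid_idx in kfold_indices(len(X), n_splits, seed):
        held_out = set(valid_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        fold_preds = predict(model, [X[i] for i in valid_idx])
        for i, pred in zip(valid_idx, fold_preds):
            oof[i] = pred
    return oof


# Hypothetical stand-in model: always predicts the mean of its training targets.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, X_valid):
    return [model for _ in X_valid]


X = [[float(i)] for i in range(10)]
y = [float(i) for i in range(10)]
oof = oof_predictions(X, y, fit_mean, predict_mean, n_splits=5)
```

Because every entry in oof comes from a model that never trained on that row, the list can serve as an input feature for a second-level (stacking) model without leaking the target.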
Key Directories
- notebooks/experiments/: Numbered experiment notebooks
- src/ensemble.py: Blending, stacking, averaging
- output/oof/: Out-of-fold predictions for stacking
- output/submissions/: Dated submission files
- configs/: YAML configs per experiment
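As an illustration of the kind of helper src/ensemble.py might hold, the sketch below blends several models' predictions with a weighted average. The function name blend, its signature, and the sample prediction values are assumptions made for this example, not something the template prescribes.

```python
def blend(predictions, weights=None):
    """Weighted average of several models' predictions, row by row."""
    if not predictions:
        raise ValueError("need at least one prediction vector")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all prediction vectors must have the same length")
    if weights is None:
        # Equal weights reduce the blend to a plain average.
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_rows)
    ]


# Made-up predictions from two hypothetical models on three test rows.
xgb_preds = [0.2, 0.8, 0.5]
lgbm_preds = [0.4, 0.6, 0.5]
blended = blend([xgb_preds, lgbm_preds], weights=[3, 1])
```

Blend weights are usually chosen by scoring the blended out-of-fold predictions rather than the leaderboard, which is one reason to keep the files in output/oof/.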
Experiment Naming
# Naming convention
exp_001_baseline_xgb.ipynb
exp_002_lgbm_tuned.ipynb
exp_003_nn_tabular.ipynb
exp_004_ensemble_blend.ipynb
# Submissions
sub_exp001_0.812_20241215.csv
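The naming convention above can be generated rather than typed by hand. The following is a small sketch with hypothetical helper names; it assumes the number in the submission filename is a score rounded to three decimals and that the trailing digits are a YYYYMMDD date.

```python
from datetime import date


def experiment_notebook_name(number, description):
    """Build a notebook name such as exp_003_nn_tabular.ipynb."""
    return f"exp_{number:03d}_{description}.ipynb"


def submission_name(number, score, day):
    """Build a submission name such as sub_exp001_0.812_20241215.csv."""
    return f"sub_exp{number:03d}_{score:.3f}_{day:%Y%m%d}.csv"


notebook = experiment_notebook_name(3, "nn_tabular")
submission = submission_name(1, 0.812, date(2024, 12, 15))
```

Zero-padding the experiment number keeps files sorted in run order in a directory listing.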
When To Use This
- Kaggle and similar ML competitions
- Hackathons with submission deadlines
- Ensemble and stacking workflows
- Rapid experimentation cycles
Trade-offs
- Not production-ready: Optimized for score, not deployment
- Messy history: Fast iteration takes priority over clean commits
- Local focus: Running on Kaggle's GPUs may need extra setup