End-to-End Protocol (Teaching Walkthrough)¶

Focus: Chainable API • QC → Preprocess → Train → Export

What you will learn¶

Apply a full FoodSpec workflow with the unified (Phase 1) API
Perform QC, preprocessing, model training, diagnostics, and export
Keep an audit trail (parameters, metrics, diagnostics)

Prerequisites¶

Python 3.10+
numpy, pandas, scikit-learn, foodspec

Minimal runnable code¶

import pandas as pd
from foodspec.core import FoodSpec
from foodspec.datasets import load_example_data
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

X, y = load_example_data("oil_classification")
fs = FoodSpec(task="classification", name="oil_auth_production")

# QC
qc = fs.quality_check(X=X, y=y, check_type="complete")
assert qc["health"] > 0.5

# Preprocess
fs.add_preprocessing_step("baseline_removal", method="polyfit", order=5)
fs.add_preprocessing_step("normalization", method="vector")
Xp = fs.preprocess(X, fit=True)

# Train + CV
model = RandomForestClassifier(n_estimators=100, max_depth=15, random_state=42)
cv = cross_validate(model, Xp, y, cv=5, scoring="accuracy", return_train_score=True)
print({"cv_acc": cv["test_score"].mean(), "train_acc": cv["train_score"].mean()})

Explain the outputs¶

health > 0.5 ⇒ data quality acceptable
cv_acc vs train_acc ⇒ generalization gap check
The same preprocessing must be reused for new samples (fit=False when deploying)

Full resources¶

Full script: https://github.com/chandrasekarnarayana/foodspec/blob/main/examples/phase1_quickstart.py
Teaching notebook (download/run): https://github.com/chandrasekarnarayana/foodspec/blob/main/examples/tutorials/05_protocol_unified_api_teaching.ipynb

Run it yourself¶

python examples/phase1_quickstart.py
jupyter notebook examples/tutorials/05_protocol_unified_api_teaching.ipynb

Protocols & YAML → ../user-guide/protocols_and_yaml.md
End-to-end pipeline → ../workflows/end_to_end_pipeline.md

End-to-End Protocol (Teaching Walkthrough)¶

What you will learn¶

Prerequisites¶

Minimal runnable code¶

Explain the outputs¶

Full resources¶

Run it yourself¶

Related docs¶