
ML & Validation API

Model training, cross-validation, and hyperparameter optimization.

The foodspec.ml module provides tools for building, training, and validating machine learning models with rigorous cross-validation strategies.

Cross-Validation

nested_cross_validate

Nested cross-validation for unbiased model evaluation with hyperparameter tuning.

Prevents optimistic bias that arises from tuning hyperparameters on the same data used for evaluation. Critical for reliable model selection in FoodSpec.

Why Nested CV Matters:

╔═══════════════════════════════════════╗
│ Single CV (BIASED):                   │
│   Outer fold → tune hyperparams → eval│
│   Result: 95% accuracy (overfitted)   │
├───────────────────────────────────────┤
│ Nested CV (UNBIASED):                 │
│   Inner fold → tune hyperparams       │
│   Outer fold → eval on unseen data    │
│   Result: 82% accuracy (realistic)    │
╚═══════════════════════════════════════╝

Outer Folds (Evaluation):
- Use StratifiedKFold to maintain class balance
- k=5 (default): 80% train, 20% test per split
- Each outer fold tests generalization to completely unseen data

Inner Folds (Hyperparameter Tuning):
- Use StratifiedKFold to maintain class balance within the training set
- k=3 (default): fast tuning without excessive train/test splits
- Only searches over the training portion of each outer fold
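
The same outer/inner structure can be written directly with scikit-learn primitives. This is a minimal sketch of the mechanics only (not FoodSpec's implementation); passing a GridSearchCV estimator to cross_val_score yields exactly the nested scheme described above:

>>> from sklearn.model_selection import StratifiedKFold, GridSearchCV, cross_val_score
>>> from sklearn.svm import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=200, n_features=20, random_state=0)
>>> inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)  # tuning folds
>>> outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # evaluation folds
>>> search = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}, cv=inner)
>>> test_scores = cross_val_score(search, X, y, cv=outer, scoring='accuracy')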

Output Interpretation:
- test_scores: outer fold scores (realistic performance estimate)
- train_scores: inner fold scores (may be >95%; don't trust this!)
- mean_test_score ± std_test_score: final model performance

When to Use:
- Always use for hyperparameter tuning (grid search, random search)
- Every model selection decision should use nested CV
- Especially critical with small datasets (n < 200)

Real Example: model selection, RBF-kernel SVM vs. linear SVM
├─ Without nested CV: RBF kernel 95%, linear 92% → choose RBF (overfitted)
└─ With nested CV: RBF kernel 82%, linear 83% → choose linear (more stable)

Red Flags:
- train_scores >> test_scores (gap > 15%): check for hyperparameter overfitting
- test_scores very noisy (std > mean/2): increase outer folds or data size
- Negative scores: likely a scoring metric issue; check the scoring param

Parameters

estimator : sklearn estimator
    Model to evaluate (e.g., SVC, RandomForestClassifier). Required.
X : ndarray
    Features array of shape (n_samples, n_features). Required.
y : ndarray
    Target labels of shape (n_samples,). Required.
cv_outer : int, default=5
    Number of outer folds for evaluation.
cv_inner : int, default=3
    Number of inner folds for hyperparameter tuning.
scoring : str, default="accuracy"
    Scoring metric name from sklearn.metrics. Options: "accuracy", "f1", "roc_auc", "precision", "recall".
fit_params : dict, optional
    Additional arguments to pass to estimator.fit(). Default None.

Returns

Dict[str, Any]
    Results containing:
    - test_scores: array of outer fold test scores (realistic performance)
    - train_scores: array of inner fold train scores (optimistic estimates)
    - mean_test_score: mean test score across all outer folds
    - std_test_score: standard deviation of test scores
    - all_inner_scores: detailed inner fold scores for diagnosis

Examples

>>> from foodspec.ml import nested_cross_validate
>>> from sklearn.svm import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples=100, n_features=20)
>>> svc = SVC(kernel='rbf')
>>> result = nested_cross_validate(svc, X, y, cv_outer=5, cv_inner=3)
>>> print(f"Realistic accuracy: {result['mean_test_score']:.3f} ± {result['std_test_score']:.3f}")

References

Varma, S., & Simon, R. (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 7(1), 91.

Hyperparameter Tuning

grid_search_classifier

Run grid search for classifier hyperparameters.

Parameters

pipeline : Pipeline
    sklearn Pipeline with fitted preprocessor(s) and unfitted model.
X_train : np.ndarray
    Training features.
y_train : np.ndarray
    Training labels.
model_name : str
    Model name (e.g., "rf", "svm_rbf").
cv : int, default=5
    Cross-validation folds.
scoring : str, default="f1_weighted"
    Scoring metric.
n_jobs : int, default=-1
    Number of parallel jobs.

Returns

best_model : estimator
    Fitted model with best hyperparameters.
results : dict
    - 'best_params': best hyperparameters
    - 'best_score': best cross-validation score
    - 'cv_results': full GridSearchCV results
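
A usage sketch, with some assumptions: the import path mirrors nested_cross_validate above, and the parameter grid attached to model_name='rf' is whatever FoodSpec defines internally:

>>> from foodspec.ml import grid_search_classifier
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.datasets import make_classification
>>> X_train, y_train = make_classification(n_samples=200, n_features=20)
>>> pipe = Pipeline([('scale', StandardScaler()), ('model', RandomForestClassifier())])
>>> best_model, results = grid_search_classifier(
...     pipe, X_train, y_train, model_name='rf', cv=5, scoring='f1_weighted')
>>> results['best_params']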

quick_tune_classifier

Quick hyperparameter tuning for rapid iteration.

Uses RandomizedSearchCV with a smaller candidate set.

Parameters

pipeline : Pipeline
    sklearn Pipeline.
X_train : np.ndarray
    Training features.
y_train : np.ndarray
    Training labels.
model_name : str
    Model name.
cv : int, default=3
    CV folds.

Returns

best_model : estimator
    Fitted model.
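
A sketch reusing the pipeline and data from the grid_search_classifier example above, with the same assumed import path:

>>> from foodspec.ml import quick_tune_classifier
>>> best_model = quick_tune_classifier(pipe, X_train, y_train, model_name='rf', cv=3)
>>> best_model.predict(X_train[:5])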

Multi-Modal Fusion

late_fusion_concat

Concatenate features from multiple modalities.

Parameters

feature_dict : Dict[str, np.ndarray]
    Mapping from modality name to feature matrix (n_samples, n_features).
modality_order : Optional[Sequence[str]]
    Order of modalities for concatenation. If None, uses sorted keys.

Returns

LateFusionResult
    Fused feature matrix with metadata.
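
A sketch with two synthetic modalities (import path assumed, as above); every matrix must share the same n_samples:

>>> from foodspec.ml import late_fusion_concat
>>> import numpy as np
>>> rng = np.random.default_rng(0)
>>> features = {'raman': rng.random((10, 50)), 'ftir': rng.random((10, 100))}
>>> fused = late_fusion_concat(features, modality_order=['raman', 'ftir'])

The fused matrix should be (10, 150) with the raman columns first; the exact attributes of LateFusionResult are defined by the library.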

decision_fusion_vote

Combine class predictions via voting.

Parameters

predictions_dict : Dict[str, np.ndarray]
    Mapping from modality to 1D array of class predictions.
strategy : Literal["majority", "unanimous"]
    - "majority": most frequent prediction wins.
    - "unanimous": only assign a label if all modalities agree.

Returns

DecisionFusionResult
    Fused predictions.
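
A sketch with three modalities voting (import path assumed). Under 'majority' the fused predictions here would be [0, 1, 1]; under 'unanimous', sample 2 would be left unassigned because the modalities disagree:

>>> from foodspec.ml import decision_fusion_vote
>>> import numpy as np
>>> preds = {'raman': np.array([0, 1, 1]),
...          'ftir': np.array([0, 1, 0]),
...          'nir': np.array([0, 1, 1])}
>>> result = decision_fusion_vote(preds, strategy='majority')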

See Also