Standard Workflow Templates

📋 Standard Header

Purpose: Provide quick-reference templates for common spectroscopic analysis tasks with standardized structure.

When to Use:

  • Starting a new analysis and need a proven structure
  • Adapting workflows to new food matrices or research questions
  • Teaching standardized approaches to new users
  • Documenting methods for publications or regulatory submissions
  • Comparing results across labs using consistent methodology

What's Included:

  • Authentication/Classification template (SVM/RF for class ID)
  • Adulteration detection template (class imbalance strategies)
  • Calibration/Regression template (PLS for continuous targets)
  • Time-series/Degradation template (trend analysis)
  • Mixture quantification template (NNLS/MCR-ALS)
  • Hyperspectral mapping template (spatial analysis)
  • Reporting checklist (parameters, plots, metrics)

How to Use:

  • Choose template matching your scientific question
  • Follow step-by-step structure (preprocess → features → model → metrics → report)
  • Adapt parameters to your specific matrix and instrument
  • Consult detailed workflow pages for in-depth guidance


This page lists concise templates you can adapt for common tasks. Each template references the relevant detailed workflow pages and points to troubleshooting/metrics/ML chapters.

Authentication / Classification

  • Goal: Identify class (e.g., oil type) or detect adulteration.
  • Template:
      • Load spectra (CSV/JCAMP/OPUS) with read_spectra.
      • Preprocess: baseline → smoothing → normalization → crop.
      • Features: peaks/ratios + optional PCA.
      • Model: SVM/RF (or logreg baseline).
      • Metrics: accuracy, F1_macro, confusion matrix; PR/ROC as needed.
      • Reports: confusion matrix + per-class metrics; export run metadata/model.
  • See: Oil authentication, ML & metrics.
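
The model and metrics steps of this template can be sketched with scikit-learn. This is a minimal illustration, not the project's own implementation: the spectra are synthetic toy data generated by a hypothetical helper (`make_synthetic_spectra`) standing in for the output of read_spectra and preprocessing, and the choice of 5 principal components and an RBF SVM is an assumption made only for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def make_synthetic_spectra(n_per_class=40, n_points=200, seed=0):
    """Toy data: two classes that differ in the height of one Gaussian band."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    band_a = np.exp(-((x - 0.3) ** 2) / 0.002)
    band_b = np.exp(-((x - 0.7) ** 2) / 0.002)
    spectra, labels = [], []
    for label, height in enumerate((0.5, 1.0)):
        for _ in range(n_per_class):
            noise = rng.normal(0.0, 0.05, n_points)
            spectra.append(band_a + height * band_b + noise)
            labels.append(label)
    return np.array(spectra), np.array(labels)


X, y = make_synthetic_spectra()

# Scaling and PCA sit inside the pipeline so they are refit on every training fold.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    SVC(kernel="rbf", class_weight="balanced"),
)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
y_pred = cross_val_predict(model, X, y, cv=cv)

f1_macro = f1_score(y, y_pred, average="macro")
cm = confusion_matrix(y, y_pred)
print(f"F1 (macro): {f1_macro:.3f}")
print(cm)
```

Keeping the scaler and PCA inside the pipeline avoids leaking information from held-out folds into the feature step, which would otherwise make the reported metrics optimistic.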

Adulteration (rare events)

  • Same as authentication, but emphasize class imbalance:
      • Use class weights, PR curves, F1_macro; collect more positives.
      • Consider OC-SVM/IsolationForest for novelty.
  • See: Batch QC / novelty, Troubleshooting.
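
One way to apply class weights and a precision-recall metric is sketched below. The data are synthetic (5% adulterated samples with three shifted toy "marker" features), and logistic regression is used only as a simple baseline; none of these choices come from the original workflow pages.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_authentic, n_adulterated, n_features = 190, 10, 20

# Synthetic feature table: adulterated samples shift three toy marker features.
X = rng.normal(0.0, 1.0, (n_authentic + n_adulterated, n_features))
y = np.r_[np.zeros(n_authentic, dtype=int), np.ones(n_adulterated, dtype=int)]
X[y == 1, :3] += 2.0

# class_weight="balanced" reweights errors inversely to class frequency.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(class_weight="balanced", max_iter=1000),
)

# Stratified folds keep the rare class present in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]

ap = average_precision_score(y, scores)  # area under the PR curve
f1_macro = f1_score(y, (scores >= 0.5).astype(int), average="macro")
prevalence = y.mean()  # average precision of a random ranking
print(f"AP: {ap:.3f} (chance level {prevalence:.3f}), F1 (macro): {f1_macro:.3f}")
```

Comparing average precision against the positive-class prevalence gives a more honest baseline than accuracy, which would already be 95% for a model that never flags adulteration.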

Calibration / Regression

  • Goal: Predict continuous quality/mixture values.
  • Template:
      • Preprocess consistently (baseline, norm, crop).
      • Feature space: raw spectra, ratios, or PLS components.
      • Model: PLS regression; consider MLP if non-linear bias remains.
      • Metrics: RMSE, MAE, R², MAPE; plots: calibration + residuals.
      • Robustness: bootstrap/permutation; check bias across range.
      • Reports: predicted vs true, residual plots, parameter settings.
  • See: Calibration example, Metrics.

Time-series / Degradation

  • Goal: Track degradation markers vs time/temperature.
  • Template: ratios vs time → trend models (linear/ANOVA) → slopes/p-values → plots (line + CI, box/violin by stage).
  • See: Heating quality monitoring, Stats.
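
The trend-model step can be illustrated with SciPy. The sketch below assumes a synthetic marker ratio that declines linearly with heating time (six time points, three replicates each); the time grid, noise level, and stage boundaries are invented for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic design: 6 time points (hours), 3 replicates each.
time_h = np.repeat(np.arange(0, 12, 2), 3).astype(float)
ratio = 1.0 - 0.03 * time_h + rng.normal(0.0, 0.02, time_h.size)

# Linear trend: slope, p-value, and a 95% confidence interval for the slope.
fit = stats.linregress(time_h, ratio)
dof = time_h.size - 2
t_crit = stats.t.ppf(0.975, dof)
ci_low = fit.slope - t_crit * fit.stderr
ci_high = fit.slope + t_crit * fit.stderr
print(f"slope = {fit.slope:.4f} per hour, 95% CI [{ci_low:.4f}, {ci_high:.4f}], p = {fit.pvalue:.2e}")

# One-way ANOVA across three heating stages (early / middle / late).
stages = [
    ratio[time_h < 4],
    ratio[(time_h >= 4) & (time_h < 8)],
    ratio[time_h >= 8],
]
anova = stats.f_oneway(*stages)
print(f"ANOVA F = {anova.statistic:.2f}, p = {anova.pvalue:.2e}")
```

The regression answers "how fast does the marker change", while the ANOVA answers "do the stages differ at all"; the two are complementary when the trend is not clearly linear.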

Mixtures

  • Goal: Estimate component fractions.
  • Template: NNLS with pure refs or MCR-ALS → metrics (RMSE/R²) → predicted vs true/residual plots.
  • See: Mixture analysis.
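
For the NNLS route, a minimal sketch using `scipy.optimize.nnls` follows. The three pure reference spectra and the mixture are synthetic Gaussians, and rescaling the coefficients to sum to one assumes that the references fully describe the mixture; MCR-ALS is not covered here.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 120)

# Columns are pure reference spectra (synthetic Gaussian bands).
refs = np.column_stack(
    [np.exp(-((x - center) ** 2) / 0.004) for center in (0.25, 0.5, 0.75)]
)
true_fractions = np.array([0.6, 0.3, 0.1])
mixture = refs @ true_fractions + rng.normal(0.0, 0.01, x.size)

# Solve min ||refs @ c - mixture|| subject to c >= 0.
coeffs, residual_norm = nnls(refs, mixture)

# Convert coefficients to fractions (assumes the references span the mixture).
fractions = coeffs / coeffs.sum()
rmse = float(np.sqrt(np.mean((fractions - true_fractions) ** 2)))
print("estimated fractions:", np.round(fractions, 3))
print(f"RMSE vs. known fractions: {rmse:.4f}")
```

A large `residual_norm` relative to the mixture's norm is a hint that a component is missing from the reference set, in which case the normalized fractions should not be trusted.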

Hyperspectral mapping

  • Goal: Spatial localization.
  • Template: per-pixel preprocessing → cube rebuild → ratios/PCs → clustering/classification → maps + pixel metrics.
  • See: Hyperspectral mapping.
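
The reshape steps in this template (cube → pixel table → cube) are easy to get wrong, so a small sketch may help. It assumes a synthetic cube of shape (height, width, bands) with one rectangular inclusion, per-pixel vector normalization as the only preprocessing, and k-means on three principal components as the clustering step; all of these are illustrative choices.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
height, width, n_bands = 20, 30, 50
x = np.linspace(0.0, 1.0, n_bands)
background = np.exp(-((x - 0.3) ** 2) / 0.01)
inclusion = np.exp(-((x - 0.7) ** 2) / 0.01)

# Synthetic cube: a rectangular inclusion on a uniform background, plus noise.
truth = np.zeros((height, width), dtype=int)
truth[5:15, 10:20] = 1
cube = np.where(truth[..., None] == 1, inclusion, background)
cube = cube + rng.normal(0.0, 0.05, cube.shape)

# Flatten to a (pixels, bands) table and normalize each pixel spectrum.
pixels = cube.reshape(-1, n_bands)
pixels = pixels / np.linalg.norm(pixels, axis=1, keepdims=True)

# Reduce to a few PCs, cluster, then rebuild the spatial map.
scores = PCA(n_components=3).fit_transform(pixels)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scores)
label_map = labels.reshape(height, width)

# Cluster labels are arbitrary, so score both possible assignments.
agreement = (label_map == truth).mean()
pixel_accuracy = max(agreement, 1.0 - agreement)
print(f"pixel accuracy: {pixel_accuracy:.3f}")
```

Because `reshape` preserves row-major order in both directions, `label_map[i, j]` corresponds to the spectrum at `cube[i, j, :]`.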

Reporting essentials

  • Record preprocessing parameters, model choices, metrics with uncertainty, plots, and configs; export run metadata/model artifacts.
  • Consult Reporting guidelines and Troubleshooting when issues arise.
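
As one possible way to record these items, the sketch below writes a run record to JSON using only the Python standard library. The function name `build_run_record`, the field names, and all parameter values are hypothetical placeholders, not a format defined by the project.

```python
import json
import platform
from datetime import datetime, timezone


def build_run_record(preprocessing, model, metrics):
    """Bundle the settings and scores of one run into a JSON-serializable dict."""
    return {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "preprocessing": preprocessing,
        "model": model,
        "metrics": metrics,
    }


# Placeholder values for illustration only; they are not results from any dataset.
record = build_run_record(
    preprocessing={
        "baseline": "als",
        "smoothing_window": 11,
        "normalization": "vector",
        "crop": [600, 1800],
    },
    model={"type": "SVC", "kernel": "rbf", "random_state": 0},
    metrics={"f1_macro": {"value": None, "ci95": [None, None]}},
)

serialized = json.dumps(record, indent=2, sort_keys=True)
print(serialized)
```

Storing each metric together with its uncertainty interval keeps the "metrics with uncertainty" requirement visible in the exported file rather than only in plots.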

When Results Cannot Be Trusted

⚠️ Red flags for template-based workflows:

  1. Using template with different domain/instrument without revalidation (oil template applied to dairy without testing)
     • Templates are domain-specific; spectral signatures, sample prep, and matrix effects differ
     • Model trained on oils won't work on milk or meat
     • Fix: Validate template on target domain before use; test on 10+ samples to confirm applicability

  2. Template preprocessing parameters not adjusted for new matrix (using oil normalization on dairy proteins)
     • Preprocessing optimal for one food type may be poor for another
     • Different absorbance ranges, solubility, fluorescence require different settings
     • Fix: Test preprocessing on new matrix; adjust parameters (baseline lambda, smoothing); freeze before analysis

  3. Template model directly deployed without cross-validation on new data (assuming old model works)
     • Model trained on past data; generalization to new batches/instruments unverified
     • Drift, seasonal changes, or instrumental variation can invalidate model
     • Fix: Cross-validate model on representative new samples; retrain if performance drops >10%

  4. Features/ratios from template used blindly without domain interpretation
     • Template features may be arbitrary (optimized for one dataset, not chemically meaningful)
     • Different food type may have different separating features
     • Fix: Validate template features are chemically plausible for new domain; check loadings/importance

  5. Metrics thresholds from template applied without local calibration (using template accuracy cutoff 0.85 for new domain)
     • Template thresholds calibrated on template data; new domain may need adjustment
     • Class distributions and difficulty differ across domains
     • Fix: Recalibrate decision thresholds on new data; validate at operational point (sensitivity/specificity target)

  6. Template applied to imbalanced data without rebalancing
     • Templates often assume balanced classes; imbalanced deployment inflates majority class accuracy
     • Minority class performance may be poor
     • Fix: Stratify CV; use class weights; retrain if classes severely imbalanced

  7. No documentation of why template was chosen or when it's appropriate
     • Templates are gray boxes; unclear which template fits which problem
     • Can lead to inappropriate template selection
     • Fix: Document template scope (domain, food type, spectroscopy method); include decision flowchart

  8. Template results trusted without sensitivity analysis (no testing on edge cases or outliers)
     • Templates may fail on unusual samples (off-spec, oxidized, contaminated)
     • Real samples have variability beyond template training distribution
     • Fix: Test template on challenging samples (old oils, mixed types, degraded); document failure modes
  31. Real samples have variability beyond template training distribution
  32. Fix: Test template on challenging samples (old oils, mixed types, degraded); document failure modes