Getting Started¶
Purpose: Understand FoodSpec's workflow and choose your path (Python vs. CLI).
Audience: Complete beginners; no spectroscopy background required.
Time: 5-10 minutes.
Prerequisites: FoodSpec installed; basic Python or terminal knowledge.
The 30-Second Version¶
# The absolute minimal example
from foodspec import __version__
print(f"FoodSpec {__version__} is ready!")
# Load some oil data
from foodspec.io import load_csv_spectra
spectra = load_csv_spectra("examples/data/oils.csv")
print(f"Loaded {len(spectra)} spectra")
Expected output:
FoodSpec 1.0.0 is ready!
Loaded 96 spectra
Choose Your Path¶
Path 1: Python API (Interactive, Customizable)¶
Best for: Learning, experimentation, custom analysis
Quickstart:
1. 15-Minute Quickstart: get working code now
2. Oil Authentication: real example
3. Full End-to-End: every step explained
Key modules:
from foodspec.datasets import load_oil_example_data
from foodspec.preprocess import baseline_als, normalize_snv
from foodspec.ml import ClassifierFactory
from foodspec.validation import run_stratified_cv
Path 2: CLI (Reproducible, Automatable)¶
Best for: Reproducibility, batch processing, production use
Quickstart:
1. First Steps (CLI): run your first command
2. Protocol Design: create YAML configs
3. Reproducibility: best practices
Key commands:
foodspec --version # Verify installation
# Run analysis (a comment after the trailing backslash would break the line continuation)
foodspec-run-protocol \
--protocol myprotocol.yaml \
--input data.csv \
--output-dir results
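For batch processing, the same command can be assembled from Python. The following is a minimal sketch that uses only the three flags shown above; the helper names, the input file names, and the one-folder-per-input layout are assumptions made for illustration, not part of FoodSpec.

```python
import shlex
from pathlib import Path


def build_protocol_command(protocol, input_csv, output_dir):
    """Return the argument list for one foodspec-run-protocol call."""
    return [
        "foodspec-run-protocol",
        "--protocol", str(protocol),
        "--input", str(input_csv),
        "--output-dir", str(output_dir),
    ]


def plan_batch(protocol, input_files, results_root):
    """Build one command per input file, each with its own output folder."""
    commands = []
    for path in input_files:
        path = Path(path)
        # Hypothetical layout: <results_root>/<input file stem>/
        out_dir = Path(results_root) / path.stem
        commands.append(build_protocol_command(protocol, path, out_dir))
    return commands


# Hypothetical input files; the commands are printed, not executed.
commands = plan_batch("myprotocol.yaml", ["data/a.csv", "data/b.csv"], "results")
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run a planned command, pass the list to `subprocess.run(cmd, check=True)`; printing first makes it easy to review a batch before launching it.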
Who is it for?¶
Food scientists, analytical chemists, QA engineers, and data scientists working with Raman/FTIR spectra who need reproducible preprocessing, chemometrics, and reporting.
What Does FoodSpec Do?¶
In: Spectral data (CSV or HDF5) + metadata (labels, batch info)
Out: Validated model + metrics + figures + reproducibility report
Typical workflow:
Raw spectra (instrument file)
↓
CSV → HDF5 (standardized format)
↓
Preprocess (baseline, smooth, normalize)
↓
Extract features (peaks, ratios, PCA)
↓
Train & validate (cross-validation)
↓
Results (metrics, figures, JSON report)
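To make the "normalize" step concrete, here is a minimal pure-Python sketch of standard normal variate (SNV) scaling, the technique that `normalize_snv` is named after. It is not FoodSpec's implementation: the intensities are made up, and the use of the sample standard deviation (n - 1) is an assumption, since some implementations divide by n instead.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Centre one spectrum on its mean and scale it to unit standard deviation."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation (n - 1); an assumption
    if sigma == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(value - mu) / sigma for value in spectrum]


raw = [0.234, 0.235, 0.241, 0.260, 0.238]  # made-up intensities
scaled = snv(raw)

# SNV removes additive offsets and multiplicative scaling of a single spectrum,
# so a shifted and stretched copy maps to the same values.
distorted = snv([2.0 * value + 1.0 for value in raw])
```

Because each spectrum is scaled using only its own mean and spread, SNV needs no fitting step and can be applied to new samples independently.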
Installation Options¶
Core (always install first):
pip install foodspec
Deep learning (optional, for 1D CNN):
pip install "foodspec[deep]"
Verify installation:
foodspec --version # Should print version
foodspec about # Detailed info
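The installation can also be checked from Python without importing the package. This sketch relies on the standard library's `importlib.metadata` and assumes the installed distribution name is `foodspec`, matching the `pip install` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the package is missing."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version("foodspec")  # distribution name is an assumption
print(f"FoodSpec {found}" if found else "FoodSpec is not installed in this environment")
```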
Data Format Quick Check¶
FoodSpec expects data in one of these formats:
CSV (Simplest)¶
sample_id,wavenumber,intensity,label
OO_001,4000.0,0.234,Olive
OO_001,3998.0,0.235,Olive
OO_002,4000.0,0.241,Olive
Load with:
from foodspec.io import load_csv_spectra
spectra = load_csv_spectra("oils.csv", label_column="label")
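To see what this long format encodes, the sketch below groups the rows by `sample_id` using only the standard library. It illustrates the file layout and is not how `load_csv_spectra` works internally; the function name `group_long_csv` is hypothetical.

```python
import csv
import io
from collections import defaultdict

LONG_CSV = """sample_id,wavenumber,intensity,label
OO_001,4000.0,0.234,Olive
OO_001,3998.0,0.235,Olive
OO_002,4000.0,0.241,Olive
"""


def group_long_csv(text):
    """Group long-format rows into {sample_id: [(wavenumber, intensity), ...]} plus labels."""
    points = defaultdict(list)
    labels = {}
    for row in csv.DictReader(io.StringIO(text)):
        sample = row["sample_id"]
        points[sample].append((float(row["wavenumber"]), float(row["intensity"])))
        labels[sample] = row["label"]
    # Sort each spectrum by ascending wavenumber so its axis is monotonic.
    return {sample: sorted(pairs) for sample, pairs in points.items()}, labels


grouped, sample_labels = group_long_csv(LONG_CSV)
```

Each sample contributes one row per wavenumber, so a dataset with many samples and a dense axis grows quickly in this layout.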
HDF5 (Efficient for large datasets)¶
from foodspec.io import load_hdf5_library
spectra = load_hdf5_library("oils_library.h5")
See Data Format Reference for full details.
Next Steps¶
Never used FoodSpec before? → Start with 15-Minute Quickstart
Prefer command-line? → Go to First Steps (CLI)
Want to understand reproducibility? → Read Reproducibility Guide
Ready for real examples? → Oil Authentication
FAQ (Quick Answers)¶
Q: Can I use my own data?
A: Yes! See Data Format Reference for import instructions.
Q: Do I need to know Python?
A: Not necessarily. You can run analyses from the CLI with YAML configs, or use the Python API when you want more flexibility.
Q: How long does an analysis take?
A: Usually under 1 minute for 100 samples, depending on data size and model.
Q: Can I reproduce old analyses?
A: Yes! Save protocols and metadata with each run. See Reproducibility.
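One way to "save protocols and metadata with each run" is to write a small record next to the results. The sketch below uses only the standard library; the file name `run_record.json` and its fields are assumptions chosen for illustration, not a FoodSpec output format.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path):
    """Hash a file's bytes so a later run can confirm it used the same file."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def write_run_record(protocol_path, input_path, output_dir):
    """Write run_record.json (a hypothetical file name) into the output folder."""
    record = {
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        "protocol": {"path": str(protocol_path), "sha256": sha256_of(protocol_path)},
        "input": {"path": str(input_path), "sha256": sha256_of(input_path)},
    }
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    target = out / "run_record.json"
    target.write_text(json.dumps(record, indent=2))
    return target
```

Comparing the stored hashes against the current files is a quick way to tell whether an old analysis is being rerun on the same protocol and data.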
Installation¶
- Core:
pip install foodspec
- Deep-learning extra (optional 1D CNN prototype):
pip install "foodspec[deep]"
- Verify:
foodspec about
Data formats and metadata¶
- Instrument exports: commercial Raman/FTIR instruments often export per-spectrum TXT/CSV files (wavenumber/intensity columns) or wide CSVs (one column per spectrum).
- FoodSpec standard: convert to a validated FoodSpectrumSet (HDF5 library) with:
  - x: spectra matrix (n_samples × n_wavenumbers)
  - wavenumbers: monotonic axis (cm⁻¹)
  - metadata: one row per sample (e.g., oil_type, meat_type, species, heating_time)
  - modality: raman/ftir/nir
- Why this protocol? Keeps spectra + metadata together, enables reproducible preprocessing/models, and matches downstream workflows (oil-auth, heating, QC).
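The four fields and their constraints can be pictured with a small stand-in class. This is an illustrative sketch, not FoodSpec's `FoodSpectrumSet`: the class name, the use of plain lists, and the decision to require a strictly monotonic axis are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SpectrumSetSketch:
    """Illustrative stand-in; not FoodSpec's FoodSpectrumSet class."""
    x: list            # n_samples rows, each with n_wavenumbers intensities
    wavenumbers: list  # shared axis in cm^-1
    metadata: list     # one dict per sample
    modality: str      # "raman", "ftir" or "nir"

    def validate(self):
        n = len(self.wavenumbers)
        if any(len(row) != n for row in self.x):
            raise ValueError("every spectrum must have one value per wavenumber")
        if len(self.metadata) != len(self.x):
            raise ValueError("metadata needs exactly one row per sample")
        steps = [b - a for a, b in zip(self.wavenumbers, self.wavenumbers[1:])]
        # Strictness is an assumption: repeated axis values are rejected here.
        if not (all(s > 0 for s in steps) or all(s < 0 for s in steps)):
            raise ValueError("wavenumber axis must be strictly monotonic")
        if self.modality not in {"raman", "ftir", "nir"}:
            raise ValueError(f"unknown modality: {self.modality}")
        return self


demo = SpectrumSetSketch(
    x=[[0.235, 0.234], [0.240, 0.241]],  # made-up intensities
    wavenumbers=[3998.0, 4000.0],
    metadata=[{"oil_type": "Olive"}, {"oil_type": "Olive"}],
    modality="ftir",
).validate()
```

Checking these constraints once, at conversion time, is what lets later preprocessing and modelling steps assume that spectra and metadata rows line up.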
Typical pipeline (text diagram)¶
Raw spectra (instrument CSV/TXT) → CSV→HDF5 library → Preprocess (baseline, smoothing, normalization, crop) → Features/chemometrics (peaks/ratios/PCA/PLS/models) → Metrics & reports (plots, JSON/Markdown).
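As an example of the "peaks/ratios" stage, the sketch below computes a simple band ratio: the intensity at one wavenumber divided by the intensity at another, each read from the nearest axis point. The function names, band positions, and intensities are made up for illustration and do not come from FoodSpec.

```python
def intensity_at(wavenumbers, intensities, target):
    """Return the intensity at the axis point closest to `target` (cm^-1)."""
    index = min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - target))
    return intensities[index]


def band_ratio(wavenumbers, intensities, numerator, denominator):
    """Ratio of the intensities at two band positions."""
    bottom = intensity_at(wavenumbers, intensities, denominator)
    if bottom == 0:
        raise ZeroDivisionError("denominator band has zero intensity")
    return intensity_at(wavenumbers, intensities, numerator) / bottom


axis = [1400.0, 1440.0, 1480.0, 1620.0, 1655.0, 1700.0]  # illustrative axis
spectrum = [0.10, 0.80, 0.12, 0.09, 0.40, 0.08]           # made-up intensities
ratio = band_ratio(axis, spectrum, numerator=1655.0, denominator=1440.0)
```

A ratio of two bands from the same spectrum is unaffected by an overall intensity scale factor, which is one reason ratios are a common feature alongside raw peak heights.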
Minimal examples (stepwise)¶
For full code, see the dedicated quickstarts. Highlights:
Python (steps)¶
1) Load library & validate.
2) Apply simple preprocessing (ALS baseline → Savitzky–Golay → Vector norm).
3) Run PCA for a quick check.
from pathlib import Path
import matplotlib.pyplot as plt
from foodspec import load_library
from foodspec.validation import validate_spectrum_set
from foodspec.preprocess.baseline import ALSBaseline
from foodspec.preprocess.smoothing import SavitzkyGolaySmoother
from foodspec.preprocess.normalization import VectorNormalizer
from foodspec.chemometrics.pca import run_pca
fs = load_library(Path("libraries/oils_demo.h5"))
validate_spectrum_set(fs)
X = fs.x
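# Apply each preprocessing step in order: baseline, smoothing, normalization
for step in [
    ALSBaseline(lambda_=1e5, p=0.01, max_iter=10),
    SavitzkyGolaySmoother(window_length=9, polyorder=3),
    VectorNormalizer(norm="l2"),
]:
    X = step.fit_transform(X)
# Project onto the first two principal components and plot the scores
_, pca_res = run_pca(X, n_components=2)
plt.scatter(pca_res.scores[:, 0], pca_res.scores[:, 1])
plt.tight_layout()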
plt.savefig("pca_scores.png", dpi=150)
CLI (steps)¶
1) Convert CSV (wide example) to HDF5:
foodspec csv-to-library data/oils.csv libraries/oils.h5 \
--format wide --wavenumber-column wavenumber \
--label-column oil_type --modality raman
2) Run the oil-authentication workflow on the library:
foodspec oil-auth libraries/oils.h5 \
--label-column oil_type \
--output-dir runs/oils_demo
Quickstarts¶
- Full CLI walkthrough: quickstart_cli.md
- Full Python walkthrough: quickstart_python.md
Links¶
- Libraries & formats: ../user-guide/libraries.md, ../user-guide/csv_to_library.md
- Workflows: oil authentication, heating, mixture, hyperspectral, QC
- User guide: CLI reference (../user-guide/cli.md), preprocessing (../methods/preprocessing/baseline_correction.md)
- Keyword lookup: ../reference/keyword_index.md
See also
- ../workflows/oil_authentication.md
- ../workflows/heating_quality_monitoring.md
- ../user-guide/csv_to_library.md