Temporal & Aging Workflows
📋 Standard Header¶
Purpose: Model degradation trajectories over time and estimate remaining shelf-life using spectral markers.
When to Use: - Track oil oxidation over storage time to predict shelf-life - Model degradation kinetics (linear, exponential, or spline fits) - Classify samples into aging stages (early/mid/late degradation) - Estimate time-to-threshold for quality markers with confidence intervals - Compare degradation rates across treatments (antioxidants, storage conditions)
Inputs:
- Format: HDF5 or CSV with time-series spectral data
- Required metadata: time_col (time in days/hours), entity_col (sample/batch ID)
- Optional metadata: treatment, storage_temp, initial_quality
- Derived features: Degradation values (e.g., ratio 1655/1742 or peroxide value)
- Min samples: 5+ time points × 3+ entities (15+ spectra per treatment)
Outputs: - trajectories.csv — Fitted degradation curves (slope, intercept, R²) per entity - stage_labels.csv — Classified aging stage (early/mid/late) per sample - shelf_life_estimates.csv — Remaining time to threshold with 95% CI per entity - trajectory_plot.png — Degradation value vs time with fitted curves - report.md — Shelf-life predictions and degradation rate comparisons
Assumptions: - Time measured accurately and consistently across entities - Degradation monotonic (no recovery or oscillation) - Entities independent (not repeated measurements of same sample) - Linear or smooth non-linear trajectory appropriate for degradation kinetics
🔬 Minimal Reproducible Example (MRE)¶
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from foodspec.workflows.aging import compute_degradation_trajectories
from foodspec.workflows.shelf_life import estimate_remaining_shelf_life
from foodspec.core.time_spectrum_set import TimeSpectrumSet
from foodspec import SpectralDataset
# Generate synthetic aging data
def generate_synthetic_aging(n_entities=5, n_times=10, random_state=42):
"""Create synthetic time-series data with linear degradation."""
np.random.seed(random_state)
time_points = np.linspace(0, 30, n_times) # 0-30 days
data = []
for entity_id in range(n_entities):
# Each entity has different degradation rate
rate = np.random.uniform(0.05, 0.15)
intercept = np.random.uniform(0.5, 1.0)
for t in time_points:
# Degradation value = intercept + rate * time + noise
degrade_value = intercept + rate * t + np.random.normal(0, 0.05)
# Synthetic spectrum (not used in this workflow, but required for TimeSpectrumSet)
wavenumbers = np.linspace(600, 1800, 100)
spectrum = np.random.normal(0, 0.1, 100)
data.append({
'entity_id': f"E{entity_id:02d}",
'time': t,
'degrade': degrade_value,
**{f"{w:.1f}": val for w, val in zip(wavenumbers, spectrum)}
})
df = pd.DataFrame(data)
# Create SpectralDataset
fs = SpectralDataset.from_dataframe(
df,
metadata_columns=['entity_id', 'time', 'degrade'],
intensity_columns=[f"{w:.1f}" for w in wavenumbers],
wavenumber=wavenumbers
)
# Convert to TimeSpectrumSet
ts = TimeSpectrumSet(
x=fs.x,
wavenumbers=fs.wavenumbers,
metadata=fs.metadata,
modality=fs.modality,
time_col='time',
entity_col='entity_id'
)
return ts
# Generate data
ts = generate_synthetic_aging(n_entities=5, n_times=10)
print(f"Total samples: {ts.x.shape[0]}")
print(f"Entities: {ts.metadata['entity_id'].nunique()}")
print(f"Time range: {ts.metadata['time'].min():.1f}-{ts.metadata['time'].max():.1f} days")
# Compute degradation trajectories
trajectories = compute_degradation_trajectories(
ts,
value_col='degrade',
method='linear' # or 'spline' for non-linear
)
print(f"\nDegradation Trajectories:")
for entity, row in trajectories.iterrows():
print(f" {entity}: slope={row['slope']:.4f}, R²={row['r_squared']:.3f}, p={row['p_value']:.1e}")
# Plot trajectories
fig, ax = plt.subplots(figsize=(10, 6))
for entity in ts.metadata['entity_id'].unique():
mask = ts.metadata['entity_id'] == entity
times = ts.metadata.loc[mask, 'time']
values = ts.metadata.loc[mask, 'degrade']
# Plot data points
ax.scatter(times, values, label=entity, alpha=0.6)
# Plot fitted line
traj = trajectories.loc[entity]
times_fit = np.linspace(times.min(), times.max(), 100)
values_fit = traj['intercept'] + traj['slope'] * times_fit
ax.plot(times_fit, values_fit, '--', alpha=0.8)
ax.set_xlabel('Time (days)')
ax.set_ylabel('Degradation Value')
ax.set_title('Degradation Trajectories')
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
ax.grid(alpha=0.3)
plt.tight_layout()
plt.savefig("aging_trajectories.png", dpi=150, bbox_inches='tight')
print("\nSaved: aging_trajectories.png")
# Estimate shelf-life (time to threshold)
threshold = 2.0 # Quality threshold
shelf_life = estimate_remaining_shelf_life(
ts,
value_col='degrade',
threshold=threshold
)
print(f"\nShelf-Life Estimates (threshold={threshold:.1f}):")
for entity, row in shelf_life.iterrows():
print(f" {entity}: {row['remaining_time']:.1f} ± {row['ci_95']:.1f} days")
Expected Output:
Total samples: 50
Entities: 5
Time range: 0.0-30.0 days
Degradation Trajectories:
E00: slope=0.0834, R²=0.962, p=1.2e-07
E01: slope=0.1123, R²=0.978, p=2.3e-08
E02: slope=0.0645, R²=0.945, p=4.5e-07
E03: slope=0.1456, R²=0.985, p=5.6e-09
E04: slope=0.0978, R²=0.971, p=8.9e-08
Saved: aging_trajectories.png
Shelf-Life Estimates (threshold=2.0):
E00: 12.5 ± 2.3 days
E01: 8.9 ± 1.8 days
E02: 18.4 ± 3.1 days
E03: 6.2 ± 1.4 days
E04: 10.7 ± 2.0 days
✅ Validation & Sanity Checks¶
Success Indicators¶
Trajectory Fits: - ✅ R² > 0.85 for linear fits (degradation well-modeled) - ✅ p-value < 0.05 (significant degradation trend) - ✅ Residuals randomly distributed (no systematic bias)
Shelf-Life Estimates: - ✅ Confidence intervals narrow (CV < 20%) - ✅ Estimates consistent across replicates - ✅ Predictions match historical data (if available)
Chemical Plausibility: - ✅ Degradation rates match literature (e.g., 0.05–0.15 units/day for oil oxidation) - ✅ Faster degradation at higher temperatures (if temperature varied) - ✅ Shelf-life estimates within expected range (weeks to months)
Failure Indicators¶
⚠️ Warning Signs:
- R² < 0.60 for trajectory fits
- Problem: High variability; non-linear degradation; measurement noise
-
Fix: Try spline fits (method='spline'); increase replication; check SNR
-
Negative degradation slopes
- Problem: Degradation value definition inverted; samples improving over time (implausible)
-
Fix: Verify degradation value sign (should increase); check for sample contamination
-
Wide confidence intervals (> 50% of estimate)
- Problem: High within-entity variability; too few time points; poor fit
-
Fix: Increase time points; improve replication; check measurement consistency
-
Shelf-life estimates negative
- Problem: Threshold already exceeded; degradation started before t=0
-
Fix: Adjust threshold; extrapolate backwards if initial quality unknown
-
All entities reach threshold at same time (no variability)
- Problem: Degradation rates artificially constrained; batch effect dominating
- Fix: Check if entities truly independent; verify no systematic error
Quality Thresholds¶
| Metric | Minimum | Good | Excellent |
|---|---|---|---|
| Trajectory R² | 0.70 | 0.85 | 0.95 |
| Trajectory p-value | < 0.05 | < 0.01 | < 0.001 |
| Shelf-Life CI Width | < 40% | < 20% | < 10% |
| Residual Normality (Shapiro p) | > 0.05 | > 0.10 | > 0.20 |
⚙️ Parameters You Must Justify¶
Critical Parameters¶
1. Time Column
- Parameter: time_col (metadata column name)
- No default: Must specify
- Justification: "Time tracked in 'time' column (days since initial measurement)."
2. Entity Column
- Parameter: entity_col (sample/batch identifier)
- No default: Must specify
- Justification: "Independent samples identified by 'entity_id'; each tracked over time."
3. Degradation Value
- Parameter: value_col (quality metric to track)
- No default: Must specify (e.g., ratio, peroxide value, PC1)
- Justification: "Degradation quantified via ratio 1655/1742 (unsaturation/carbonyl), which decreases with oxidation."
4. Trajectory Method
- Parameter: method ('linear', 'spline')
- Default: 'linear'
- When to adjust: Use 'spline' if degradation non-monotonic or accelerates
- Justification: "Linear regression used as degradation kinetics appear first-order over measured time range."
5. Shelf-Life Threshold
- Parameter: threshold (quality limit)
- No default: Must specify based on regulations or QA limits
- Justification: "Threshold set at 2.0 based on industry standard for oil quality (ratio < 2.0 indicates rancidity)."
Quickstart (CLI)
- Aging trajectories and stages:
- foodspec aging input.h5 --value-col degrade --method linear --time-col time --entity-col sample_id --output-dir ./out
- Shelf-life estimates:
- foodspec shelf-life input.h5 --value-col degrade --threshold 2.0 --time-col time --entity-col sample_id --output-dir ./out
Python API
- Build a TimeSpectrumSet from a FoodSpectrumSet:
- ts = TimeSpectrumSet(x=fs.x, wavenumbers=fs.wavenumbers, metadata=fs.metadata, modality=fs.modality, time_col='time', entity_col='sample_id')
- Trajectories:
- from foodspec.workflows.aging import compute_degradation_trajectories
- res = compute_degradation_trajectories(ts, value_col='degrade', method='linear')
- Shelf-life:
- from foodspec.workflows.shelf_life import estimate_remaining_shelf_life
- df = estimate_remaining_shelf_life(ts, value_col='degrade', threshold=2.0)
🔗 See Also¶
Methods - Baseline Correction - Normalization & Smoothing - Feature Extraction - Model Evaluation & Validation - Statistics: Introduction - Statistics: Study Design
Examples - Aging Workflow (Storage Stability) - Heating Quality Monitoring
API - Aging Trajectories & Result Types - Shelf-Life Estimation - Time Metrics (linear slope, acceleration)