Theory – Harmonization & Calibration

Purpose: Understand multi-instrument harmonization, calibration drift correction, and cross-batch transfer.
Audience: Users working with multiple instruments or bridging data across time/batches.
Time to read: 15–20 minutes.
Prerequisites: Familiarity with preprocessing steps and baseline correction.


Why harmonization matters for multi-instrument and multi-batch spectral studies:

- Instrument drift: wavenumber axes can shift over time; calibration curves correct drift and align spectra to a common grid.
- Intensity scaling: laser power and detector response vary; power/area normalization reduces cross-instrument intensity bias.
- Residual variation: diagnostics (pre/post alignment plots, residual metrics) quantify how well instruments agree after harmonization.
- FAIR & reproducibility: storing harmonization parameters and calibration metadata in HDF5/metadata ensures runs can be reproduced and audited.
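The axis-alignment and intensity-scaling points above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper names `align_to_grid` and `area_normalize`, the constant offset `shift_cm1`, and the simulated Gaussian spectra are all assumptions made for illustration. Real drift is often not a constant offset and would need a fitted calibration curve.

```python
import numpy as np

def align_to_grid(wavenumbers, intensities, shift_cm1, common_grid):
    """Subtract a constant axis offset, then resample onto a shared grid.

    Assumes the wavenumber axis is increasing (required by np.interp).
    """
    corrected_axis = np.asarray(wavenumbers, dtype=float) - shift_cm1
    return np.interp(common_grid, corrected_axis, intensities)

def area_normalize(intensities, grid):
    """Scale a spectrum so its trapezoidal area over the grid equals 1."""
    y = np.abs(np.asarray(intensities, dtype=float))
    area = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid))
    if area == 0:
        raise ValueError("spectrum has zero area; cannot normalize")
    return intensities / area

def gaussian(x, center, width=8.0):
    return np.exp(-0.5 * ((x - center) / width) ** 2)

# Shared target grid (hypothetical range and spacing).
common_grid = np.linspace(900.0, 1100.0, 401)

# Simulated data: instrument B reports the same peak 3 cm^-1 too high
# and with twice the intensity of instrument A.
axis_a = np.linspace(880.0, 1120.0, 481)
axis_b = np.linspace(880.0, 1120.0, 481)
spec_a = gaussian(axis_a, 1000.0)
spec_b = 2.0 * gaussian(axis_b, 1003.0)

harm_a = area_normalize(align_to_grid(axis_a, spec_a, 0.0, common_grid), common_grid)
harm_b = area_normalize(align_to_grid(axis_b, spec_b, 3.0, common_grid), common_grid)

# Residual metric: RMS difference between instruments after harmonization.
residual_rms = float(np.sqrt(np.mean((harm_a - harm_b) ** 2)))
print(f"post-harmonization RMS residual: {residual_rms:.2e}")
```

In this idealized case the residual is close to zero because the simulated difference is exactly an offset plus a scale factor. On real data, a residual that stays large after alignment indicates that instrument differences remain.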

For practical steps, see hsi_and_harmonization.md and cookbook_preprocessing.md.


When Results Cannot Be Trusted

⚠️ Red flags for harmonization validity:

  1. Harmonization parameters fit on the same data used for analysis (overfitting)
     - Calibration curves optimized to training data may not generalize
     - Test data from the same batch does not validate cross-instrument transfer
     - Fix: Use held-out data to fit calibration; validate transfer on independent instruments

  2. Wavenumber shift corrected but baseline residuals not checked (after alignment, baseline still tilted)
     - Harmonization is incomplete; residual differences confound analysis
     - Baseline correction parameters must also be harmonized
     - Fix: Visualize aligned spectra; check that raw and baseline-corrected spectra match post-harmonization

  3. Single reference standard used for calibration (all instruments calibrated to one oil, one day)
     - A single point doesn't characterize full instrument behavior
     - Non-linearity or drift goes undetected
     - Fix: Use multiple reference standards (low, medium, high across the relevant range)

  4. Harmonization parameters not validated on new instruments/times
     - Parameters fit on Device A are deployed on Device B, or at a later time, without revalidation
     - Drift changes parameters; old calibration fails
     - Fix: Periodically revalidate harmonization; retrain if test-set performance degrades by more than 5%

  5. Instrument differences masked but not removed (harmonization makes spectra similar, but underlying differences remain)
     - Post-harmonization, residual variation is still attributable to instrument
     - Analysis conclusions are conflated with instrument artifacts
     - Fix: Document residual unexplained variance; test whether instrument explains batch effects

  6. Different preprocessing methods applied pre-harmonization (one instrument baseline-corrected, another not)
     - Preprocessing fundamentally changes spectra; pre-processed spectra can't be harmonized with unprocessed ones
     - Calibration curves don't transfer
     - Fix: Apply identical preprocessing to all data before harmonization

  7. Harmonization assumes linear intensity correction, but non-linearity is present
     - Linear calibration (intensity = a*reference + b) may be oversimplified
     - Residuals show systematic patterns; correction is incomplete
     - Fix: Check residuals; use higher-order or spline-based corrections if non-linearity is detected

  8. Transfer learning (using parameters from one domain to harmonize another) without testing
     - Example: oil-harmonization parameters applied to dairy without validation
     - Different matrices have different intensity/baseline characteristics
     - Fix: Retrain/validate parameters on the target domain; don't assume transfer
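The residual check recommended for linear intensity calibration can be sketched as follows. The example fits the linear form `intensity = a*reference + b` to several reference standards and compares it with a quadratic fit. The reference values, the simulated saturating response, and the 0.5 RMS-ratio threshold are assumptions chosen for illustration, not measured data or a recommended setting.

```python
import numpy as np

# Hypothetical nominal values of six reference standards spanning the range.
reference = np.array([1.0, 2.0, 4.0, 6.0, 8.0, 10.0])

# Simulated instrument response with mild saturation (illustrative only).
measured = 0.9 * reference - 0.02 * reference**2 + 0.1

def fit_and_residuals(x, y, degree):
    """Least-squares polynomial fit; returns coefficients and residuals."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return coeffs, residuals

lin_coeffs, lin_res = fit_and_residuals(reference, measured, 1)
quad_coeffs, quad_res = fit_and_residuals(reference, measured, 2)

lin_rms = float(np.sqrt(np.mean(lin_res**2)))
quad_rms = float(np.sqrt(np.mean(quad_res**2)))

# Systematic pattern: for a saturating response, the linear fit overshoots at
# both ends of the range and undershoots in the middle.
ends_negative = bool(lin_res[0] < 0 and lin_res[-1] < 0)
middle_positive = bool(lin_res[len(lin_res) // 2] > 0)

# Assumed rule of thumb: flag non-linearity when the quadratic fit at least
# halves the residual RMS of the linear fit.
nonlinearity_detected = quad_rms < 0.5 * lin_rms

print(f"linear RMS={lin_rms:.3f}, quadratic RMS={quad_rms:.2e}")
print(f"non-linearity detected: {nonlinearity_detected}")
```

With only one reference standard, neither fit is possible and the curvature would go unnoticed, which is why the list above recommends standards at low, medium, and high levels. On noisy data, the comparison should be made on held-out standards so that the higher-order fit is not rewarded for fitting noise.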

Next Steps