IO & Data Loading API
Functions for loading, saving, and converting spectral data across various formats.
The foodspec.io module handles data import/export with support for CSV, HDF5, JCAMP-DX, and vendor-specific formats (Bruker OPUS, SPC).
Primary Functions
load_folder
Load spectra from a folder of text files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `folder` | `PathLike` | Directory containing spectra files. | required |
| `pattern` | `str` | Glob pattern for spectra files. | `'*.txt'` |
| `modality` | `str` | Spectroscopy modality label. | `'raman'` |
| `metadata_csv` | `Optional[PathLike]` | Optional CSV with a … | `None` |
| `wavenumber_column` | `int` | Column index for wavenumbers in the spectra files. | `0` |
| `intensity_columns` | `Optional[Sequence[int]]` | Optional indices for intensity columns. If multiple are provided, their mean is taken. When omitted, all columns except … | `None` |

Returns:
| Type | Description |
|---|---|
| `FoodSpectrumSet` | A … |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If no files match the pattern or files are malformed. |

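
The parameter semantics above can be illustrated with a small standard-library sketch. It is not foodspec's implementation: `load_folder_sketch` is a hypothetical name, it returns a plain dict rather than a `FoodSpectrumSet`, and it assumes that omitting `intensity_columns` means "every column other than the wavenumber column".

```python
from pathlib import Path
from statistics import fmean


def load_folder_sketch(folder, pattern="*.txt", wavenumber_column=0,
                       intensity_columns=None):
    """Illustrative stand-in for the documented behavior (not foodspec code)."""
    files = sorted(Path(folder).glob(pattern))
    if not files:
        raise ValueError(f"No files matching {pattern!r} in {folder}")

    spectra = {}
    for file in files:
        wavenumbers, intensities = [], []
        for line in file.read_text().splitlines():
            if not line.strip():
                continue  # skip blank lines
            # float() raises ValueError on malformed (non-numeric) fields.
            values = [float(v) for v in line.replace(",", " ").split()]
            # Assumption: with no explicit selection, use every column
            # other than the wavenumber column.
            cols = intensity_columns or [
                i for i in range(len(values)) if i != wavenumber_column
            ]
            wavenumbers.append(values[wavenumber_column])
            # Several intensity columns are reduced to their mean.
            intensities.append(fmean([values[i] for i in cols]))
        spectra[file.stem] = (wavenumbers, intensities)
    return spectra
```

Each entry maps a file stem to a `(wavenumbers, intensities)` pair, so a three-column file with `intensity_columns` omitted yields the row-wise mean of its two intensity columns.
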
read_spectra
Auto-detect format and read spectra.
Read spectra from multiple possible formats into FoodSpectrumSet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| PathLike` | File or folder path. | required |
| `format` | `str \| None` | Optional override for the detected format. One of "csv", "folder_csv", "jcamp", "spc", "opus". | `None` |
| `**kwargs` | `Any` | Extra keyword arguments forwarded to the underlying loader. | `{}` |

Returns:
| Type | Description |
|---|---|
| `FoodSpectrumSet` | A … |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If the format is unsupported or cannot be inferred. |

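
The detect-then-dispatch flow described above might look roughly like the following sketch. Everything here is an assumption made for illustration: the suffix mapping, the rule that directories map to "folder_csv", and the stub loaders (which stand in for the real readers so the example runs without foodspec installed).

```python
from pathlib import Path

# Assumed suffix-to-key mapping. The keys appear in this reference; the
# suffixes are illustrative guesses (real OPUS files, for instance, often
# carry numeric suffixes such as .0).
SUFFIX_TO_FORMAT = {
    ".csv": "csv",
    ".jdx": "jcamp",
    ".dx": "jcamp",
    ".spc": "spc",
    ".txt": "txt",
}


def _stub_loader(key):
    """Hypothetical loader that just echoes what it was called with."""
    def load(path, **kwargs):
        return key, str(path), kwargs
    return load


LOADERS = {k: _stub_loader(k) for k in ("csv", "folder_csv", "jcamp", "spc", "opus")}


def detect_format_sketch(path):
    p = Path(path)
    if p.is_dir():
        return "folder_csv"  # assumption: directories map to this key
    return SUFFIX_TO_FORMAT.get(p.suffix.lower(), "unknown")


def read_spectra_sketch(path, format=None, **kwargs):
    # An explicit format overrides detection.
    key = format or detect_format_sketch(path)
    if key not in LOADERS:
        raise ValueError(f"Unsupported or undetected format: {key!r}")
    # Extra keyword arguments pass straight through to the chosen loader.
    return LOADERS[key](path, **kwargs)
```

Passing `format="opus"` for a file with an unrecognized suffix skips detection entirely, which is the role the `format` override plays in the table above.
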
detect_format
Identify file format by inspection.
Detect input format based on path.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| PathLike` | File or directory path. | required |

Returns:
| Type | Description |
|---|---|
| `str` | A short string key such as "csv", "folder_csv", "jcamp", "spc", "opus", "txt", or "unknown". |

CSV & Text Formats
load_csv_spectra
Load spectra from CSV files (wide or long format).
Load spectra from a CSV file into a FoodSpectrumSet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `csv_path` | `str \| Path` | Path to the CSV file. | required |
| `format` | `str` | "wide" for one row per wavenumber and one column per spectrum, or "long" for one row per … | `'wide'` |
| `wavenumber_column` | `str` | Name of the wavenumber column (both formats). | `'wavenumber'` |
| `intensity_columns` | `Optional[Iterable[str]]` | For "wide" format, which columns contain intensities. If … | `None` |
| `sample_id_column` | `str` | For "long" format, column giving sample identifiers. | `'sample_id'` |
| `intensity_column` | `str` | For "long" format, column giving intensity values. | `'intensity'` |
| `label_column` | `Optional[str]` | Optional column name to copy into metadata (e.g., label). | `None` |
| `modality` | `str` | Spectroscopy modality (e.g., "raman", "ftir"). | `'raman'` |

Returns:
| Type | Description |
|---|---|
| `FoodSpectrumSet` | A … |

Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the CSV file does not exist. |
| `ValueError` | If required columns are missing or the format is invalid. |

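
To make the two layouts concrete, here is a standard-library sketch that parses the same data from each. The function names are hypothetical and the code is not foodspec's parser; the long layout is assumed to hold one row per (sample, wavenumber) pair, with every sample listing its wavenumbers in the same order.

```python
import csv
import io


def parse_wide(text, wavenumber_column="wavenumber"):
    """Wide layout: one row per wavenumber, one column per spectrum."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        raise ValueError("CSV contains no data rows")
    wavenumbers = [float(r[wavenumber_column]) for r in rows]
    names = [c for c in rows[0] if c != wavenumber_column]
    spectra = {n: [float(r[n]) for r in rows] for n in names}
    return wavenumbers, spectra


def parse_long(text, wavenumber_column="wavenumber",
               sample_id_column="sample_id", intensity_column="intensity"):
    """Long layout (assumed): one row per (sample, wavenumber) pair."""
    wavenumbers, spectra = [], {}
    for r in csv.DictReader(io.StringIO(text)):
        wn = float(r[wavenumber_column])
        if wn not in wavenumbers:
            wavenumbers.append(wn)  # keep first-seen order of the axis
        spectra.setdefault(r[sample_id_column], []).append(
            float(r[intensity_column])
        )
    return wavenumbers, spectra
```

Both functions return the same `(wavenumbers, {sample: intensities})` structure, so a wide file with columns `wavenumber,s1,s2` and the equivalent long file with columns `sample_id,wavenumber,intensity` parse to equal results.
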
read_jcamp
Read JCAMP-DX spectroscopy files.
Read a JCAMP-DX file (.jdx, .dx) into FoodSpectrumSet.
Minimal parser that extracts numeric pairs (wavenumber, intensity) while
ignoring header tags.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to a JCAMP-DX file. | required |
| `modality` | `str` | Spectroscopy modality label for the dataset. | `'raman'` |

Returns:
| Type | Description |
|---|---|
| `FoodSpectrumSet` | A … |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If no spectral data is found in the file. |

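
A "minimal parser" in the sense described above could be sketched as follows. This is an illustration under assumptions, not the library's code: `parse_jcamp_pairs` is a hypothetical name, it treats any line starting with `##` as a header tag, and it reads only the first two numbers on each data line, so compressed or multi-value JCAMP-DX data lines are not handled.

```python
def parse_jcamp_pairs(text):
    """Collect (wavenumber, intensity) pairs, ignoring '##' header tags."""
    wavenumbers, intensities = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("##"):
            continue  # header/label lines such as ##TITLE= or ##END=
        parts = line.replace(",", " ").split()
        if len(parts) < 2:
            continue
        try:
            x, y = float(parts[0]), float(parts[1])
        except ValueError:
            continue  # not a numeric data line
        wavenumbers.append(x)
        intensities.append(y)
    if not wavenumbers:
        raise ValueError("No spectral data found")
    return wavenumbers, intensities
```

A file containing only header tags falls through to the final check and raises `ValueError`, mirroring the documented error condition.
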
Vendor Formats
read_opus
Read Bruker OPUS files (requires optional dependency).
Read a Bruker OPUS file into FoodSpectrumSet.
Uses the optional brukeropusreader package when available; raises an
informative ImportError otherwise.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to the OPUS file. | required |
| `modality` | `str` | Spectroscopy modality label. | `'ftir'` |

Returns:
| Type | Description |
|---|---|
| `FoodSpectrumSet` | A … |

Raises:
| Type | Description |
|---|---|
| `ImportError` | If the optional brukeropusreader package is not installed. |

read_spc
Read Thermo Galactic SPC files (requires optional dependency).
Read an SPC file into FoodSpectrumSet.
Attempts to import known SPC readers (e.g., "spc" or "spc_io"). If none
are available, an informative ImportError is raised.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to the SPC file. | required |
| `modality` | `str` | Spectroscopy modality label. | `'raman'` |

Returns:
| Type | Description |
|---|---|
| `FoodSpectrumSet` | A … |

Raises:
| Type | Description |
|---|---|
| `ImportError` | If an SPC reader dependency is not installed. |

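
The "try known readers, otherwise raise an informative ImportError" behavior is a common optional-dependency pattern. One way it might be written is sketched below; `import_first_available` is a hypothetical helper, not part of foodspec, and the candidate names in the usage note are the ones quoted above.

```python
import importlib


def import_first_available(candidates, purpose):
    """Return the first module in `candidates` that can be imported."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue  # try the next candidate
    raise ImportError(
        f"{purpose} requires one of: {', '.join(candidates)}. "
        "Install one of them and retry."
    )
```

A call such as `import_first_available(["spc", "spc_io"], "Reading SPC files")` would return whichever reader is installed, or fail with a message naming the missing packages. The same shape would cover a single optional package such as brukeropusreader.
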
Export Functions
to_hdf5
Save dataset to HDF5 format.
Persist spectra to an HDF5 file.
Stores datasets x, wavenumbers, and metadata_json (serialized via
DataFrame.to_json), plus the modality attribute.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `spectra` | `FoodSpectrumSet` | Dataset to save. | required |
| `path` | `PathLike` | Target HDF5 file path. | required |

Raises:
| Type | Description |
|---|---|
| `ImportError` | If … |

to_tidy_csv
Export spectra to a tidy (long-format) CSV file.
Produces columns sample_id, all metadata fields, wavenumber, and
intensity.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `spectra` | `FoodSpectrumSet` | Dataset to export. | required |
| `path` | `PathLike` | Output file path where the CSV will be written. | required |

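
The tidy layout described above (sample_id, all metadata fields, wavenumber, intensity) can be sketched with the standard library. `write_tidy_csv` is a hypothetical helper that takes plain lists and dicts instead of a `FoodSpectrumSet`; it shows the column order and row expansion only, not how foodspec writes the file.

```python
import csv
import io


def write_tidy_csv(handle, sample_ids, metadata, wavenumbers, intensities):
    """Write one row per (sample, wavenumber) pair.

    metadata: list of dicts, one per sample, all with the same keys.
    intensities: list of lists with shape (n_samples, n_wavenumbers).
    """
    meta_fields = list(metadata[0]) if metadata else []
    writer = csv.writer(handle)
    writer.writerow(["sample_id", *meta_fields, "wavenumber", "intensity"])
    for sid, meta, row in zip(sample_ids, metadata, intensities):
        for wn, value in zip(wavenumbers, row):
            writer.writerow([sid, *(meta[f] for f in meta_fields), wn, value])


# Usage: one sample with a single metadata field and two wavenumbers.
buffer = io.StringIO()
write_tidy_csv(buffer, ["s1"], [{"label": "olive"}], [1000, 1001], [[0.5, 0.7]])
```

Each sample contributes as many rows as there are wavenumbers, with its metadata values repeated on every row.
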
See Also
- Core Module - Data structures for loaded spectra
- Vendor Formats Guide - Instrument file format details
- Examples - Loading examples