Features API

Feature extraction and peak analysis functions for spectral data.

The foodspec.features module provides tools for extracting meaningful features from spectra, including peak detection, band integration, and ratio-based quality metrics.

Peak Detection

detect_peaks

Detect spectral peaks, optionally filtered by prominence and width, and return their properties.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| x | ndarray | 1D intensity array. | required |
| wavenumbers | ndarray | 1D axis array aligned with x. | required |
| prominence | float | Minimum prominence passed to scipy.signal.find_peaks. | 0.0 |
| width | Optional[float] | Optional width parameter for find_peaks. | None |

Returns:

| Type | Description |
|---|---|
| DataFrame | A DataFrame with columns: peak_index, peak_wavenumber, peak_intensity, prominence, width. |

Raises:

| Type | Description |
|---|---|
| ValueError | If x and wavenumbers are not 1D or have mismatched lengths. |

Examples:

>>> import numpy as np
>>> from foodspec.features.peaks import detect_peaks
>>> x = np.array([0, 1, 2, 1, 0, 3, 0])
>>> wn = np.linspace(1000, 1600, 7)
>>> peaks_df = detect_peaks(x, wn, prominence=0.5)
>>> peaks_df.shape[0] > 0
True
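The behaviour described above can be approximated directly with SciPy. The sketch below is not the foodspec implementation: it assumes the function wraps scipy.signal.find_peaks and measures widths with scipy.signal.peak_widths at half prominence. The helper name sketch_detect_peaks is hypothetical, and it returns a plain dict of arrays rather than a DataFrame.

```python
import numpy as np
from scipy.signal import find_peaks, peak_widths


def sketch_detect_peaks(x, wavenumbers, prominence=0.0, width=None):
    """Hypothetical stand-in for detect_peaks; returns a dict of arrays."""
    x = np.asarray(x, dtype=float)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    # Mirror the documented ValueError conditions.
    if x.ndim != 1 or wavenumbers.ndim != 1 or x.shape != wavenumbers.shape:
        raise ValueError("x and wavenumbers must be 1D arrays of equal length")

    # find_peaks reports prominences whenever `prominence` is given.
    idx, props = find_peaks(x, prominence=prominence, width=width)
    # Widths (in samples, at half prominence) are measured separately so the
    # entry exists even when `width` is None (an assumption of this sketch).
    widths = peak_widths(x, idx, rel_height=0.5)[0]

    return {
        "peak_index": idx,
        "peak_wavenumber": wavenumbers[idx],
        "peak_intensity": x[idx],
        "prominence": props["prominences"],
        "width": widths,
    }


x = np.array([0, 1, 2, 1, 0, 3, 0])
wn = np.linspace(1000, 1600, 7)
peaks = sketch_detect_peaks(x, wn, prominence=0.5)
print(peaks["peak_index"], peaks["peak_wavenumber"])
```

Note that widths computed this way are expressed in samples, not in wavenumber units; whether the library converts them is not stated in this reference.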

Band Integration

compute_band_features

Compute band-level features (integral, mean, max, slope) from spectral bands.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | 2D array of spectra (samples × wavenumbers). | required |
| wavenumbers | ndarray | 1D array of wavenumber values matching X columns. | required |
| bands | Sequence[Tuple[str, float, float]] | Sequence of (label, min_wn, max_wn) tuples defining bands. | required |
| metrics | Iterable[str] | Feature types to compute per band ("integral", "mean", "max", "slope"). | ('integral',) |

Returns:

| Type | Description |
|---|---|
| DataFrame | DataFrame with one row per sample and columns for each band × metric. |

Raises:

| Type | Description |
|---|---|
| ValueError | If X is not 2D or wavenumbers shape mismatches X columns. |
| ValueError | If any band has invalid range (min_wn >= max_wn). |
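To make the four metrics concrete, here is a minimal NumPy sketch of how they could be computed per band. It is an illustration only, not the library's code: the helper name sketch_band_features and the "label_metric" column naming are hypothetical, band limits are treated as inclusive, wavenumbers are assumed to be ascending, and every band is assumed to contain at least two points.

```python
import numpy as np


def sketch_band_features(X, wavenumbers, bands, metrics=("integral",)):
    """Hypothetical stand-in for compute_band_features; returns a dict of columns."""
    X = np.asarray(X, dtype=float)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    if X.ndim != 2 or wavenumbers.shape != (X.shape[1],):
        raise ValueError("X must be 2D and wavenumbers must match its columns")

    features = {}
    for label, min_wn, max_wn in bands:
        if min_wn >= max_wn:
            raise ValueError(f"invalid band range for {label!r}")
        mask = (wavenumbers >= min_wn) & (wavenumbers <= max_wn)
        wn_b, X_b = wavenumbers[mask], X[:, mask]
        for metric in metrics:
            if metric == "integral":
                # Trapezoidal rule along the wavenumber axis.
                value = np.sum(0.5 * (X_b[:, 1:] + X_b[:, :-1]) * np.diff(wn_b), axis=1)
            elif metric == "mean":
                value = X_b.mean(axis=1)
            elif metric == "max":
                value = X_b.max(axis=1)
            elif metric == "slope":
                # Least-squares slope of intensity against wavenumber.
                centered = wn_b - wn_b.mean()
                value = X_b @ centered / (centered @ centered)
            else:
                raise ValueError(f"unknown metric {metric!r}")
            # Column naming is an assumption, not the library's convention.
            features[f"{label}_{metric}"] = value
    return features


X = np.array([[1.0, 1.0, 1.0, 1.0], [0.0, 1.0, 2.0, 3.0]])
wn = np.array([1000.0, 1010.0, 1020.0, 1030.0])
feats = sketch_band_features(
    X, wn, [("band_a", 1000, 1020)], metrics=("integral", "mean", "max", "slope")
)
print(feats)
```

With the default metrics=("integral",), the same sketch reduces to plain band integration, one value per sample and band.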

integrate_bands

Backwards-compatible legacy wrapper that computes band integrals only.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | 2D array of spectra (samples × wavenumbers). | required |
| wavenumbers | ndarray | 1D array of wavenumber values matching X columns. | required |
| bands | Sequence[Tuple[str, float, float]] | Sequence of (label, min_wn, max_wn) tuples defining bands. | required |

Returns:

| Type | Description |
|---|---|
| DataFrame | DataFrame with one row per sample and one column per band (integral only). |

Similarity Metrics

cosine_similarity_matrix

Compute the pairwise cosine similarity matrix between reference and query spectra.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X_ref | ndarray | Reference spectra (n_ref, n_features). | required |
| X_query | ndarray | Query spectra (n_query, n_features). | required |

Returns:

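| Type | Description |
|---|---|
| ndarray | Similarity matrix (n_ref, n_query). Values lie in [0, 1] for non-negative spectra; cosine similarity can in general reach -1 when intensities take both signs (for example after mean-centering or derivative preprocessing). |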

Examples:

>>> import numpy as np
>>> from foodspec.features.fingerprint import cosine_similarity_matrix
>>> X_ref = np.array([[1, 2, 3], [4, 5, 6]])
>>> X_query = np.array([[1, 2, 3]])
>>> sim = cosine_similarity_matrix(X_ref, X_query)
>>> sim.shape
(2, 1)
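For reference, cosine similarity between two spectra is their dot product divided by the product of their Euclidean norms. The sketch below computes the full (n_ref, n_query) matrix with NumPy; it is an independent illustration, not the foodspec source, and the helper name and the eps guard against all-zero spectra are assumptions.

```python
import numpy as np


def sketch_cosine_similarity(X_ref, X_query, eps=1e-12):
    """Hypothetical equivalent of cosine_similarity_matrix on raw arrays."""
    X_ref = np.asarray(X_ref, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Scale every spectrum to unit Euclidean length; eps guards all-zero rows.
    ref_unit = X_ref / np.maximum(np.linalg.norm(X_ref, axis=1, keepdims=True), eps)
    query_unit = X_query / np.maximum(
        np.linalg.norm(X_query, axis=1, keepdims=True), eps
    )
    # Dot products of unit vectors are the cosine similarities.
    return ref_unit @ query_unit.T


X_ref = np.array([[1, 2, 3], [4, 5, 6]])
X_query = np.array([[1, 2, 3]])
sim = sketch_cosine_similarity(X_ref, X_query)
print(sim.shape)
```

A spectrum compared with itself scores 1, and scaling a spectrum by a positive constant does not change the score, which is why this measure is insensitive to overall intensity.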

correlation_similarity_matrix

Compute the pairwise Pearson correlation similarity matrix between reference and query spectra.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| X_ref | ndarray | Reference spectra (n_ref, n_features). | required |
| X_query | ndarray | Query spectra (n_query, n_features). | required |

Returns:

| Type | Description |
|---|---|
| ndarray | Correlation matrix (n_ref, n_query) with values in [-1, 1]. |

Examples:

>>> import numpy as np
>>> from foodspec.features.fingerprint import correlation_similarity_matrix
>>> X_ref = np.array([[1, 2, 3], [4, 5, 6]])
>>> X_query = np.array([[2, 3, 4]])
>>> corr = correlation_similarity_matrix(X_ref, X_query)
>>> corr.shape
(2, 1)
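Pearson correlation between two spectra equals the cosine similarity of the spectra after each has been mean-centered. The NumPy sketch below uses that identity; it is an illustration under that assumption rather than the library's implementation, and the helper names and eps guard for constant spectra are hypothetical.

```python
import numpy as np


def sketch_correlation_similarity(X_ref, X_query, eps=1e-12):
    """Hypothetical equivalent of correlation_similarity_matrix on raw arrays."""

    def center_and_scale(X):
        X = np.asarray(X, dtype=float)
        # Remove each spectrum's mean, then scale it to unit length.
        X = X - X.mean(axis=1, keepdims=True)
        return X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), eps)

    return center_and_scale(X_ref) @ center_and_scale(X_query).T


X_ref = np.array([[1, 2, 3], [3, 2, 1]])
X_query = np.array([[2, 3, 4]])
corr = sketch_correlation_similarity(X_ref, X_query)
print(corr)
```

Because of the centering step, a constant offset added to a spectrum leaves its correlation unchanged, and a spectrum with the opposite trend scores -1.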

Ratio Quality (RQ) Engine

RatioQualityEngine

Automated workflow for peak-ratio-based quality assessment. Computes ratio quality metrics (stability, discrimination, heating trends, oil-vs-chips).

The engine expects a tidy DataFrame with metadata columns (oil/matrix/heating) and intensity columns referenced by PeakDefinition/RatioDefinition.

compare_oil_vs_chips

compare_oil_vs_chips(df)

Compare stability and heating trends between matrix types (oil vs chips). Delegates to features.rq.matrix.compare_oil_vs_chips.

generate_text_report

generate_text_report(
    stability,
    discrim,
    heating,
    oil_vs_chips,
    feat_importance,
    norm_comp,
    minimal_panel,
    clustering_metrics,
    warnings,
    context=None,
    top_k=5,
)

Generate a text report for RQ results. Delegates to features.rq.report.generate_text_report.

RQConfig

Configuration for ratio quality analysis: column naming conventions and options.

PeakDefinition

Defines a spectral peak for ratio analysis as a named peak/intensity column.

RatioDefinition

Defines a peak ratio for quality metrics: a numerator / denominator ratio built from PeakDefinition names or raw columns.
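To illustrate the data layout these definitions refer to, the sketch below builds a small tidy table with plain pandas and computes one ratio from two intensity columns. It does not use the foodspec classes, whose constructor signatures are not shown in this reference. All column names (oil_type, matrix, heating_stage, I_1655, I_1440) and values are hypothetical, and the coefficient of variation is only one plausible stability measure, not necessarily the one the engine reports.

```python
import pandas as pd

# Hypothetical tidy table: one row per spectrum, metadata plus peak intensities.
df = pd.DataFrame(
    {
        "oil_type": ["olive", "olive", "sunflower", "sunflower"],
        "matrix": ["oil", "oil", "oil", "oil"],
        "heating_stage": [0, 1, 0, 1],
        "I_1655": [2.0, 4.0, 3.0, 3.0],
        "I_1440": [1.0, 2.0, 1.0, 3.0],
    }
)

# A ratio is the numerator column divided by the denominator column.
df["ratio_1655_1440"] = df["I_1655"] / df["I_1440"]

# One simple stability measure: coefficient of variation within each oil type.
grouped = df.groupby("oil_type")["ratio_1655_1440"]
cv = grouped.std() / grouped.mean()
print(cv)
```

A ratio that stays constant within a group (olive in this toy table) has a coefficient of variation of zero, while a ratio that drifts between heating stages gives a larger value.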

Find nearest neighbors in a spectral library by running a similarity search between query and library datasets.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| query_ds | FoodSpectrumSet | Query dataset containing spectra to match. | required |
| library_ds | FoodSpectrumSet | Library dataset to search against. | required |
| metric | Literal['euclidean', 'cosine', 'pearson', 'sid', 'sam'] | Distance metric ("euclidean", "cosine", "pearson", "sid", "sam"). | 'cosine' |
| top_k | int | Number of nearest neighbors to return per query. | 5 |

Returns:

| Type | Description |
|---|---|
| DataFrame | DataFrame with search results including query_id, library_index, distance, rank, and library metadata. |
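The search described above amounts to computing a query-by-library distance matrix and keeping the top_k smallest entries per query. The NumPy sketch below shows this for the cosine metric only. It works on raw arrays instead of FoodSpectrumSet objects, and the helper name, the use of the row index as query_id, ranks starting at 1, and the absence of all-zero spectra are all assumptions of this illustration.

```python
import numpy as np


def sketch_cosine_search(query, library, top_k=5):
    """Hypothetical top-k search on raw arrays using cosine distance."""
    query = np.asarray(query, dtype=float)
    library = np.asarray(library, dtype=float)
    q_unit = query / np.linalg.norm(query, axis=1, keepdims=True)
    lib_unit = library / np.linalg.norm(library, axis=1, keepdims=True)
    # Cosine distance = 1 - cosine similarity; shape (n_query, n_library).
    distance = 1.0 - q_unit @ lib_unit.T

    # Indices of the top_k smallest distances per query, closest first.
    order = np.argsort(distance, axis=1)[:, :top_k]
    results = []
    for query_id, neighbours in enumerate(order):
        for rank, library_index in enumerate(neighbours, start=1):
            results.append(
                {
                    "query_id": query_id,
                    "library_index": int(library_index),
                    "distance": float(distance[query_id, library_index]),
                    "rank": rank,
                }
            )
    return results


library = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([[1.0, 0.1]])
hits = sketch_cosine_search(query, library, top_k=2)
for hit in hits:
    print(hit)
```

The list of records maps directly onto the documented result columns; library metadata would be joined on library_index.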

See Also