Features API¶
Feature extraction and peak analysis functions for spectral data.
The foodspec.features module provides tools for extracting meaningful features from spectra, including peak detection, band integration, and ratio-based quality metrics.
Peak Detection¶
detect_peaks¶
Detect spectral peaks with prominence and width filtering, and return their properties.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `ndarray` | 1D intensity array. | required |
| `wavenumbers` | `ndarray` | 1D axis array aligned with `x`. | required |
| `prominence` | `float` | Minimum prominence passed to the underlying peak-finding routine. | `0.0` |
| `width` | `Optional[float]` | Optional width parameter for the underlying peak-finding routine. | `None` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | A DataFrame of detected peaks and their properties. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If … |
Examples:
>>> import numpy as np
>>> from foodspec.features.peaks import detect_peaks
>>> x = np.array([0, 1, 2, 1, 0, 3, 0])
>>> wn = np.linspace(1000, 1600, 7)
>>> peaks_df = detect_peaks(x, wn, prominence=0.5)
>>> peaks_df.shape[0] > 0
True
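To make the role of `prominence` concrete, the following is a minimal NumPy sketch of peak picking on the same toy spectrum as the example above. It is not the `detect_peaks` implementation: `local_maxima` and `prominence` are hypothetical helpers, and the prominence rule used here (peak height minus the higher of the two flanking minima) is the usual topographic definition, assumed rather than confirmed for this library.

```python
import numpy as np


def local_maxima(x, wavenumbers):
    """Hypothetical helper: indices of strict interior local maxima."""
    x = np.asarray(x, dtype=float)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    if x.ndim != 1 or x.shape != wavenumbers.shape:
        raise ValueError("x and wavenumbers must be 1D arrays of equal length")
    # A sample is a peak when it is strictly higher than both neighbours.
    is_peak = (x[1:-1] > x[:-2]) & (x[1:-1] > x[2:])
    return np.flatnonzero(is_peak) + 1  # +1 maps back to positions in x


def prominence(x, i):
    """Hypothetical helper: topographic prominence of the peak at index i."""
    # Walk outwards on each side until a strictly higher sample or the edge.
    left = i
    while left > 0 and x[left - 1] <= x[i]:
        left -= 1
    right = i
    while right < len(x) - 1 and x[right + 1] <= x[i]:
        right += 1
    # The higher of the two flanking minima is the reference base.
    base = max(x[left:i + 1].min(), x[i:right + 1].min())
    return float(x[i] - base)


x = np.array([0, 1, 2, 1, 0, 3, 0], dtype=float)
wn = np.linspace(1000, 1600, 7)

idx = local_maxima(x, wn)
prominences = np.array([prominence(x, i) for i in idx])
keep = prominences >= 0.5  # same threshold as the example above
peak_wn = wn[idx[keep]]    # peak positions on the wavenumber axis
```

Both maxima of the toy spectrum pass the 0.5 threshold; raising the threshold above 2 would keep only the taller peak.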
Band Integration¶
compute_band_features¶
Compute band-level features (integral, mean, max, slope) from spectral bands.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | 2D array of spectra (samples × wavenumbers). | required |
| `wavenumbers` | `ndarray` | 1D array of wavenumber values matching X columns. | required |
| `bands` | `Sequence[Tuple[str, float, float]]` | Sequence of (label, min_wn, max_wn) tuples defining bands. | required |
| `metrics` | `Iterable[str]` | Feature types to compute per band ("integral", "mean", "max", "slope"). | `('integral',)` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with one row per sample and columns for each band × metric. |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If X is not 2D or wavenumbers shape mismatches X columns. |
| `ValueError` | If any band has invalid range (min_wn >= max_wn). |
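The four metrics can be illustrated for a single band with plain NumPy. This is a sketch under stated assumptions, not the library code: `band_metrics` is a hypothetical helper, band edges are treated as inclusive, the integral uses the trapezoidal rule, and the slope is a least-squares fit of intensity against wavenumber. The actual `compute_band_features` may differ on any of these points.

```python
import numpy as np


def band_metrics(X, wavenumbers, min_wn, max_wn):
    """Hypothetical helper: integral, mean, max and slope for one band."""
    X = np.asarray(X, dtype=float)
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    if X.ndim != 2 or wavenumbers.shape != (X.shape[1],):
        raise ValueError("X must be 2D with one column per wavenumber")
    if min_wn >= max_wn:
        raise ValueError("band range must satisfy min_wn < max_wn")

    # Assumption: both band edges are inclusive.
    mask = (wavenumbers >= min_wn) & (wavenumbers <= max_wn)
    wn = wavenumbers[mask]
    Y = X[:, mask]
    if wn.size < 2:
        raise ValueError("band must cover at least two points")

    # Trapezoidal rule written out explicitly, one value per sample.
    integral = np.sum((Y[:, 1:] + Y[:, :-1]) * np.diff(wn) / 2.0, axis=1)

    # Least-squares slope of intensity against wavenumber, per sample.
    wn_centered = wn - wn.mean()
    Y_centered = Y - Y.mean(axis=1, keepdims=True)
    slope = Y_centered @ wn_centered / np.sum(wn_centered ** 2)

    return {
        "integral": integral,
        "mean": Y.mean(axis=1),
        "max": Y.max(axis=1),
        "slope": slope,
    }


wavenumbers = np.array([1000.0, 1100.0, 1200.0, 1300.0, 1400.0])
X = np.array([
    [0.0, 1.0, 2.0, 3.0, 4.0],  # linearly rising spectrum
    [2.0, 2.0, 2.0, 2.0, 2.0],  # flat spectrum
])
features = band_metrics(X, wavenumbers, 1100.0, 1300.0)
```

The two toy spectra have the same band integral and mean but different `max` and `slope`, which is why requesting more than the default `"integral"` metric can help separate samples.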
integrate_bands¶
Backwards-compatible legacy wrapper that returns band integrals only.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X` | `ndarray` | 2D array of spectra (samples × wavenumbers). | required |
| `wavenumbers` | `ndarray` | 1D array of wavenumber values matching X columns. | required |
| `bands` | `Sequence[Tuple[str, float, float]]` | Sequence of (label, min_wn, max_wn) tuples defining bands. | required |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with one row per sample and one column per band (integral only). |
Similarity Metrics¶
cosine_similarity_matrix¶
Compute the pairwise cosine similarity matrix between reference and query spectra.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X_ref` | `ndarray` | Reference spectra (n_ref, n_features). | required |
| `X_query` | `ndarray` | Query spectra (n_query, n_features). | required |
Returns:
| Type | Description |
|---|---|
| `ndarray` | Similarity matrix (n_ref, n_query). Values lie in [0, 1] when all intensities are non-negative; in general, cosine similarity ranges over [-1, 1]. |
Examples:
>>> import numpy as np
>>> from foodspec.features.fingerprint import cosine_similarity_matrix
>>> X_ref = np.array([[1, 2, 3], [4, 5, 6]])
>>> X_query = np.array([[1, 2, 3]])
>>> sim = cosine_similarity_matrix(X_ref, X_query)
>>> sim.shape
(2, 1)
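For reference, the quantity itself can be written in a few lines of NumPy. This is a sketch of the standard cosine similarity formula, not the library implementation; `cosine_matrix` is a hypothetical name, and zero-norm spectra are not handled (they would produce NaN).

```python
import numpy as np


def cosine_matrix(X_ref, X_query):
    """Hypothetical helper: cosine similarity, shape (n_ref, n_query)."""
    X_ref = np.asarray(X_ref, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Scale every spectrum to unit Euclidean length (zero rows give NaN).
    ref_unit = X_ref / np.linalg.norm(X_ref, axis=1, keepdims=True)
    query_unit = X_query / np.linalg.norm(X_query, axis=1, keepdims=True)
    # Dot products of unit vectors are the cosines of the angles between them.
    return ref_unit @ query_unit.T


sim = cosine_matrix([[1, 2, 3], [4, 5, 6]], [[1, 2, 3]])

# Spectra with negative values (for example after baseline subtraction or
# differentiation) can yield negative similarities.
sim_negative = cosine_matrix([[1, 0]], [[-1, 0]])
```

Here `sim[0, 0]` is 1 because the first reference equals the query, and `sim_negative[0, 0]` is -1 because the two vectors point in opposite directions.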
correlation_similarity_matrix¶
Compute the pairwise Pearson correlation similarity matrix between reference and query spectra.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `X_ref` | `ndarray` | Reference spectra (n_ref, n_features). | required |
| `X_query` | `ndarray` | Query spectra (n_query, n_features). | required |
Returns:
| Type | Description |
|---|---|
| `ndarray` | Correlation matrix (n_ref, n_query) with values in [-1, 1]. |
Examples:
>>> import numpy as np
>>> from foodspec.features.fingerprint import correlation_similarity_matrix
>>> X_ref = np.array([[1, 2, 3], [4, 5, 6]])
>>> X_query = np.array([[2, 3, 4]])
>>> corr = correlation_similarity_matrix(X_ref, X_query)
>>> corr.shape
(2, 1)
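Pearson correlation is cosine similarity applied to mean-centered spectra, which makes it insensitive to a constant offset as well as to scale. The sketch below shows this with NumPy; `pearson_matrix` is a hypothetical name rather than part of the library, and constant spectra (zero variance) are not handled.

```python
import numpy as np


def pearson_matrix(X_ref, X_query):
    """Hypothetical helper: Pearson correlation, shape (n_ref, n_query)."""
    X_ref = np.asarray(X_ref, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    # Remove each spectrum's own mean, then normalise to unit length.
    ref_c = X_ref - X_ref.mean(axis=1, keepdims=True)
    query_c = X_query - X_query.mean(axis=1, keepdims=True)
    ref_unit = ref_c / np.linalg.norm(ref_c, axis=1, keepdims=True)
    query_unit = query_c / np.linalg.norm(query_c, axis=1, keepdims=True)
    return ref_unit @ query_unit.T


# Same inputs as the example above: all three spectra are shifted copies
# of one another, so every correlation is 1.
corr = pearson_matrix([[1, 2, 3], [4, 5, 6]], [[2, 3, 4]])

# A reversed spectrum is perfectly anti-correlated.
corr_reversed = pearson_matrix([[1, 2, 3]], [[3, 2, 1]])
```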
Ratio Quality (RQ) Engine¶
RatioQualityEngine¶
Automated peak ratio-based quality assessment workflow.
Compute ratio quality metrics (stability, discrimination, heating trends, oil-vs-chips).
The engine expects a tidy DataFrame with metadata columns (oil/matrix/heating) and intensity columns referenced by PeakDefinition/RatioDefinition.
compare_oil_vs_chips ¶
compare_oil_vs_chips(df)
Compare stability and heating trends between matrix types (oil vs chips).
Delegates to features.rq.matrix.compare_oil_vs_chips.
generate_text_report ¶
generate_text_report(
stability,
discrim,
heating,
oil_vs_chips,
feat_importance,
norm_comp,
minimal_panel,
clustering_metrics,
warnings,
context=None,
top_k=5,
)
Generate a text report for RQ results.
Delegates to features.rq.report.generate_text_report.
RQConfig¶
Configuration for ratio quality analysis.
PeakDefinition¶
Define spectral peaks for ratio analysis.
RatioDefinition¶
Define peak ratios for quality metrics.
Library Search¶
similarity_search¶
Find nearest neighbors in a spectral library.
Perform similarity search between query and library datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `query_ds` | `FoodSpectrumSet` | Query dataset containing spectra to match. | required |
| `library_ds` | `FoodSpectrumSet` | Library dataset to search against. | required |
| `metric` | `Literal['euclidean', 'cosine', 'pearson', 'sid', 'sam']` | Distance metric: "euclidean", "cosine", "pearson", "sid" (spectral information divergence), or "sam" (spectral angle mapper). | `'cosine'` |
| `top_k` | `int` | Number of nearest neighbors to return per query. | `5` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with search results including query_id, library_index, distance, rank, and library metadata. |
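The ranking step behind a top-k search can be sketched on raw arrays. The code below is illustrative only: `top_k_cosine` is a hypothetical helper that works on NumPy arrays rather than `FoodSpectrumSet` objects, it assumes cosine distance is defined as 1 minus cosine similarity, it uses the row position as `query_id`, and it assumes ranks start at 1. None of these details are confirmed for `similarity_search`.

```python
import numpy as np


def top_k_cosine(query, library, top_k=5):
    """Hypothetical helper: rank library spectra by cosine distance."""
    query = np.asarray(query, dtype=float)
    library = np.asarray(library, dtype=float)
    q_unit = query / np.linalg.norm(query, axis=1, keepdims=True)
    l_unit = library / np.linalg.norm(library, axis=1, keepdims=True)
    # Assumption: cosine distance = 1 - cosine similarity.
    dist = 1.0 - q_unit @ l_unit.T  # shape (n_query, n_library)

    k = min(top_k, library.shape[0])
    # Smallest distance first; a stable sort keeps ties in library order.
    order = np.argsort(dist, axis=1, kind="stable")[:, :k]

    rows = []
    for query_id in range(dist.shape[0]):
        for rank, lib_idx in enumerate(order[query_id], start=1):
            rows.append({
                "query_id": query_id,
                "library_index": int(lib_idx),
                "distance": float(dist[query_id, lib_idx]),
                "rank": rank,
            })
    return rows


library = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([[1.0, 0.1]])
hits = top_k_cosine(query, library, top_k=2)
```

The returned records mirror the column names listed above, so they could be passed to `pandas.DataFrame(hits)` to obtain a table of the same general shape.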
See Also¶
- Feature Extraction Methods - Detailed methodology
- RQ Engine Theory - Ratio quality concepts
- Examples - Feature extraction workflows