Statistics API

Statistical testing and effect size calculations for spectral analysis.

The foodspec.stats module provides hypothesis testing, correlation analysis, and statistical reporting tools for comparing spectral measurements.

Time Metrics

Functions for analyzing temporal degradation trends.

linear_slope

Compute linear slope and intercept for time series.

Parameters:

t : ndarray
    Time points (1D array). Required.
y : ndarray
    Observation values (1D array, same length as t). Required.

Returns:

Tuple[float, float]
    Tuple of (slope, intercept) from the linear fit.

Raises:

ValueError
    If t and y are not 1D arrays of equal length.
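
Example (a minimal sketch; the import path follows the foodspec.stats module named above):

>>> import numpy as np
>>> from foodspec.stats import linear_slope
>>> t = np.array([0.0, 1.0, 2.0, 3.0])
>>> y = 2.0 * t + 1.0  # exactly linear, so the fit is exact
>>> slope, intercept = linear_slope(t, y)
>>> print(f"slope={slope:.2f}, intercept={intercept:.2f}")
slope=2.00, intercept=1.00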

quadratic_acceleration

Compute quadratic acceleration (second derivative) from a quadratic fit to a time series.

Parameters:

t : ndarray
    Time points (1D array). Required.
y : ndarray
    Observation values (1D array, same length as t). Required.

Returns:

float
    Acceleration coefficient (2 × the quadratic coefficient a from y = at² + bt + c).

Raises:

ValueError
    If t and y are not 1D arrays of equal length.
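
Example (a minimal sketch under the same import assumption; for y = t² the second derivative is 2 everywhere):

>>> import numpy as np
>>> from foodspec.stats import quadratic_acceleration
>>> t = np.array([0.0, 1.0, 2.0, 3.0])
>>> y = t ** 2  # quadratic coefficient a = 1, so acceleration = 2a = 2
>>> print(f"{quadratic_acceleration(t, y):.2f}")
2.00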

rolling_slope

Compute rolling-window linear slope for a time series.

Parameters:

t : ndarray
    Time points (1D array). Required.
y : ndarray
    Observation values (1D array, same length as t). Required.
window : int, optional
    Rolling window size (must be >= 2). Default: 5.

Returns:

ndarray
    Array of rolling slopes (same length as t; early values are NaN).

Raises:

ValueError
    If window < 2.
ValueError
    If t and y are not 1D arrays of equal length.
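
Example (a minimal sketch; per the Returns note above, the leading entries are NaN where the window is incomplete):

>>> import numpy as np
>>> from foodspec.stats import rolling_slope
>>> t = np.arange(6, dtype=float)
>>> y = 2.0 * t  # exactly linear, so every full window has slope 2
>>> slopes = rolling_slope(t, y, window=3)
>>> print(f"last slope = {slopes[-1]:.2f}")
last slope = 2.00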

Hypothesis Testing

run_ttest

t-test: one-sample, paired, or two-sample (Welch's).

Tests whether sample mean(s) differ significantly from a population mean or each other. Welch's t-test (default for two-sample) does NOT assume equal variances, making it more robust than Student's t-test.

Test Selection:
- One-sample: run_ttest(x, popmean=5) — Does x differ from 5?
- Paired: run_ttest(before, after, paired=True) — Before vs. after?
- Two-sample: run_ttest(groupA, groupB) — Do groups differ?

Assumptions: Normality (robust if n > 30), independence, random sampling

Significance: See Metric Significance Tables

Parameters:

sample1 : array-like
    First sample. Required.
sample2 : array-like, optional
    Second sample. Default: None.
popmean : float, optional
    Population mean (for a one-sample test). Default: None.
paired : bool, optional
    Whether the samples are paired. Default: False.
alternative : str, optional
    "two-sided", "less", or "greater". Default: "two-sided".

Returns:

TestResult
    TestResult with t-statistic, p-value, and df.

Examples:

>>> from foodspec.stats import run_ttest
>>> result = run_ttest([1, 2, 3], [4, 5, 6])
>>> print(f"p = {result.pvalue:.3f}")

run_anova

One-way ANOVA: test whether 3+ groups have different means.

Tests the null hypothesis that all group means are equal. ANOVA partitions variance into between-group and within-group components; the F-statistic is their ratio, so a large F indicates a significant difference.

When to Use:
- 3+ groups (use a t-test for 2 groups)
- Roughly equal sample sizes per group
- Data approximately normal (robust if n > 20 per group)

Assumptions: Normality, homogeneity of variance, independence

Post-hoc Tests (if p < 0.05):
- Balanced design: Tukey HSD (run_tukey_hsd)
- Unequal variances: Games-Howell (games_howell)
- Multiple hypotheses: Benjamini-Hochberg FDR (benjamini_hochberg)

Red Flags:
- Significant ANOVA but non-significant post-hoc tests: likely Type I error
- Large effect size (η² > 0.14) but p > 0.05: underpowered

Parameters:

data : array-like
    All observations. Required.
groups : array-like
    Group labels (same length as data). Required.

Returns:

TestResult
    TestResult with F-statistic and p-value.

Examples:

>>> from foodspec.stats import run_anova
>>> result = run_anova([1, 2, 1, 5, 6, 5, 9, 10, 9],
...                    ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'])
>>> assert result.pvalue < 0.05  # Groups differ

run_manova

Multivariate ANOVA (MANOVA) for multiple dependent variables, implemented via statsmodels.

Supports two usage patterns:
- run_manova(df, group_col="group", dependent_cols=["f1", "f2"])  # formula
- run_manova(data_df, groups=labels)  # use all columns in data_df

Returns a TestResult with pvalue extracted from the MANOVA table (prefers Wilks' lambda, falls back to Pillai's trace).
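
Example (a minimal sketch of the first calling pattern; the data and the foodspec.stats import path are illustrative assumptions):

>>> import pandas as pd
>>> from foodspec.stats import run_manova
>>> df = pd.DataFrame({
...     'group': ['A'] * 5 + ['B'] * 5,
...     'f1': [1, 2, 1, 2, 1, 5, 6, 5, 6, 5],
...     'f2': [2, 3, 2, 3, 2, 8, 9, 8, 9, 8],
... })
>>> result = run_manova(df, group_col='group', dependent_cols=['f1', 'f2'])
>>> print(f"p = {result.pvalue:.4f}")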

run_kruskal_wallis

Perform Kruskal-Wallis H-test for comparing three or more independent groups.

The Kruskal-Wallis test is a non-parametric alternative to one-way ANOVA. It tests whether samples from different groups come from the same distribution. Use this when comparing 3+ groups with non-normal data or unequal variances.

Parameters:

data : DataFrame | Iterable
    Either a DataFrame or a list of arrays. If a DataFrame, group_col and value_col must be provided. If a list, each element is an array of values for one group. Required.
group_col : str | None, optional
    Column name for group labels (required if data is a DataFrame). Default: None.
value_col : str | None, optional
    Column name for the values to compare (required if data is a DataFrame). Default: None.

Returns:

TestResult
    TestResult with:
    - statistic: Kruskal-Wallis H statistic
    - pvalue: p-value from the chi-squared distribution
    - df: None (df is implicit in the chi-squared approximation)
    - summary: DataFrame with test details

Raises:

ValueError
    If fewer than 2 groups are provided or if DataFrame arguments are missing.

Example

>>> from foodspec import run_kruskal_wallis
>>> import pandas as pd
>>> df = pd.DataFrame({
...     'group': ['A'] * 10 + ['B'] * 10 + ['C'] * 10,
...     'value': list(range(10)) + list(range(10, 20)) + list(range(20, 30))
... })
>>> result = run_kruskal_wallis(df, 'group', 'value')
>>> print(f"H={result.statistic:.2f}, p={result.pvalue:.4f}")

See Also

run_anova: Parametric alternative for normal data
run_mannwhitney_u: For comparing exactly 2 groups
run_friedman_test: For repeated measures (paired data)

run_mannwhitney_u

Perform Mann-Whitney U test for comparing two independent samples.

The Mann-Whitney U test (also called Wilcoxon rank-sum test) is a non-parametric test for comparing the distributions of two independent groups. It does not assume normality and is robust to outliers. Use this when data are ordinal or continuous but violate t-test assumptions.

Parameters:

data : DataFrame | Iterable
    Either a DataFrame or array-like. If a DataFrame, group_col and value_col must be provided. If array-like, it represents the first sample (with group_col supplying the second sample). Required.
group_col : object | None, optional
    Column name for group labels (if data is a DataFrame) or the second sample array (if data is array-like). Default: None.
value_col : str | None, optional
    Column name for the values to compare (required if data is a DataFrame). Default: None.
alternative : str, optional
    Direction of the test: "two-sided", "less", or "greater". Default: "two-sided".

Returns:

TestResult
    TestResult with:
    - statistic: Mann-Whitney U statistic
    - pvalue: two-tailed (or one-tailed) p-value
    - df: None (non-parametric tests have no degrees of freedom)
    - summary: DataFrame with test details

Raises:

ValueError
    If the data format is invalid or if not exactly two groups are provided.

Example

>>> from foodspec import run_mannwhitney_u
>>> import pandas as pd
>>> df = pd.DataFrame({'group': ['A'] * 10 + ['B'] * 10, 'value': range(20)})
>>> result = run_mannwhitney_u(df, 'group', 'value')
>>> print(f"U={result.statistic:.2f}, p={result.pvalue:.4f}")

See Also

run_ttest: Parametric alternative for normal data
run_kruskal_wallis: Extension to 3+ groups
run_wilcoxon_signed_rank: For paired samples

Effect Sizes

compute_cohens_d

Compute Cohen's d effect size for two groups.

Quantifies the magnitude of difference between two groups independent of sample size. Essential for evaluating whether statistically significant differences are also practically significant.

Parameters:

group1 : ndarray
    First group of numerical samples. Required.
group2 : ndarray
    Second group of numerical samples. Required.
pooled : bool, optional
    If True, use the pooled standard deviation (assumes equal variances). If False, average the unpooled standard deviations. Default: True.

Returns:

float
    Cohen's d effect size (unbounded; can be negative).

Examples:

>>> from foodspec.stats.effects import compute_cohens_d
>>> import numpy as np
>>> g1 = np.array([1, 2, 3, 4, 5])
>>> g2 = np.array([3, 4, 5, 6, 7])
>>> d = compute_cohens_d(g1, g2)
>>> abs(d) > 0.5
True

compute_anova_effect_sizes

Compute eta-squared and partial eta-squared for ANOVA.

Quantifies proportion of variance explained by group differences.

Interpretation Scale:
- eta-squared < 0.01: negligible effect
- 0.01 <= eta-squared < 0.06: small effect
- 0.06 <= eta-squared < 0.14: medium effect
- eta-squared >= 0.14: large effect

Parameters:

ss_between : float
    Sum of squares between groups (treatment effect). Required.
ss_total : float
    Total sum of squares (all variation in the data). Required.
ss_within : float | None, optional
    Sum of squares within groups (error). If provided, partial eta-squared is also computed. Default: None.

Returns:

Series
    A Series with 'eta_squared' and optionally 'partial_eta_squared'.

Examples:

>>> from foodspec.stats.effects import compute_anova_effect_sizes
>>> result = compute_anova_effect_sizes(ss_between=50, ss_total=200, ss_within=150)
>>> 'eta_squared' in result
True
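
For reference, the standard formulas are η² = SS_between / SS_total and partial η² = SS_between / (SS_between + SS_within). In the example above both equal 0.25, since 50 / 200 = 50 / (50 + 150), a large effect on the scale above.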

Correlations & Distances

compute_correlations

Compute Pearson or Spearman correlation between two columns of a DataFrame.

Parameters:

data : DataFrame
    DataFrame containing the columns of interest (e.g., ratios vs. a quality metric). Required.
cols : tuple | list
    Two column names to correlate. Required.
method : str, optional
    'pearson' or 'spearman'. Default: 'pearson'.

Returns:

Series
    Series with index ['r', 'pvalue']; values are the correlation coefficient and p-value.

Raises:

ValueError
    If cols does not contain exactly two column names.
ValueError
    If method is not 'pearson' or 'spearman'.
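
Example (a minimal sketch; the foodspec.stats import path follows the pattern used elsewhere on this page):

>>> import pandas as pd
>>> from foodspec.stats import compute_correlations
>>> df = pd.DataFrame({'ratio': [1.0, 2.0, 3.0, 4.0],
...                    'quality': [2.1, 3.9, 6.2, 7.8]})
>>> result = compute_correlations(df, cols=('ratio', 'quality'))
>>> print(f"r = {result['r']:.3f}, p = {result['pvalue']:.3f}")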

euclidean_distance

Compute Euclidean distance between two spectra.

Parameters:

a : ndarray
    First spectrum (1D or flattened array). Required.
b : ndarray
    Second spectrum (1D or flattened array). Required.

Returns:

float
    Euclidean distance (L2 norm of the difference).
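
Example (a minimal sketch; for a = (3, 0) and b = (0, 4), the L2 norm of a - b is √(9 + 16) = 5):

>>> import numpy as np
>>> from foodspec.stats import euclidean_distance
>>> a = np.array([3.0, 0.0])
>>> b = np.array([0.0, 4.0])
>>> print(f"{euclidean_distance(a, b):.1f}")
5.0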

cosine_distance

Compute cosine distance (1 - cosine similarity) between two spectra.

Parameters:

a : ndarray
    First spectrum (1D or flattened array). Required.
b : ndarray
    Second spectrum (1D or flattened array). Required.

Returns:

float
    Cosine distance (1 - cosine similarity).
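
Example (a minimal sketch; orthogonal vectors have cosine similarity 0, hence distance 1):

>>> import numpy as np
>>> from foodspec.stats import cosine_distance
>>> a = np.array([1.0, 0.0])
>>> b = np.array([0.0, 1.0])
>>> print(f"{cosine_distance(a, b):.1f}")
1.0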

Robustness Analysis

bootstrap_metric

Estimate the robustness of a metric via bootstrap resampling, with percentile confidence intervals.

Parameters:

metric_func : Callable[[ndarray, ndarray], float]
    Function taking (y_true, y_pred) and returning a scalar metric. Required.
y_true : ndarray
    True targets (array-like). Required.
y_pred : ndarray
    Predicted targets (array-like). Required.
n_bootstrap : int, optional
    Number of bootstrap samples. Default: 1000.
ci : tuple[float, float], optional
    Confidence interval percentiles (lower, upper). Default: (2.5, 97.5).
random_state : Optional[int], optional
    Seed for reproducibility. Default: None.

Returns:

dict
    Dictionary with keys 'observed' (original metric), 'bootstrap_samples' (array of bootstrap metrics), and 'ci' (tuple of CI bounds).
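
Example (a minimal sketch; the mean-absolute-error metric below is an illustrative assumption, not part of foodspec):

>>> import numpy as np
>>> from foodspec.stats import bootstrap_metric
>>> def mae(y_true, y_pred):
...     return float(np.mean(np.abs(y_true - y_pred)))
>>> y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
>>> y_pred = np.array([1.1, 2.2, 2.8, 4.1, 5.3])
>>> result = bootstrap_metric(mae, y_true, y_pred, n_bootstrap=500, random_state=0)
>>> print(f"MAE = {result['observed']:.3f}, 95% CI = {result['ci']}")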

permutation_test_metric

Permutation test for a metric by shuffling the target labels.

Parameters:

metric_func : callable
    Function taking (y_true, y_pred) and returning a scalar metric. Required.
y_true : array-like
    True targets. Required.
y_pred : array-like
    Predicted targets. Required.
n_permutations : int, optional
    Number of permutations. Default: 1000.
metric_higher_is_better : bool, optional
    If False, the p-value computation is reversed (e.g., for RMSE). Default: True.
random_state : int, optional
    Seed for reproducibility.

Returns:

dict
    Contains the observed metric, permutation samples, and an empirical p-value.
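
Example (a minimal sketch mirroring the bootstrap example above; the mae helper is the same illustrative assumption):

>>> import numpy as np
>>> from foodspec.stats import permutation_test_metric
>>> def mae(y_true, y_pred):
...     return float(np.mean(np.abs(y_true - y_pred)))
>>> y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
>>> y_pred = np.array([1.1, 2.2, 2.8, 4.1, 5.3])
>>> result = permutation_test_metric(mae, y_true, y_pred, n_permutations=500,
...                                  metric_higher_is_better=False, random_state=0)
>>> # result holds the observed metric, permutation samples, and empirical p-value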
