15-Minute Quickstart¶
Purpose: Get FoodSpec working on your computer and run your first analysis in 15 minutes.
Audience: Complete beginners with zero spectroscopy experience.
Time: About 15 minutes.
Prerequisites: Python 3.10+; basic terminal/command-line knowledge; no chemistry background needed.
What you'll accomplish:
1. Install FoodSpec
2. Download sample oil spectroscopy data
3. Run oil authentication analysis
4. View results (confusion matrix, metrics)
What You'll Learn¶
By the end of this guide, you will:
1. Install FoodSpec on your computer
2. Download a sample food dataset (cooking oils)
3. Run oil authentication analysis
4. View results (confusion matrix, ROC curve, metrics)
No equations. No chemistry jargon. Just working code.
Step 1: Install FoodSpec¶
⏱️ 2 minutes
Open your terminal (Mac/Linux) or Command Prompt (Windows) and type:
pip install foodspec
What this does: Downloads FoodSpec and all the tools it needs to analyze food samples.
Check it worked:
foodspec --version
Expected output:
FoodSpec version 1.0.0
If you see an error, verify Python 3.10+ is installed:
python --version
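If you would rather check from inside Python, the same test can be sketched with the standard library alone. The 3.10 minimum comes from the prerequisites above; the helper name `python_is_supported` is only illustrative.

```python
import sys

# Minimum version taken from the prerequisites at the top of this guide.
REQUIRED = (3, 10)


def python_is_supported(version_info=sys.version_info, required=REQUIRED):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= required


print("Python", ".".join(str(part) for part in sys.version_info[:3]))
if python_is_supported():
    print("OK for FoodSpec")
else:
    print("Please upgrade to Python 3.10 or newer")
```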
Step 2: Get Sample Data¶
⏱️ 1 minute
FoodSpec includes example data. Download the oil authentication dataset:
# Create a working directory
mkdir foodspec-quickstart
cd foodspec-quickstart
# Download sample data (cooking oils Raman spectra)
curl -O https://raw.githubusercontent.com/chandrasekarnarayana/foodspec/main/examples/data/oils.csv
What's in this file?
A spreadsheet (CSV) with measurements from 4 types of cooking oils:
- Olive oil (OO)
- Palm oil (PO)
- Sunflower oil (VO)
- Coconut oil (CO)
Each row is a "spectrum" (a measurement showing what molecules are in the oil). The computer can learn patterns to tell them apart.
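To get a feel for such a file before running anything, you can count the rows per oil type with Python's built-in `csv` module. This is a minimal sketch that assumes a "wide" layout: one label column (called `oil_type` here, which is an assumption) and one column per wavenumber. The real `oils.csv` may use different column names, so the example reads a tiny made-up table instead of the downloaded file.

```python
import csv
import io
from collections import Counter

# Tiny stand-in for oils.csv. The column names ("oil_type" and the
# wavenumber headers) and all values are made up for illustration.
SAMPLE_CSV = """oil_type,1000.0,1440.0,1650.0
OO,0.12,0.80,0.55
OO,0.10,0.78,0.57
PO,0.20,0.90,0.30
CO,0.25,0.95,0.10
"""


def summarize(handle, label_column="oil_type"):
    """Return (wavenumbers, rows per label) for a wide-format spectra CSV."""
    reader = csv.DictReader(handle)
    # Every header except the label column is treated as a wavenumber.
    wavenumbers = [float(name) for name in reader.fieldnames if name != label_column]
    counts = Counter(row[label_column] for row in reader)
    return wavenumbers, counts


wavenumbers, counts = summarize(io.StringIO(SAMPLE_CSV))
print(f"{sum(counts.values())} spectra, {len(wavenumbers)} points each")
for label, n in sorted(counts.items()):
    print(f"  {label}: {n}")
```

To try it on a real file, pass an open file handle instead, for example `open("oils.csv", newline="")`, and adjust `label_column` to match the file's header.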
Step 3: Run Your First Analysis¶
⏱️ 3 minutes
Option A: CLI (Reproducible, No Code)¶
Run the oil authentication workflow:
foodspec oil-auth \
--input oils.csv \
--output-dir my_first_run
What this command does:
1. Reads the oil measurements from oils.csv
2. Cleans the data (removes noise/background)
3. Trains a computer to recognize each oil type
4. Tests accuracy with cross-validation (checks it's not cheating)
5. Generates a report with charts
Expected output:
✓ Loaded 120 spectra (30 per oil type)
✓ Preprocessing completed
✓ Classification accuracy: 94.5%
✓ Report saved to: my_first_run/report.html
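The cross-validation step in the list above can be illustrated without FoodSpec. The sketch below splits 120 labels (30 per oil type, as in the output above) into 5 folds so that each fold holds the same share of every oil. It is a generic stratified split written for this guide, not FoodSpec's own implementation, and `stratified_folds` is a hypothetical helper name.

```python
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign every sample index to one of k folds, balancing each class."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices round-robin, like cards, so each fold
        # receives a near-equal share of this class.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# 120 labels, 30 per oil type.
labels = [oil for oil in ("OO", "PO", "VO", "CO") for _ in range(30)]
folds = stratified_folds(labels, k=5)

for test_fold in folds:
    held_out = set(test_fold)
    train = [i for i in range(len(labels)) if i not in held_out]
    # A model would be fit on `train` and scored on `test_fold` here.
    # The two sets never overlap, which is what keeps the score honest.
    assert held_out.isdisjoint(train)

print([len(fold) for fold in folds])  # [24, 24, 24, 24, 24]
```

Each sample is held out exactly once, so every reported prediction comes from a model that never saw that sample during training.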
Option B: Python API (Interactive, Full Control)¶
If you prefer Python, save this as quickstart.py:
from foodspec.datasets import load_oil_example_data
from foodspec.preprocess import baseline_als, normalize_snv
from foodspec.ml import ClassifierFactory
from foodspec.validation import run_stratified_cv
# Load built-in oil dataset
spectra = load_oil_example_data()
print(f"✅ Step 1: Loaded {len(spectra)} oil spectra")
# Preprocess
spectra = baseline_als(spectra)
spectra = normalize_snv(spectra)
print("✅ Step 2: Preprocessing complete")
# Train & validate
model = ClassifierFactory.create("random_forest", n_estimators=100)
metrics = run_stratified_cv(model, spectra.data, spectra.labels, cv=5)
print("✅ Step 3: Training complete")
print(f" Accuracy: {metrics['accuracy']:.1%}")
print(f" Balanced Accuracy: {metrics['balanced_accuracy']:.1%}")
Then run it:
python quickstart.py
Expected output:
✅ Step 1: Loaded 96 oil spectra
✅ Step 2: Preprocessing complete
✅ Step 3: Training complete
Accuracy: 95.2%
Balanced Accuracy: 94.8%
Why both options?
- CLI: Best for reproducibility and automation (record the exact command, share it with colleagues)
- Python API: Best for learning and customization (change parameters, inspect internals)
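If you are curious what a normalization step such as `normalize_snv` does, here is a minimal sketch of the general standard normal variate (SNV) technique: each spectrum is centered on its own mean and divided by its own standard deviation. This is written from the textbook definition, not taken from FoodSpec's source, so details such as the choice of sample versus population standard deviation may differ.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it to unit spread."""
    center = mean(spectrum)
    spread = stdev(spectrum)  # sample standard deviation (divides by n - 1)
    if spread == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(value - center) / spread for value in spectrum]


# Two made-up spectra with the same shape but different overall intensity,
# as if the same oil had been measured at two laser powers.
weak = [1.0, 2.0, 4.0, 2.0, 1.0]
strong = [10.0, 20.0, 40.0, 20.0, 10.0]

print([round(v, 3) for v in snv(weak)])
print([round(v, 3) for v in snv(strong)])
```

Both lines print the same numbers: after SNV only the shape of the spectrum remains, which is what the classifier should learn from.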
Step 4: View Your Results (5 minutes)¶
Open the report in your web browser:
# Mac/Linux
open my_first_run/report.html
# Windows
start my_first_run\report.html
# Or manually open the file in your browser
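A cross-platform alternative is Python's standard `webbrowser` module, which avoids remembering a different command per operating system. The sketch assumes the report sits at `my_first_run/report.html`, as in the CLI example; `report_uri` is an illustrative helper, not part of FoodSpec.

```python
import webbrowser
from pathlib import Path


def report_uri(report_path):
    """Turn a relative report path into an absolute file:// URI."""
    return Path(report_path).resolve().as_uri()


report = Path("my_first_run") / "report.html"
if report.exists():
    # Opens the report in the system's default browser.
    webbrowser.open(report_uri(report))
else:
    print(f"No report at {report}; run the analysis in Step 3 first.")
```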
What You're Looking At¶
1. Confusion Matrix (Top Left Chart)
A grid showing how often the computer was correct:
- Diagonal (green): Correct predictions (e.g., olive oil identified as olive oil)
- Off-diagonal (red): Mistakes (e.g., palm oil misclassified as sunflower oil)
Goal: Every prediction should land on the green diagonal. A red cell means the computer confused those two oils.
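To see where the numbers in such a grid come from, here is a small sketch that builds a confusion matrix from lists of true and predicted labels. The labels and predictions are made up for illustration; rows are the true oil, columns the predicted oil.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are the true class, columns are the predicted class."""
    pairs = Counter(zip(true_labels, predicted_labels))
    return [[pairs[(actual, guess)] for guess in classes] for actual in classes]


classes = ["OO", "PO", "VO", "CO"]
# Made-up predictions for ten samples; one palm oil is mistaken for sunflower.
y_true = ["OO", "OO", "OO", "PO", "PO", "PO", "VO", "VO", "CO", "CO"]
y_pred = ["OO", "OO", "OO", "PO", "PO", "VO", "VO", "VO", "CO", "CO"]

matrix = confusion_matrix(y_true, y_pred, classes)
print("true\\pred  " + "  ".join(classes))
for name, row in zip(classes, matrix):
    print(f"{name:<10} " + "   ".join(str(count) for count in row))
```

The single off-diagonal count sits in the PO row and VO column: one palm oil sample was predicted as sunflower oil.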
2. Accuracy Number
Example: "Balanced Accuracy: 94.5%"
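This means that, averaged across the oil types, the computer correctly identified 94.5% of the samples of each type. When every oil type has the same number of samples, this equals plain accuracy.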
- >90% = Excellent (you can trust this for quality control)
- 70-90% = Good (usable but may need more data)
- <70% = Poor (oils are too similar or data is noisy)
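Balanced accuracy is the average of the per-class hit rates, so a rare oil type counts as much as a common one. The sketch below uses made-up labels to show how it can differ from plain accuracy when the classes are uneven.

```python
from collections import defaultdict


def accuracy(true_labels, predicted_labels):
    """Fraction of all samples that were labeled correctly."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def balanced_accuracy(true_labels, predicted_labels):
    """Mean of per-class recall, so every class counts equally."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Made-up, imbalanced example: 8 olive oil samples and 2 coconut oil samples.
# This "model" labels everything as olive oil.
y_true = ["OO"] * 8 + ["CO"] * 2
y_pred = ["OO"] * 10

print(f"accuracy:          {accuracy(y_true, y_pred):.1%}")           # 80.0%
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.1%}")  # 50.0%
```

Plain accuracy looks respectable even though the model never recognizes coconut oil; balanced accuracy exposes that.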
3. Top Discriminative Features (Bar Chart)
Shows which molecular "fingerprints" differ most between oils.
- Higher bars = more important for telling oils apart
- Example: "Ratio 1650/2900" means the ratio of two specific chemical signals
You don't need to understand the chemistry. Just know: higher bars = more reliable markers.
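As a rough illustration of a ratio feature, the sketch below divides the intensity measured nearest 1650 cm⁻¹ by the intensity nearest 2900 cm⁻¹. How FoodSpec actually computes its ratios (peak height, peak area, window width) is not stated here, so treat the nearest-point lookup as an assumption; the spectrum values are made up.

```python
def intensity_at(wavenumbers, intensities, target):
    """Intensity at the measured wavenumber closest to `target` (in cm^-1)."""
    nearest = min(range(len(wavenumbers)), key=lambda i: abs(wavenumbers[i] - target))
    return intensities[nearest]


def band_ratio(wavenumbers, intensities, numerator, denominator):
    """Ratio of the intensities nearest two target wavenumbers."""
    top = intensity_at(wavenumbers, intensities, numerator)
    bottom = intensity_at(wavenumbers, intensities, denominator)
    return top / bottom


# Made-up spectrum sampled at five wavenumbers.
wavenumbers = [1440.0, 1652.0, 1745.0, 2855.0, 2898.0]
intensities = [0.90, 0.45, 0.20, 1.20, 1.50]

ratio = band_ratio(wavenumbers, intensities, 1650, 2900)
print(f"Ratio 1650/2900 = {ratio:.2f}")  # 0.45 / 1.50 = 0.30
```

A ratio is useful because it cancels out overall intensity changes between measurements: if the whole spectrum doubles, the ratio stays the same.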
4. Minimal Panel (Bottom Right)
The smallest set of measurements needed to identify oils accurately.
- Example: "3 features achieve 92% accuracy"
- Why this matters: In a real lab, you'd only measure 3 things instead of 100 (saves time/cost)
Step 5: What Just Happened? (Layer 1 Explanation)¶
Here's what FoodSpec did behind the scenes, explained like you're 10 years old:
- Loaded data: Opened the spreadsheet with oil measurements
- Cleaned it: Removed background noise (like adjusting a TV antenna for a clear signal)
- Extracted patterns: Found which signals differ between oils (like comparing fingerprints)
- Trained a classifier: Taught the computer to recognize each oil type (like showing a child pictures until they can tell "cat" from "dog")
- Validated results: Tested on data it hadn't seen before to make sure it's not memorizing (like a pop quiz)
- Made a report: Generated charts to show you what it learned
Key insight: You're not doing chemistry; you're pattern matching with machines. The computer finds differences you can't see by eye.
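The whole load, clean, train, validate, report loop fits in a few lines of plain Python. The toy below is not what FoodSpec runs: the spectra are synthetic, the cleaning is a simple per-spectrum normalization, and a nearest-centroid rule stands in for the real classifier. It only illustrates the order of the steps listed above.

```python
import random
from statistics import mean, stdev


def snv(spectrum):
    """Center a spectrum on its mean and scale by its standard deviation."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def centroid(spectra):
    """Point-by-point average of several spectra."""
    return [mean(column) for column in zip(*spectra)]


def predict(centroids, spectrum):
    """Return the label whose centroid is closest to the spectrum."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], spectrum))
    return min(centroids, key=distance)


# 1. "Load": synthesize noisy spectra around two made-up reference shapes,
#    each at a random overall intensity.
rng = random.Random(42)
shapes = {"OO": [1.0, 3.0, 2.0, 5.0], "CO": [1.0, 5.0, 2.0, 2.0]}
data = []
for label, shape in shapes.items():
    for _ in range(20):
        scale = rng.uniform(0.5, 2.0)
        data.append((label, [v * scale + rng.gauss(0, 0.1) for v in shape]))

# 2. "Clean": normalize every spectrum so only its shape matters.
data = [(label, snv(spectrum)) for label, spectrum in data]

# 3-4. "Extract patterns" and "train": average the training spectra per class.
rng.shuffle(data)
train, test = data[:30], data[30:]
centroids = {
    label: centroid([spectrum for name, spectrum in train if name == label])
    for label in shapes
}

# 5. "Validate": score only on spectra the model has not seen.
hits = sum(predict(centroids, spectrum) == label for label, spectrum in test)
accuracy_score = hits / len(test)

# 6. "Report".
print(f"Held-out accuracy: {accuracy_score:.0%} on {len(test)} spectra")
```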
What's Next?¶
Learn More About Data and Terminology¶
- Data Format Reference - How to prepare your own CSV files (schema, validation checklist)
- Glossary - What do terms like "wavenumber", "baseline", "CV strategy" mean?
- Python Quickstart - Run analyses from Python code (for customization)
- CLI Quickstart - More detailed command-line examples
If You Want to Learn More¶
Option A: Run Another Example (Beginner)
Try the heating quality tutorial to see how oils degrade when fried:
foodspec heating --help
Option B: Use Your Own Data (Intermediate)
You need a CSV file with:
- One column for sample type (e.g., oil_type)
- Columns with measurements (wavenumbers 400-4000 cm⁻¹ for Raman)
See data_formats_and_hdf5.md for format requirements.
Option C: Understand the Science (Advanced)
Read spectroscopy_basics.md to learn how Raman spectroscopy works.
Option D: Customize Workflows (Expert)
Create your own analysis protocols with YAML. See protocols_and_yaml.md.
Common Problems (and Fixes)¶
"Command not found: foodspec"¶
Cause: FoodSpec not installed or not in PATH
Fix:
pip install --upgrade foodspec
# Verify installation:
pip show foodspec
"File not found: oils.csv"¶
Cause: The sample data was not downloaded, or you are running the command from a different directory
Fix:
# Re-download sample data:
curl -O https://raw.githubusercontent.com/chandrasekarnarayana/foodspec/main/examples/data/oils.csv
# Verify it's there:
ls -l oils.csv   # Windows: dir oils.csv
"Accuracy <50%"¶
Cause: Data quality issues or oil types are too similar
Fix:
1. Check your data has at least 20 samples per oil type
2. Verify wavenumber range includes 1600-1800 cm⁻¹ (carbonyl region)
3. See troubleshooting for data quality checks
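The first two checks can be automated once you have the list of labels and the wavenumber axis in Python. This is a minimal sketch using only the thresholds named above; `quality_problems` is a hypothetical helper, not a FoodSpec function, and the dataset is made up.

```python
from collections import Counter

MIN_SAMPLES_PER_CLASS = 20          # from check 1 above
REQUIRED_RANGE = (1600.0, 1800.0)   # from check 2 above, in cm^-1


def quality_problems(labels, wavenumbers):
    """Return human-readable problems; an empty list means both checks pass."""
    problems = []
    for label, count in sorted(Counter(labels).items()):
        if count < MIN_SAMPLES_PER_CLASS:
            problems.append(f"{label}: only {count} samples (need {MIN_SAMPLES_PER_CLASS})")
    low, high = REQUIRED_RANGE
    if min(wavenumbers) > low or max(wavenumbers) < high:
        problems.append(
            f"range {min(wavenumbers):g}-{max(wavenumbers):g} cm^-1 "
            f"does not cover {low:g}-{high:g} cm^-1"
        )
    return problems


# Made-up dataset: enough olive oil, too little palm oil, and an axis
# that stops short of the required region.
labels = ["OO"] * 25 + ["PO"] * 12
wavenumbers = [400.0 + 2.0 * i for i in range(600)]  # 400 to 1598 cm^-1

for problem in quality_problems(labels, wavenumbers):
    print("WARNING:", problem)
```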
"No module named 'foodspec'"¶
Cause: Wrong Python environment active
Fix:
# Check which Python you're using:
which python   # Windows: where python
python -m pip install foodspec
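You can also ask Python itself which interpreter is running and whether it can see the package. This sketch uses only the standard library; `is_installed` is an illustrative helper name.

```python
import importlib.util
import sys


def is_installed(module_name):
    """True when `module_name` can be imported by the running interpreter."""
    return importlib.util.find_spec(module_name) is not None


print("Interpreter:", sys.executable)
if is_installed("foodspec"):
    print("foodspec is importable from this interpreter")
else:
    # Installing with the same executable guarantees the package lands
    # in the environment that is actually being used.
    print("foodspec is missing; run:", sys.executable, "-m pip install foodspec")
```

If the interpreter path printed here is not the one you expected, activate the right environment first and run the check again.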
Success Checklist¶
Before moving to the next tutorial, verify:
- [ ] foodspec --help shows command list
- [ ] Sample data downloaded (ls oils.csv works)
- [ ] foodspec oil-auth completed without errors
- [ ] Report opened in browser showing >90% accuracy
- [ ] Confusion matrix mostly green on diagonal
All checked? You're ready for real data! 🎉
FAQ (Absolute Beginner Edition)¶
Q: Do I need to know chemistry?
A: No. FoodSpec does the chemistry. You just need to run commands and read reports.
Q: What equipment do I need?
A: To collect new spectra you need a Raman or FTIR spectrometer. If you already have CSV files of spectra, you have everything you need.
Q: Can I use this for foods other than oils?
A: Yes! FoodSpec works on any food with spectroscopy data. Common uses: dairy, honey, spices, beverages.
Q: Is this suitable for regulatory/legal use?
A: FoodSpec provides research-grade tools. For regulatory compliance, see validation_strategies.md for proper validation protocols.
Q: How do I cite FoodSpec in a paper?
A: See the FAQ for citation details.
What You've Achieved¶
✅ Installed FoodSpec
✅ Ran your first food authentication analysis
✅ Interpreted a classification report
✅ Understood the basic workflow (load → clean → analyze → report)
Next recommended tutorial: Oil Discrimination (Basic) to learn what each step does in detail.
Need Help?¶
- Stuck on an error? → Troubleshooting Guide: installation issues, NaNs, shape errors, and more
- Have questions? → FAQ: baseline methods, sample size, citations
- Report a bug: GitHub Issues