Baseline Models
AnoFox includes four baseline models (Naive, SeasonalNaive, SMA, and RandomWalkDrift) that serve as performance benchmarks. A sophisticated forecasting model that cannot outperform a simple baseline (MASE > 1.0) adds complexity without adding value. Because these models need no model fitting beyond simple arithmetic, they run near-instantly and are the fastest forecasters in the AnoFox library.
Each baseline is a simple rule built on minimal assumptions about the data. No forecasting model should be deployed without first outperforming these baselines.
| Model | Description |
|---|---|
| Naive | Repeats the last observed value |
| SeasonalNaive | Repeats values from the previous seasonal cycle |
| SMA | Simple Moving Average |
| RandomWalkDrift | Random walk with drift (trend) |
Naive
Repeats the last observed value for all forecast horizons.
Example
SELECT * FROM ts_forecast_by(
'sales_data', NULL, date, sales,
'Naive', 14, '1d',
MAP{}
);
Best for: Establishing a baseline; random-walk data.
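To make the rule concrete, here is a minimal Python sketch of what a naive forecast computes. It illustrates the arithmetic only and is not AnoFox's implementation; the function name `naive_forecast` and the sample values are hypothetical.

```python
def naive_forecast(history, horizon):
    """Repeat the last observed value for every step of the horizon.

    Hypothetical helper for illustration; not part of the AnoFox API.
    """
    if not history:
        raise ValueError("history must contain at least one observation")
    return [history[-1]] * horizon


# Made-up sample values: every forecast step equals the last observation.
sales = [112.0, 118.0, 121.0, 117.0]
print(naive_forecast(sales, 3))
```

Because the forecast depends on a single observation, it is fully determined by the last data point, which is why it works as a floor for any other model.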
SeasonalNaive
Repeats values from the same period in the previous seasonal cycle.
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| seasonal_period | INTEGER | Yes | Number of observations per seasonal cycle (e.g., 7 for daily data with weekly seasonality) |
Example
-- Weekly seasonality: forecast = same day last week
SELECT * FROM ts_forecast_by(
'weekly_sales', NULL, date, sales,
'SeasonalNaive', 28, '1d',
MAP{'seasonal_period': '7'}
);
Best for: Strong seasonal patterns, limited data.
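The seasonal rule can be sketched in a few lines of Python. This is an illustration of the idea under the assumption that the most recent full cycle is simply tiled across the horizon; the function name `seasonal_naive_forecast` is hypothetical and the code is not AnoFox's implementation.

```python
def seasonal_naive_forecast(history, horizon, seasonal_period):
    """Forecast each step with the value observed one seasonal cycle earlier.

    Hypothetical helper for illustration; not part of the AnoFox API.
    """
    if seasonal_period < 1:
        raise ValueError("seasonal_period must be at least 1")
    if len(history) < seasonal_period:
        raise ValueError("history must cover at least one full seasonal cycle")
    # The most recent full cycle is repeated for as long as the horizon requires.
    last_cycle = history[-seasonal_period:]
    return [last_cycle[step % seasonal_period] for step in range(horizon)]


# Two made-up cycles of length 3; the forecast repeats the second cycle.
values = [1, 2, 3, 4, 5, 6]
print(seasonal_naive_forecast(values, 5, seasonal_period=3))
```

With `seasonal_period` set to 7 on daily data, each forecast day copies the same weekday from the last observed week.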
SMA (Simple Moving Average)
Forecasts the average of the last N observations, where N is the window parameter.
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| window | INTEGER | 5 | Number of observations to average |
Example
SELECT * FROM ts_forecast_by(
'noisy_data', NULL, date, value,
'SMA', 14, '1d',
MAP{'window': '7'}
);
Best for: Smoothing noise, simple baseline.
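A short Python sketch of the moving-average rule follows. It assumes the averaged value is held flat across the whole horizon; this page does not say whether AnoFox instead rolls the window forward over its own forecasts, so treat that as an assumption. The function name `sma_forecast` is hypothetical.

```python
def sma_forecast(history, horizon, window=5):
    """Forecast every step as the mean of the last `window` observations.

    Hypothetical helper for illustration; not part of the AnoFox API.
    Assumes a flat forecast (the average is not updated during the horizon).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not history:
        raise ValueError("history must contain at least one observation")
    # If fewer than `window` points exist, average whatever is available.
    recent = history[-window:]
    level = sum(recent) / len(recent)
    return [level] * horizon


# Made-up values: the last three observations are 5, 6, 7, so the level is 6.0.
print(sma_forecast([1, 2, 3, 4, 5, 6, 7], 2, window=3))
```

A larger window smooths more noise but reacts more slowly to a genuine shift in the level.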
RandomWalkDrift
Random walk with drift (trend). Adds the average historical change per step to the last observed value, once for each step of the forecast horizon.
Example
SELECT * FROM ts_forecast_by(
'trending_data', NULL, date, value,
'RandomWalkDrift', 28, '1d',
MAP{}
);
Best for: Trending data baseline, financial data.
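The drift rule can be written out as a small Python sketch. The average of all step-to-step changes equals (last value - first value) / (number of observations - 1), so the drift only needs the two endpoints. The function name `random_walk_drift_forecast` is hypothetical and the code is an illustration, not AnoFox's implementation.

```python
def random_walk_drift_forecast(history, horizon):
    """Extend the last value along the average historical change per step.

    Hypothetical helper for illustration; not part of the AnoFox API.
    """
    if len(history) < 2:
        raise ValueError("history must contain at least two observations")
    # Mean of the first differences, which telescopes to the endpoint slope.
    drift = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + drift * step for step in range(1, horizon + 1)]


# Made-up values rising by 2 per step, so the forecast keeps rising by 2.
print(random_walk_drift_forecast([10, 12, 14, 16], 3))
```

Since the drift is the straight line between the first and last observation, an unusual first or last value shifts the whole forecast.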
Comparison
| Model | Handles Trend | Handles Seasonality | Speed |
|---|---|---|---|
| Naive | No | No | Fastest |
| SeasonalNaive | No | Yes | Fast |
| SMA | No (smooths the level only) | No | Fast |
| RandomWalkDrift | Yes | No | Fast |
When to Use Baseline Models
| Scenario | Recommended |
|---|---|
| Establish performance baseline | Naive |
| Strong weekly/yearly patterns | SeasonalNaive |
| Noisy stationary data | SMA |
| Trending random walk | RandomWalkDrift |
| Very limited data | SeasonalNaive |
| Real-time quick forecast | Naive |
Using Baselines for Comparison
Always compare your sophisticated model against baselines:
-- Compare AutoETS against SeasonalNaive
CREATE TABLE baseline AS
SELECT * FROM ts_forecast_by(
'sales', NULL, date, value,
'SeasonalNaive', 28, '1d',
MAP{'seasonal_period': '7'}
);
CREATE TABLE model_forecast AS
SELECT * FROM ts_forecast_by(
'sales', NULL, date, value,
'AutoETS', 28, '1d',
MAP{}
);
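-- Calculate MASE (< 1 means model beats baseline).
-- comparison_data is not created above: it is assumed to hold one row per date
-- with the held-out actual and both forecasts (actual, model_pred, baseline_pred).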
SELECT ts_mase(
LIST(actual ORDER BY date),
LIST(model_pred ORDER BY date),
LIST(baseline_pred ORDER BY date)
) AS mase
FROM comparison_data;
If MASE < 1, your model beats the baseline.
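The comparison can be sketched in Python as a ratio of two mean absolute errors. This assumes `ts_mase` divides the model's error by the baseline's error over the same held-out points, which matches the three-argument call above but is not confirmed by this page. (The textbook definition of MASE scales by the in-sample naive error instead.) The helper names and sample numbers are hypothetical.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actuals and predictions."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def scaled_error(actual, model_pred, baseline_pred):
    """Model MAE divided by baseline MAE; below 1 means the model wins.

    Hypothetical helper; assumed to mirror what ts_mase reports.
    """
    baseline_mae = mean_absolute_error(actual, baseline_pred)
    if baseline_mae == 0:
        raise ValueError("baseline error is zero; the ratio is undefined")
    return mean_absolute_error(actual, model_pred) / baseline_mae


# Made-up held-out values and forecasts.
actual = [10, 12, 14, 16]
model_pred = [11, 12, 13, 16]      # MAE = 0.5
baseline_pred = [10, 10, 10, 10]   # MAE = 3.0
print(scaled_error(actual, model_pred, baseline_pred))
```

A value of exactly 1 means the model and the baseline are equally accurate on the evaluated points.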
Frequently Asked Questions
Why should I always compare against a baseline model?
A baseline establishes the minimum accuracy bar. If a sophisticated model like AutoARIMA or TBATS cannot beat SeasonalNaive, it means the model is adding complexity without improving predictions. MASE (Mean Absolute Scaled Error) directly measures this: MASE < 1 means the model outperforms the baseline, MASE > 1 means it does not.
Which baseline should I use: Naive or SeasonalNaive?
Use Naive for data without clear seasonal patterns (random walk, financial data). Use SeasonalNaive for data with repeating patterns (weekly sales cycles, yearly seasonality). SeasonalNaive is the stronger baseline for most business time series because it captures the most obvious pattern: for daily data with weekly seasonality, the value from the same day last week.
Can baseline models produce prediction intervals?
Yes. All AnoFox forecast models, including baselines, return yhat_lower and yhat_upper columns with prediction intervals. For baselines, these intervals are derived from historical residual variance. For distribution-free intervals, which can also be tighter when residuals are not normally distributed, apply conformal prediction on top of baseline forecasts.
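As a rough illustration of the conformal idea, the Python sketch below builds a symmetric interval from absolute errors on a calibration set (split conformal prediction). It is not AnoFox's conformal API; the function names, the calibration data, and the choice to cap the rank at the largest score are all assumptions made for this example.

```python
import math


def conformal_half_width(calibration_actuals, calibration_forecasts, coverage=0.9):
    """Pick the interval half-width from sorted absolute calibration errors.

    Hypothetical helper for illustration; not part of the AnoFox API.
    """
    scores = sorted(abs(a - f) for a, f in zip(calibration_actuals, calibration_forecasts))
    n = len(scores)
    if n == 0:
        raise ValueError("calibration set must not be empty")
    # Finite-sample rank ceil((n + 1) * coverage). Simplification: when the
    # rank exceeds n it is capped at the largest observed score.
    rank = min(n, math.ceil((n + 1) * coverage))
    return scores[rank - 1]


def conformal_interval(point_forecasts, half_width):
    """Wrap each point forecast in a symmetric (lower, upper) interval."""
    return [(f - half_width, f + half_width) for f in point_forecasts]


# Made-up calibration errors of 1, 2, 3, 4 around a baseline forecast of 0.
width = conformal_half_width([1, 2, 3, 4], [0, 0, 0, 0], coverage=0.5)
print(conformal_interval([10.0], width))
```

The calibration forecasts must come from the same baseline that produces the point forecasts, and the calibration points must not have been used to produce them.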