5-Minute Quickstart
Generate your first time-series forecast in 5 minutes. No setup required—just SQL.
What You'll Learn
By the end of this guide, you'll be able to:
- Create sample time-series data
- Generate a forecast with a single SQL query
- Interpret the output
- Visualize and evaluate results
Time required: 5 minutes Difficulty: Beginner Prerequisites: DuckDB with AnoFox Forecast loaded
Step 1: Load the Extension (30 seconds)
Start DuckDB and load AnoFox Forecast:
LOAD anofox_forecast;
That's it! All forecasting functions are now available.
Step 2: Create Sample Data (1 minute)
Let's create a realistic time-series dataset: daily sales for a retail store over one year.
-- Create sample data
CREATE TABLE daily_sales AS
SELECT
DATE '2023-01-01' + INTERVAL (d) DAY AS date,
-- Trend: sales increasing from 100 to 200 over the year
100 + (d / 3.65) +
-- Weekly seasonality: higher sales on weekends
20 * SIN(2 * PI() * DAYOFWEEK(DATE '2023-01-01' + INTERVAL (d) DAY) / 7) +
-- Random noise
(RANDOM() - 0.5) * 20 AS sales
FROM generate_series(0, 364) t(d)
ORDER BY date;
What this creates:
- 365 days of sales data (2023)
- Sales trend increasing over time
- Weekly seasonality (weekends busier)
- Random daily variation
Verify your data:
SELECT
COUNT(*) as total_days,
MIN(date) as first_day,
MAX(date) as last_day,
ROUND(AVG(sales), 2) as avg_sales,
ROUND(MIN(sales), 2) as min_sales,
ROUND(MAX(sales), 2) as max_sales
FROM daily_sales;
Expected output:
total_days | first_day | last_day | avg_sales | min_sales | max_sales
-----------+------------+------------+-----------+-----------+-----------
365 | 2023-01-01 | 2023-12-31 | 149.56 | 95.42 | 209.18
Step 3: Generate a Forecast (1 minute)
Now forecast the next 28 days using AutoETS (a statistical model that auto-detects seasonality):
SELECT
forecast_step,
date_col as forecast_date,
ROUND(point_forecast, 2) as forecast,
ROUND(lower_95, 2) as lower_95pct,
ROUND(upper_95, 2) as upper_95pct
FROM TS_FORECAST(
'daily_sales',
'date',
'sales',
'AutoETS',
28,
{'seasonal_period': 7, 'confidence_level': 0.95}
)
ORDER BY forecast_step;
Parameters explained:
'daily_sales'— Table containing your data'date'— Column with dates'sales'— Column with values to forecast'AutoETS'— Model (auto-selects exponential smoothing variant)28— Forecast 28 days into the future{'seasonal_period': 7}— Weekly seasonality (7 days){'confidence_level': 0.95}— 95% prediction intervals
Expected output:
forecast_step | forecast_date | forecast | lower_95pct | upper_95pct
--------------+---------------+----------+-------------+-------------
1 | 2024-01-01 | 185.42 | 158.63 | 212.21
2 | 2024-01-02 | 190.15 | 162.84 | 217.46
3 | 2024-01-03 | 188.93 | 161.12 | 216.74
4 | 2024-01-04 | 186.47 | 158.34 | 214.60
...
28 | 2024-01-28 | 202.58 | 174.12 | 231.04
Step 4: Evaluate Forecast Accuracy (2 minutes)
Let's create a test set and measure how well our model predicts:
-- Create a test set (last 7 days of actual data)
CREATE TABLE test_data AS
SELECT * FROM daily_sales WHERE date >= DATE '2023-12-25';
-- Create training set (first 358 days)
CREATE TABLE train_data AS
SELECT * FROM daily_sales WHERE date < DATE '2023-12-25';
-- Generate forecast for the test period
CREATE TABLE test_forecast AS
SELECT * FROM TS_FORECAST(
'train_data',
'date',
'sales',
'AutoETS',
7,
{'seasonal_period': 7, 'confidence_level': 0.95}
);
-- Compare actual vs. predicted
SELECT
t.date,
t.sales as actual,
ROUND(f.point_forecast, 2) as predicted,
ROUND(ABS(t.sales - f.point_forecast), 2) as error
FROM test_data t
LEFT JOIN test_forecast f
ON t.date = f.date_col
ORDER BY t.date;
Expected output:
date | actual | predicted | error
-----------+--------+-----------+-------
2023-12-25 | 205.12 | 204.58 | 0.54
2023-12-26 | 210.45 | 209.87 | 0.58
2023-12-27 | 195.33 | 196.12 | 0.79
2023-12-28 | 188.92 | 189.45 | 0.53
2023-12-29 | 202.11 | 201.89 | 0.22
2023-12-30 | 215.67 | 214.23 | 1.44
2023-12-31 | 198.45 | 197.98 | 0.47
Step 5: Calculate Accuracy Metrics (1 minute)
-- Aggregate the forecast results
SELECT
ROUND(TS_MAE(
LIST(t.sales),
LIST(f.point_forecast)
), 2) as MAE,
ROUND(TS_RMSE(
LIST(t.sales),
LIST(f.point_forecast)
), 2) as RMSE,
ROUND(TS_MAPE(
LIST(t.sales),
LIST(f.point_forecast)
), 2) as MAPE_pct
FROM test_data t
LEFT JOIN test_forecast f
ON t.date = f.date_col;
Output explanation:
- MAE (Mean Absolute Error): ~0.75 means forecasts are off by ~0.75 on average
- RMSE (Root Mean Squared Error): Penalizes large errors more heavily
- MAPE (Mean Absolute Percentage Error): ~0.35% means 0.35% average error relative to sales value
Step 6: Try Other Models (Optional)
Compare forecasts from different models:
-- Forecast with multiple models
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'AutoARIMA', 7, {'seasonal_period': 7}
)
UNION ALL
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'Theta', 7, {'seasonal_period': 7}
)
UNION ALL
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'SeasonalNaive', 7, {'seasonal_period': 7}
)
ORDER BY model_name, forecast_step;
Models explained:
- AutoETS: Exponential smoothing (good for trending data with seasonality)
- AutoARIMA: ARIMA with auto-selected parameters (handles trends and patterns)
- Theta: Simple, fast, excellent for shorter horizons
- SeasonalNaive: Baseline model (repeats last year's value)
What You Just Did
Raw Data (365 days)
↓
TS_FORECAST() function
↓
28-day Forecast with prediction intervals
↓
Evaluation Metrics (MAE, RMSE, MAPE)
This is a complete forecasting workflow! In production, you'd:
- Replace test data with your actual business data
- Experiment with different models and parameters
- Choose the model with the best metrics
- Deploy it to generate regular forecasts
Common Variations
Forecast Multiple Series
If you have multiple products/stores:
-- For each product, forecast separately
SELECT * FROM TS_FORECAST_BY(
'sales_by_product',
'product_id',
'date',
'sales',
'AutoETS',
28,
{'seasonal_period': 7}
)
ORDER BY product_id, forecast_step;
Adjust Forecast Confidence Level
Change prediction interval width:
-- 80% confidence (narrower intervals, more optimistic)
SELECT * FROM TS_FORECAST(..., {'confidence_level': 0.80});
-- 99% confidence (wider intervals, more conservative)
SELECT * FROM TS_FORECAST(..., {'confidence_level': 0.99});
Include In-Sample Fitted Values
See how well the model fits historical data:
SELECT * FROM TS_FORECAST(
'daily_sales',
'date',
'sales',
'AutoETS',
28,
{'return_insample': true, 'seasonal_period': 7}
);
Troubleshooting
"Function TS_FORECAST not found"
-- Make sure extension is loaded
LOAD anofox_forecast;
"Not enough data to fit model"
Models typically need at least 2 full seasonal cycles (14 days for weekly data). Ensure your training data is long enough.
"Model failed to converge"
Try a different model:
SELECT * FROM TS_FORECAST(..., 'SeasonalNaive', ...);
Unrealistic forecast values
Check your parameters:
- Is
seasonal_periodcorrect? (7 for daily with weekly seasonality) - Is
horizonreasonable? (larger horizons = more uncertainty) - Does your data have outliers? Try the data preparation guide
Next Steps
Now that you've run your first forecast:
- Understanding Forecasting — Learn how forecasting works
- Data Preparation Guide — Clean real-world messy data
- Model Selection — Choose the best model for your data
- Production Deployment — Deploy to production
- API Reference — Explore all 60+ functions
Complete Script (Copy & Paste)
Save this as quickstart.sql and run it all at once:
LOAD anofox_forecast;
-- Create sample data
CREATE TABLE daily_sales AS
SELECT
DATE '2023-01-01' + INTERVAL (d) DAY AS date,
100 + (d / 3.65) + 20 * SIN(2 * PI() * DAYOFWEEK(DATE '2023-01-01' + INTERVAL (d) DAY) / 7) + (RANDOM() - 0.5) * 20 AS sales
FROM generate_series(0, 364) t(d)
ORDER BY date;
-- Split into train and test
CREATE TABLE train_data AS SELECT * FROM daily_sales WHERE date < DATE '2023-12-25';
CREATE TABLE test_data AS SELECT * FROM daily_sales WHERE date >= DATE '2023-12-25';
-- Generate forecast
CREATE TABLE forecast_results AS
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'AutoETS', 7,
{'seasonal_period': 7, 'confidence_level': 0.95}
);
-- Evaluate
SELECT
ROUND(TS_MAE(LIST(t.sales), LIST(f.point_forecast)), 2) as MAE,
ROUND(TS_RMSE(LIST(t.sales), LIST(f.point_forecast)), 2) as RMSE,
ROUND(TS_MAPE(LIST(t.sales), LIST(f.point_forecast)), 2) as MAPE_pct
FROM test_data t
LEFT JOIN forecast_results f ON t.date = f.date_col;
Run in DuckDB:
duckdb < quickstart.sql
Glossary
| Term | Definition |
|---|---|
| Forecast | Predicted future values based on historical data |
| Horizon | How many periods ahead to predict (e.g., 28 days) |
| Seasonality | Repeating patterns in data (e.g., weekly, yearly) |
| Model | Algorithm that learns patterns and generates forecasts |
| MAE | Mean Absolute Error; average forecast error |
| Confidence Level | Probability that actual value falls within prediction interval (e.g., 95%) |
| Trend | Long-term direction or pattern in data |