Skip to main content

5-Minute Quickstart

Generate your first time-series forecast in 5 minutes. No setup required—just SQL.

What You'll Learn

By the end of this guide, you'll be able to:

  • Create sample time-series data
  • Generate a forecast with a single SQL query
  • Interpret the output
  • Visualize and evaluate results

Time required: 5 minutes Difficulty: Beginner Prerequisites: DuckDB with AnoFox Forecast loaded


Step 1: Load the Extension (30 seconds)

Start DuckDB and load AnoFox Forecast:

LOAD anofox_forecast;

That's it! All forecasting functions are now available.


Step 2: Create Sample Data (1 minute)

Let's create a realistic time-series dataset: daily sales for a retail store over one year.

-- Create sample data
CREATE TABLE daily_sales AS
SELECT
DATE '2023-01-01' + INTERVAL (d) DAY AS date,

-- Trend: sales increasing from 100 to 200 over the year
100 + (d / 3.65) +

-- Weekly seasonality: higher sales on weekends
20 * SIN(2 * PI() * DAYOFWEEK(DATE '2023-01-01' + INTERVAL (d) DAY) / 7) +

-- Random noise
(RANDOM() - 0.5) * 20 AS sales

FROM generate_series(0, 364) t(d)
ORDER BY date;

What this creates:

  • 365 days of sales data (2023)
  • Sales trend increasing over time
  • Weekly seasonality (weekends busier)
  • Random daily variation

Verify your data:

SELECT
COUNT(*) as total_days,
MIN(date) as first_day,
MAX(date) as last_day,
ROUND(AVG(sales), 2) as avg_sales,
ROUND(MIN(sales), 2) as min_sales,
ROUND(MAX(sales), 2) as max_sales
FROM daily_sales;

Expected output:

total_days | first_day  | last_day   | avg_sales | min_sales | max_sales
-----------+------------+------------+-----------+-----------+-----------
365 | 2023-01-01 | 2023-12-31 | 149.56 | 95.42 | 209.18

Step 3: Generate a Forecast (1 minute)

Now forecast the next 28 days using AutoETS (a statistical model that auto-detects seasonality):

SELECT
forecast_step,
date_col as forecast_date,
ROUND(point_forecast, 2) as forecast,
ROUND(lower_95, 2) as lower_95pct,
ROUND(upper_95, 2) as upper_95pct
FROM TS_FORECAST(
'daily_sales',
'date',
'sales',
'AutoETS',
28,
{'seasonal_period': 7, 'confidence_level': 0.95}
)
ORDER BY forecast_step;

Parameters explained:

  • 'daily_sales' — Table containing your data
  • 'date' — Column with dates
  • 'sales' — Column with values to forecast
  • 'AutoETS' — Model (auto-selects exponential smoothing variant)
  • 28 — Forecast 28 days into the future
  • {'seasonal_period': 7} — Weekly seasonality (7 days)
  • {'confidence_level': 0.95} — 95% prediction intervals

Expected output:

forecast_step | forecast_date | forecast | lower_95pct | upper_95pct
--------------+---------------+----------+-------------+-------------
1 | 2024-01-01 | 185.42 | 158.63 | 212.21
2 | 2024-01-02 | 190.15 | 162.84 | 217.46
3 | 2024-01-03 | 188.93 | 161.12 | 216.74
4 | 2024-01-04 | 186.47 | 158.34 | 214.60
...
28 | 2024-01-28 | 202.58 | 174.12 | 231.04

Step 4: Evaluate Forecast Accuracy (2 minutes)

Let's create a test set and measure how well our model predicts:

-- Create a test set (last 7 days of actual data)
CREATE TABLE test_data AS
SELECT * FROM daily_sales WHERE date >= DATE '2023-12-25';

-- Create training set (first 358 days)
CREATE TABLE train_data AS
SELECT * FROM daily_sales WHERE date < DATE '2023-12-25';

-- Generate forecast for the test period
CREATE TABLE test_forecast AS
SELECT * FROM TS_FORECAST(
'train_data',
'date',
'sales',
'AutoETS',
7,
{'seasonal_period': 7, 'confidence_level': 0.95}
);

-- Compare actual vs. predicted
SELECT
t.date,
t.sales as actual,
ROUND(f.point_forecast, 2) as predicted,
ROUND(ABS(t.sales - f.point_forecast), 2) as error
FROM test_data t
LEFT JOIN test_forecast f
ON t.date = f.date_col
ORDER BY t.date;

Expected output:

date       | actual | predicted | error
-----------+--------+-----------+-------
2023-12-25 | 205.12 | 204.58 | 0.54
2023-12-26 | 210.45 | 209.87 | 0.58
2023-12-27 | 195.33 | 196.12 | 0.79
2023-12-28 | 188.92 | 189.45 | 0.53
2023-12-29 | 202.11 | 201.89 | 0.22
2023-12-30 | 215.67 | 214.23 | 1.44
2023-12-31 | 198.45 | 197.98 | 0.47

Step 5: Calculate Accuracy Metrics (1 minute)

-- Aggregate the forecast results
SELECT
ROUND(TS_MAE(
LIST(t.sales),
LIST(f.point_forecast)
), 2) as MAE,

ROUND(TS_RMSE(
LIST(t.sales),
LIST(f.point_forecast)
), 2) as RMSE,

ROUND(TS_MAPE(
LIST(t.sales),
LIST(f.point_forecast)
), 2) as MAPE_pct

FROM test_data t
LEFT JOIN test_forecast f
ON t.date = f.date_col;

Output explanation:

  • MAE (Mean Absolute Error): ~0.75 means forecasts are off by ~0.75 on average
  • RMSE (Root Mean Squared Error): Penalizes large errors more heavily
  • MAPE (Mean Absolute Percentage Error): ~0.35% means 0.35% average error relative to sales value

Step 6: Try Other Models (Optional)

Compare forecasts from different models:

-- Forecast with multiple models
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'AutoARIMA', 7, {'seasonal_period': 7}
)
UNION ALL
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'Theta', 7, {'seasonal_period': 7}
)
UNION ALL
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'SeasonalNaive', 7, {'seasonal_period': 7}
)
ORDER BY model_name, forecast_step;

Models explained:

  • AutoETS: Exponential smoothing (good for trending data with seasonality)
  • AutoARIMA: ARIMA with auto-selected parameters (handles trends and patterns)
  • Theta: Simple, fast, excellent for shorter horizons
  • SeasonalNaive: Baseline model (repeats last year's value)

What You Just Did

Raw Data (365 days)

TS_FORECAST() function

28-day Forecast with prediction intervals

Evaluation Metrics (MAE, RMSE, MAPE)

This is a complete forecasting workflow! In production, you'd:

  1. Replace test data with your actual business data
  2. Experiment with different models and parameters
  3. Choose the model with the best metrics
  4. Deploy it to generate regular forecasts

Common Variations

Forecast Multiple Series

If you have multiple products/stores:

-- For each product, forecast separately
SELECT * FROM TS_FORECAST_BY(
'sales_by_product',
'product_id',
'date',
'sales',
'AutoETS',
28,
{'seasonal_period': 7}
)
ORDER BY product_id, forecast_step;

Adjust Forecast Confidence Level

Change prediction interval width:

-- 80% confidence (narrower intervals, more optimistic)
SELECT * FROM TS_FORECAST(..., {'confidence_level': 0.80});

-- 99% confidence (wider intervals, more conservative)
SELECT * FROM TS_FORECAST(..., {'confidence_level': 0.99});

Include In-Sample Fitted Values

See how well the model fits historical data:

SELECT * FROM TS_FORECAST(
'daily_sales',
'date',
'sales',
'AutoETS',
28,
{'return_insample': true, 'seasonal_period': 7}
);

Troubleshooting

"Function TS_FORECAST not found"

-- Make sure extension is loaded
LOAD anofox_forecast;

"Not enough data to fit model"

Models typically need at least 2 full seasonal cycles (14 days for weekly data). Ensure your training data is long enough.

"Model failed to converge"

Try a different model:

SELECT * FROM TS_FORECAST(..., 'SeasonalNaive', ...);

Unrealistic forecast values

Check your parameters:

  • Is seasonal_period correct? (7 for daily with weekly seasonality)
  • Is horizon reasonable? (larger horizons = more uncertainty)
  • Does your data have outliers? Try the data preparation guide

Next Steps

Now that you've run your first forecast:

  1. Understanding Forecasting — Learn how forecasting works
  2. Data Preparation Guide — Clean real-world messy data
  3. Model Selection — Choose the best model for your data
  4. Production Deployment — Deploy to production
  5. API Reference — Explore all 60+ functions

Complete Script (Copy & Paste)

Save this as quickstart.sql and run it all at once:

LOAD anofox_forecast;

-- Create sample data
CREATE TABLE daily_sales AS
SELECT
DATE '2023-01-01' + INTERVAL (d) DAY AS date,
100 + (d / 3.65) + 20 * SIN(2 * PI() * DAYOFWEEK(DATE '2023-01-01' + INTERVAL (d) DAY) / 7) + (RANDOM() - 0.5) * 20 AS sales
FROM generate_series(0, 364) t(d)
ORDER BY date;

-- Split into train and test
CREATE TABLE train_data AS SELECT * FROM daily_sales WHERE date < DATE '2023-12-25';
CREATE TABLE test_data AS SELECT * FROM daily_sales WHERE date >= DATE '2023-12-25';

-- Generate forecast
CREATE TABLE forecast_results AS
SELECT * FROM TS_FORECAST(
'train_data', 'date', 'sales', 'AutoETS', 7,
{'seasonal_period': 7, 'confidence_level': 0.95}
);

-- Evaluate
SELECT
ROUND(TS_MAE(LIST(t.sales), LIST(f.point_forecast)), 2) as MAE,
ROUND(TS_RMSE(LIST(t.sales), LIST(f.point_forecast)), 2) as RMSE,
ROUND(TS_MAPE(LIST(t.sales), LIST(f.point_forecast)), 2) as MAPE_pct
FROM test_data t
LEFT JOIN forecast_results f ON t.date = f.date_col;

Run in DuckDB:

duckdb < quickstart.sql

Glossary

TermDefinition
ForecastPredicted future values based on historical data
HorizonHow many periods ahead to predict (e.g., 28 days)
SeasonalityRepeating patterns in data (e.g., weekly, yearly)
ModelAlgorithm that learns patterns and generates forecasts
MAEMean Absolute Error; average forecast error
Confidence LevelProbability that actual value falls within prediction interval (e.g., 95%)
TrendLong-term direction or pattern in data

Questions?

🍪 Cookie Settings