
Conformal Prediction

AnoFox provides 11 conformal prediction functions -- 7 scalar functions and 4 table macros -- for generating distribution-free prediction intervals with guaranteed coverage. Unlike parametric intervals that assume normal residuals, conformal prediction provides finite-sample valid coverage for any underlying distribution. The system supports both symmetric and asymmetric intervals (for skewed residuals common in demand forecasting), and includes evaluation via coverage rate, violation rate, mean interval width, and Winkler score.

Conformal prediction is a statistical framework that provides distribution-free prediction intervals with guaranteed coverage probability. Unlike parametric methods that assume residuals follow a specific distribution (e.g., normal), conformal prediction makes minimal assumptions about the underlying distribution and provides valid coverage even for finite samples.

| Function | Description | Type |
| --- | --- | --- |
| ts_conformal_quantile | Compute conformity score from residuals | Scalar |
| ts_conformal_intervals | Apply conformity score to create intervals | Scalar |
| ts_conformal_predict | Full split conformal prediction | Scalar |
| ts_conformal_predict_asymmetric | Asymmetric conformal for skewed residuals | Scalar |
| ts_mean_interval_width | Compute mean width of prediction intervals | Scalar |
| ts_conformal_coverage | Empirical coverage evaluation | Scalar |
| ts_conformal_evaluate | Full evaluation (coverage, width, Winkler) | Scalar |
| ts_conformal_by | High-level grouped conformal prediction | Table Macro |
| ts_conformal_calibrate | Calibrate conformity score from backtest table | Table Macro |
| ts_conformal_apply_by | Apply pre-computed score to forecast table | Table Macro |
| ts_interval_width_by | Compute interval widths for grouped results | Table Macro |

How It Works

The conformal prediction system operates in two phases:

  1. Calibration: Compute a conformity score from calibration residuals (actual - predicted)
  2. Prediction: Apply the conformity score to new forecasts to create prediction intervals

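The resulting intervals cover the true value with probability at least 1 - alpha, where alpha is the miscoverage rate, provided the calibration residuals are exchangeable with the future forecast errors (roughly, the error distribution does not drift between calibration and prediction).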
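The two phases can be sketched outside the database in a few lines of Python. This is a minimal illustration of textbook split conformal prediction, not AnoFox's implementation: the helper names `conformal_quantile` and `conformal_intervals` are hypothetical, and the rank rule `ceil((n + 1) * (1 - alpha))` is the standard finite-sample choice, which the extension may or may not use internally.

```python
import math

def conformal_quantile(residuals, alpha):
    """Phase 1 (calibration): conformity score from absolute residuals."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Standard split-conformal rank (assumed here, not confirmed for AnoFox).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few residuals for this alpha: only an unbounded interval is valid.
        return math.inf
    return scores[rank - 1]

def conformal_intervals(forecasts, score):
    """Phase 2 (prediction): symmetric band of half-width `score`."""
    lower = [f - score for f in forecasts]
    upper = [f + score for f in forecasts]
    return lower, upper

# Hypothetical calibration residuals (actual - predicted).
residuals = [1.2, -0.8, 1.5, -1.0, 0.5, 0.3, -0.4, 0.9, -1.1, 0.7]
score = conformal_quantile(residuals, alpha=0.2)
lower, upper = conformal_intervals([100.0, 102.0, 104.0], score)
print(score, lower, upper)
```

With ten residuals and alpha = 0.2 the rank is 9, so the score is the ninth-smallest absolute residual (1.2) and every interval is the forecast plus or minus 1.2.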


Scalar Functions

ts_conformal_quantile

Computes the empirical quantile of absolute residuals for split conformal prediction.

ts_conformal_quantile(residuals DOUBLE[], alpha DOUBLE) → DOUBLE

Parameters:

  • residuals: Array of residuals (actual - predicted) from calibration set
  • alpha: Miscoverage rate (0 < alpha < 1). Use 0.1 for 90% coverage, 0.05 for 95% coverage.

Returns: DOUBLE - The conformity score (quantile of absolute residuals)

Example:

SELECT ts_conformal_quantile(
[1.0, -0.5, 2.0, -1.5, 0.8],
0.1
) AS conformity_score;

ts_conformal_intervals

Applies a pre-computed conformity score to create symmetric prediction intervals.

ts_conformal_intervals(forecasts DOUBLE[], conformity_score DOUBLE)
→ STRUCT(lower DOUBLE[], upper DOUBLE[])

Parameters:

  • forecasts: Array of point forecasts
  • conformity_score: Pre-computed quantile from ts_conformal_quantile

Returns: STRUCT containing:

  • lower: Array of lower interval bounds
  • upper: Array of upper interval bounds

Example:

SELECT
(ts_conformal_intervals([100.0, 110.0, 120.0], 5.0)).lower AS lower,
(ts_conformal_intervals([100.0, 110.0, 120.0], 5.0)).upper AS upper;
-- Returns: lower = [95.0, 105.0, 115.0], upper = [105.0, 115.0, 125.0]

ts_conformal_predict

Full split conformal prediction: computes conformity score from residuals and applies to forecasts.

ts_conformal_predict(residuals DOUBLE[], forecasts DOUBLE[], alpha DOUBLE)
→ STRUCT(
point DOUBLE[],
lower DOUBLE[],
upper DOUBLE[],
coverage DOUBLE,
conformity_score DOUBLE,
method VARCHAR
)

Parameters:

  • residuals: Calibration residuals (actual - predicted)
  • forecasts: Point forecasts to generate intervals for
  • alpha: Miscoverage rate

Example:

WITH backtest_residuals AS (
SELECT [1.2, -0.8, 1.5, -1.0, 0.5]::DOUBLE[] AS residuals
),
future_forecasts AS (
SELECT [100.0, 102.0, 104.0]::DOUBLE[] AS forecasts
)
SELECT
(ts_conformal_predict(residuals, forecasts, 0.1)).*
FROM backtest_residuals, future_forecasts;

ts_conformal_predict_asymmetric

Asymmetric conformal prediction that computes separate upper and lower quantiles. Useful when residuals are not symmetric (e.g., skewed demand forecasts).

ts_conformal_predict_asymmetric(residuals DOUBLE[], forecasts DOUBLE[], alpha DOUBLE)
→ STRUCT(
point DOUBLE[],
lower DOUBLE[],
upper DOUBLE[],
coverage DOUBLE,
conformity_score DOUBLE,
method VARCHAR
)

Example:

-- Right-skewed residuals (actual - predicted): under-predictions are larger and more common
SELECT
(ts_conformal_predict_asymmetric(
[-0.5, -0.3, 0.2, 1.0, 2.5],
[100.0, 110.0],
0.1
)).*;
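One common way to build asymmetric intervals is to take separate lower and upper quantiles of the signed residuals, giving each tail half of the miscoverage budget. The sketch below assumes that construction; whether `ts_conformal_predict_asymmetric` uses exactly these ranks is not stated in this page, and the helper name `asymmetric_offsets` is hypothetical.

```python
import math

def asymmetric_offsets(residuals, alpha):
    """Return (lower_offset, upper_offset) from signed residuals."""
    r = sorted(residuals)
    n = len(r)
    # Each tail receives alpha / 2 of the miscoverage budget.
    lo_rank = math.floor((n + 1) * (alpha / 2))
    hi_rank = math.ceil((n + 1) * (1 - alpha / 2))
    # Fall back to unbounded offsets when there are too few residuals.
    lower_offset = r[lo_rank - 1] if lo_rank >= 1 else -math.inf
    upper_offset = r[hi_rank - 1] if hi_rank <= n else math.inf
    return lower_offset, upper_offset

# Hypothetical right-skewed residuals (actual - predicted).
residuals = [-0.5, -0.3, -0.2, 0.1, 0.2, 0.4, 1.0, 1.8, 2.5]
lower_offset, upper_offset = asymmetric_offsets(residuals, alpha=0.2)

forecasts = [100.0, 110.0]
lower = [f + lower_offset for f in forecasts]
upper = [f + upper_offset for f in forecasts]
print(lower, upper)
```

Here the band extends 0.5 below and 2.5 above each forecast, so the interval follows the skew of the residuals instead of being centered on the point forecast.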

ts_mean_interval_width

Computes the mean width of prediction intervals. Useful for comparing interval sharpness across models.

ts_mean_interval_width(lower DOUBLE[], upper DOUBLE[]) → DOUBLE

Example:

SELECT
ts_mean_interval_width([95.0, 105.0], [105.0, 115.0]) AS model_a_width,
ts_mean_interval_width([90.0, 100.0], [110.0, 120.0]) AS model_b_width;
-- Model A: 10.0 (sharper), Model B: 20.0 (wider)

ts_conformal_coverage

Computes empirical coverage: the fraction of actual values that fall within the prediction intervals.

ts_conformal_coverage(actuals DOUBLE[], lower DOUBLE[], upper DOUBLE[]) → DOUBLE

Example:

SELECT ts_conformal_coverage(
[100.0, 105.0, 110.0],
[95.0, 100.0, 105.0],
[105.0, 110.0, 115.0]
) AS empirical_coverage;
-- Returns: 1.0 (all actuals within intervals)

ts_conformal_evaluate

Full evaluation of prediction intervals: coverage, violation rate, mean width, and Winkler score.

ts_conformal_evaluate(actuals DOUBLE[], lower DOUBLE[], upper DOUBLE[], alpha DOUBLE)
→ STRUCT(
coverage DOUBLE,
violation_rate DOUBLE,
mean_width DOUBLE,
winkler_score DOUBLE,
n_observations INTEGER
)

Example:

SELECT (ts_conformal_evaluate(
[100.0, 105.0, 110.0, 95.0],
[97.0, 102.0, 107.0, 92.0],
[103.0, 108.0, 113.0, 98.0],
0.1
)).*;

Return fields:

  • coverage: Fraction of actuals within intervals (target: 1 - alpha)
  • violation_rate: Fraction of actuals outside intervals
  • mean_width: Average interval width (narrower = sharper)
  • winkler_score: Combined width + penalty for violations (lower = better)
  • n_observations: Number of observations evaluated
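To make the return fields concrete, the following Python sketch computes the same four metrics by hand. It uses the usual definition of the Winkler score (interval width plus 2 / alpha times the distance by which the actual misses the interval, averaged over observations); the function name `evaluate_intervals` is hypothetical, and this is an illustration rather than the extension's code, which may differ in details.

```python
def evaluate_intervals(actuals, lower, upper, alpha):
    """Coverage, violation rate, mean width and mean Winkler score."""
    n = len(actuals)
    covered = 0
    total_width = 0.0
    total_winkler = 0.0
    for y, lo, hi in zip(actuals, lower, upper):
        width = hi - lo
        total_width += width
        if y < lo:
            # Actual below the interval: width plus scaled miss distance.
            total_winkler += width + (2.0 / alpha) * (lo - y)
        elif y > hi:
            # Actual above the interval.
            total_winkler += width + (2.0 / alpha) * (y - hi)
        else:
            covered += 1
            total_winkler += width
    return {
        "coverage": covered / n,
        "violation_rate": 1 - covered / n,
        "mean_width": total_width / n,
        "winkler_score": total_winkler / n,
        "n_observations": n,
    }

# Hypothetical data: the last actual (95) falls 1 unit below its interval.
result = evaluate_intervals(
    actuals=[100.0, 105.0, 110.0, 95.0],
    lower=[97.0, 102.0, 107.0, 96.0],
    upper=[103.0, 108.0, 113.0, 98.0],
    alpha=0.1,
)
print(result)
```

In this example three of four actuals are covered (coverage 0.75), the mean width is 5.0, and the single miss adds a penalty of 20 to its width of 2, giving a mean Winkler score of 10.0.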

Table Macros

ts_conformal_by

High-level macro that performs conformal prediction on grouped backtest results.

ts_conformal_by(
backtest_results VARCHAR,
group_col COLUMN,
actual_col COLUMN,
forecast_col COLUMN,
point_forecast_col COLUMN,
params STRUCT
) → TABLE

Params Options:

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| alpha | DOUBLE | 0.1 | Miscoverage rate |
| method | VARCHAR | 'split' | 'split' or 'asymmetric' |

Example:

SELECT * FROM ts_conformal_by(
'backtest_results',
product_id,
actual,
forecast,
point_forecast,
{'alpha': 0.1, 'method': 'split'}
);

ts_conformal_calibrate

Calibrates a conformity score from backtest residuals.

ts_conformal_calibrate(
backtest_results VARCHAR,
actual_col COLUMN,
forecast_col COLUMN,
params STRUCT
) → TABLE(conformity_score DOUBLE, coverage DOUBLE, n_residuals INTEGER)

Returns: Single row table with:

  • conformity_score: Computed quantile for intervals
  • coverage: Theoretical coverage probability
  • n_residuals: Number of residuals used

Example:

SELECT * FROM ts_conformal_calibrate(
'backtest_results',
actual,
forecast,
{'alpha': 0.05}
);

ts_conformal_apply_by

Applies a pre-computed conformity score to grouped forecast results.

ts_conformal_apply_by(
forecast_results VARCHAR,
group_col COLUMN,
forecast_col COLUMN,
conformity_score DOUBLE
) → TABLE

Example:

-- Two-step workflow: calibrate then apply
CREATE TABLE calibration AS
SELECT * FROM ts_conformal_calibrate('backtest', actual, forecast, {'alpha': 0.1});

SELECT * FROM ts_conformal_apply_by(
'future_forecasts',
product_id,
yhat,
(SELECT conformity_score FROM calibration)
);

ts_interval_width_by

Computes mean interval width for grouped forecast results.

ts_interval_width_by(
results VARCHAR,
group_col COLUMN,
lower_col COLUMN,
upper_col COLUMN
) → TABLE(group_col, mean_width DOUBLE, n_intervals BIGINT)

Example:

SELECT * FROM ts_interval_width_by(
'forecast_results',
product_id,
lower,
upper
);

Complete Workflow

-- Step 1: Generate backtest forecasts
CREATE TABLE cv_folds AS
SELECT * FROM ts_cv_folds_by('sales_data', store_id, date, revenue, 5, 7, MAP{});

CREATE TABLE backtest_results AS
SELECT * FROM ts_cv_forecast_by('cv_folds', store_id, date, revenue, 'AutoETS',
MAP{'seasonal_period': '7'});

-- Step 2: Calibrate conformity score (90% coverage)
CREATE TABLE calibration AS
SELECT * FROM ts_conformal_calibrate(
'backtest_results',
y,
yhat,
{'alpha': 0.1}
);

-- Step 3: Generate future forecasts
CREATE TABLE future_forecasts AS
SELECT * FROM ts_forecast_by(
'sales_data', store_id, date, revenue,
'AutoETS', 7, '1d', MAP{'seasonal_period': '7'}
);

-- Step 4: Apply conformal intervals
SELECT * FROM ts_conformal_apply_by(
'future_forecasts',
store_id,
yhat,
(SELECT conformity_score FROM calibration)
);

-- Step 5: Evaluate interval quality
SELECT (ts_conformal_evaluate(
LIST(y ORDER BY ds),
LIST(yhat_lower ORDER BY ds),
LIST(yhat_upper ORDER BY ds),
0.1
)).*
FROM backtest_results;

When to Use Conformal Prediction

| Scenario | Recommendation |
| --- | --- |
| Need guaranteed coverage | Use conformal prediction |
| Non-normal residuals | Use asymmetric conformal |
| Comparing model uncertainty | Use ts_mean_interval_width |
| Production intervals | Calibrate on backtest, apply to future |
| Small calibration set | Use conformal (valid for any n) |
| Evaluating interval quality | Use ts_conformal_evaluate for Winkler scores |

Comparison with Parametric Intervals

| Aspect | Parametric | Conformal |
| --- | --- | --- |
| Distribution assumption | Requires normality | Distribution-free |
| Coverage guarantee | Asymptotic | Finite-sample valid |
| Interval shape | Always symmetric | Can be asymmetric |
| Calibration data | Not required | Required |
| Computation | Analytical | Empirical quantiles |

The practical advantage of conformal prediction is significant for S&OP planning: parametric intervals assume normal residuals, an assumption that rarely holds for real-world demand data exhibiting skewness, heavy tails, and heteroscedasticity. Conformal prediction avoids the normality assumption entirely. The 4-step production workflow (backtest with cross-validation, calibrate a conformity score, generate future forecasts, apply conformal intervals) runs entirely in SQL and produces intervals with finite-sample coverage guarantees for any forecast model in the AnoFox library.


Frequently Asked Questions

What does "distribution-free" mean in practice?

Distribution-free means conformal prediction does not assume your forecast errors follow a normal (Gaussian) distribution. It uses the empirical distribution of calibration residuals to construct intervals. This makes it valid for skewed, heavy-tailed, or otherwise non-standard error distributions that are common in real-world demand forecasting.

How many calibration residuals do I need?

Conformal prediction is theoretically valid for any sample size, but practical accuracy improves with more calibration data. For 90% coverage (alpha=0.1), at least 20-30 residuals give reasonable results. For 95% coverage (alpha=0.05), aim for 50+ residuals. Use ts_conformal_evaluate to verify that empirical coverage matches the target.
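As a rough lower bound on calibration size, the standard split-conformal rank `ceil((n + 1) * (1 - alpha))` must not exceed n, otherwise the only valid interval is unbounded. That condition simplifies to n >= (1 - alpha) / alpha. The sketch below computes this floor; it assumes the standard rank rule (not confirmed for AnoFox), and the function name `min_calibration_size` is hypothetical.

```python
import math
from fractions import Fraction

def min_calibration_size(alpha):
    """Smallest n for which ceil((n + 1) * (1 - alpha)) <= n."""
    # Fraction avoids floating-point rounding in (1 - alpha) / alpha.
    a = Fraction(str(alpha))
    return math.ceil((1 - a) / a)

for alpha in (0.1, 0.05, 0.01):
    print(alpha, min_calibration_size(alpha))
```

This floor (9 residuals for alpha = 0.1, 19 for alpha = 0.05) only guarantees a finite interval. At the floor the score is simply the largest absolute residual, which is why the larger sample sizes suggested above give more stable intervals.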

When should I use asymmetric vs. symmetric conformal prediction?

Use symmetric conformal (ts_conformal_predict) when your forecast errors are roughly symmetric around zero. Use asymmetric conformal (ts_conformal_predict_asymmetric) when errors are skewed, which is common in demand forecasting where over-predictions and under-predictions have different magnitudes. Asymmetric intervals produce tighter, more realistic bounds for skewed data.

What is the Winkler score and how do I interpret it?

The Winkler score is a combined metric that penalizes both interval width and coverage violations. A narrow interval that covers all actuals gets a low (good) score. Violations incur a penalty proportional to how far the actual falls outside the interval. Lower Winkler scores indicate better interval quality. Use it to compare different models' prediction intervals.
