Conformal Prediction
AnoFox provides 11 conformal prediction functions -- 7 scalar functions and 4 table macros -- for generating distribution-free prediction intervals with guaranteed coverage. Unlike parametric intervals that assume normal residuals, conformal prediction provides finite-sample valid coverage for any underlying distribution. The system supports both symmetric and asymmetric intervals (for skewed residuals common in demand forecasting), and includes evaluation via coverage rate, violation rate, mean interval width, and Winkler score.
Conformal prediction is a statistical framework that provides distribution-free prediction intervals with guaranteed coverage probability. Unlike parametric methods that assume residuals follow a specific distribution (e.g., normal), conformal prediction makes minimal assumptions about the underlying distribution and provides valid coverage even for finite samples.
| Function | Description | Type |
|---|---|---|
| ts_conformal_quantile | Compute conformity score from residuals | Scalar |
| ts_conformal_intervals | Apply conformity score to create intervals | Scalar |
| ts_conformal_predict | Full split conformal prediction | Scalar |
| ts_conformal_predict_asymmetric | Asymmetric conformal for skewed residuals | Scalar |
| ts_mean_interval_width | Compute mean width of prediction intervals | Scalar |
| ts_conformal_coverage | Empirical coverage evaluation | Scalar |
| ts_conformal_evaluate | Full evaluation (coverage, width, Winkler) | Scalar |
| ts_conformal_by | High-level grouped conformal prediction | Table Macro |
| ts_conformal_calibrate | Calibrate conformity score from backtest table | Table Macro |
| ts_conformal_apply_by | Apply pre-computed score to forecast table | Table Macro |
| ts_interval_width_by | Compute interval widths for grouped results | Table Macro |
How It Works
The conformal prediction system operates in two phases:
- Calibration: Compute a conformity score from calibration residuals (actual - predicted)
- Prediction: Apply the conformity score to new forecasts to create prediction intervals
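Provided the calibration residuals and future forecast errors are exchangeable (the standard conformal assumption), the resulting intervals cover the true value with probability at least 1 - alpha, where alpha is the miscoverage rate.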
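The two phases can be sketched in a few lines of Python. This is an illustrative sketch, not the AnoFox implementation: the function names `calibrate` and `predict` are hypothetical, and the rank rule ceil((n + 1)(1 - alpha)) is the standard split-conformal choice, which the extension may or may not use for its quantile.

```python
import math

def calibrate(residuals, alpha):
    """Phase 1: conformity score = finite-sample quantile of |residuals|."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    # Standard split-conformal rank (assumed here). round() guards against
    # floating-point noise such as 8.000000000000002 before the ceiling.
    rank = math.ceil(round((n + 1) * (1 - alpha), 9))
    # With too few residuals the rank exceeds n, and only an infinite
    # interval can carry the coverage guarantee.
    return scores[rank - 1] if rank <= n else math.inf

def predict(forecasts, score):
    """Phase 2: symmetric interval (forecast - score, forecast + score)."""
    return [f - score for f in forecasts], [f + score for f in forecasts]

residuals = [1.0, -0.5, 2.0, -1.5, 0.8, -0.2, 0.4, -1.1, 0.6, 0.3]
score = calibrate(residuals, alpha=0.2)  # 9th smallest of 10 absolute residuals
lower, upper = predict([100.0, 110.0], score)
print(score, lower, upper)  # 1.5 [98.5, 108.5] [101.5, 111.5]
```

The score is computed once from the calibration set and reused for every future forecast, which is why calibration and prediction can run as separate steps.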
Scalar Functions
ts_conformal_quantile
Computes the empirical quantile of absolute residuals for split conformal prediction.
ts_conformal_quantile(residuals DOUBLE[], alpha DOUBLE) → DOUBLE
Parameters:
- residuals: Array of residuals (actual - predicted) from the calibration set
- alpha: Miscoverage rate (0 < alpha < 1). Use 0.1 for 90% coverage, 0.05 for 95% coverage.
Returns: DOUBLE - The conformity score (quantile of absolute residuals)
Example:
SELECT ts_conformal_quantile(
[1.0, -0.5, 2.0, -1.5, 0.8],
0.1
) AS conformity_score;
ts_conformal_intervals
Applies a pre-computed conformity score to create symmetric prediction intervals.
ts_conformal_intervals(forecasts DOUBLE[], conformity_score DOUBLE)
→ STRUCT(lower DOUBLE[], upper DOUBLE[])
Parameters:
- forecasts: Array of point forecasts
- conformity_score: Pre-computed quantile from ts_conformal_quantile
Returns: STRUCT containing:
- lower: Array of lower interval bounds
- upper: Array of upper interval bounds
Example:
SELECT
(ts_conformal_intervals([100.0, 110.0, 120.0], 5.0)).lower AS lower,
(ts_conformal_intervals([100.0, 110.0, 120.0], 5.0)).upper AS upper;
-- Returns: lower = [95.0, 105.0, 115.0], upper = [105.0, 115.0, 125.0]
ts_conformal_predict
Full split conformal prediction: computes conformity score from residuals and applies to forecasts.
ts_conformal_predict(residuals DOUBLE[], forecasts DOUBLE[], alpha DOUBLE)
→ STRUCT(
point DOUBLE[],
lower DOUBLE[],
upper DOUBLE[],
coverage DOUBLE,
conformity_score DOUBLE,
method VARCHAR
)
Parameters:
- residuals: Calibration residuals (actual - predicted)
- forecasts: Point forecasts to generate intervals for
- alpha: Miscoverage rate
Example:
WITH backtest_residuals AS (
SELECT [1.2, -0.8, 1.5, -1.0, 0.5]::DOUBLE[] AS residuals
),
future_forecasts AS (
SELECT [100.0, 102.0, 104.0]::DOUBLE[] AS forecasts
)
SELECT
(ts_conformal_predict(residuals, forecasts, 0.1)).*
FROM backtest_residuals, future_forecasts;
ts_conformal_predict_asymmetric
Asymmetric conformal prediction that computes separate upper and lower quantiles. Useful when residuals are not symmetric (e.g., skewed demand forecasts).
ts_conformal_predict_asymmetric(residuals DOUBLE[], forecasts DOUBLE[], alpha DOUBLE)
→ STRUCT(
point DOUBLE[],
lower DOUBLE[],
upper DOUBLE[],
coverage DOUBLE,
conformity_score DOUBLE,
method VARCHAR
)
Example:
-- Right-skewed residuals (actual - predicted): under-predictions are larger than over-predictions
SELECT
(ts_conformal_predict_asymmetric(
[-0.5, -0.3, 0.2, 1.0, 2.5],
[100.0, 110.0],
0.1
)).*;
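To make "separate upper and lower quantiles" concrete, here is a minimal Python sketch. It rests on assumptions the source does not confirm: the miscoverage budget is split evenly (alpha / 2 per tail), the offsets are finite-sample order statistics of the signed residuals, and the function name `asymmetric_intervals` is hypothetical.

```python
import math

def asymmetric_intervals(residuals, forecasts, alpha):
    """Separate lower/upper offsets from signed residuals (actual - predicted)."""
    signed = sorted(residuals)
    n = len(signed)
    # Assumption: alpha / 2 of the miscoverage budget goes to each tail.
    # round() guards against floating-point noise before floor/ceil.
    lo_rank = math.floor(round((n + 1) * (alpha / 2), 9))
    hi_rank = math.ceil(round((n + 1) * (1 - alpha / 2), 9))
    # If a rank falls outside 1..n, that side of the interval is unbounded.
    lo_offset = signed[lo_rank - 1] if lo_rank >= 1 else -math.inf
    hi_offset = signed[hi_rank - 1] if hi_rank <= n else math.inf
    lower = [f + lo_offset for f in forecasts]
    upper = [f + hi_offset for f in forecasts]
    return lower, upper

# Right-skewed residuals: the positive tail is much longer than the negative one.
residuals = [-0.5, -0.3, -0.2, -0.1, 0.2, 0.4, 1.0, 1.8, 2.5]
lower, upper = asymmetric_intervals(residuals, [100.0, 110.0], alpha=0.2)
print(lower, upper)  # [99.5, 109.5] [102.5, 112.5]
```

The interval extends 0.5 below and 2.5 above each forecast, mirroring the skew in the residuals instead of forcing equal margins on both sides.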
ts_mean_interval_width
Computes the mean width of prediction intervals. Useful for comparing interval sharpness across models.
ts_mean_interval_width(lower DOUBLE[], upper DOUBLE[]) → DOUBLE
Example:
SELECT
ts_mean_interval_width([95.0, 105.0], [105.0, 115.0]) AS model_a_width,
ts_mean_interval_width([90.0, 100.0], [110.0, 120.0]) AS model_b_width;
-- Model A: 10.0 (sharper), Model B: 20.0 (wider)
ts_conformal_coverage
Computes empirical coverage: the fraction of actual values that fall within the prediction intervals.
ts_conformal_coverage(actuals DOUBLE[], lower DOUBLE[], upper DOUBLE[]) → DOUBLE
Example:
SELECT ts_conformal_coverage(
[100.0, 105.0, 110.0],
[95.0, 100.0, 105.0],
[105.0, 110.0, 115.0]
) AS empirical_coverage;
-- Returns: 1.0 (all actuals within intervals)
ts_conformal_evaluate
Full evaluation of prediction intervals: coverage, violation rate, mean width, and Winkler score.
ts_conformal_evaluate(actuals DOUBLE[], lower DOUBLE[], upper DOUBLE[], alpha DOUBLE)
→ STRUCT(
coverage DOUBLE,
violation_rate DOUBLE,
mean_width DOUBLE,
winkler_score DOUBLE,
n_observations INTEGER
)
Example:
SELECT (ts_conformal_evaluate(
[100.0, 105.0, 110.0, 95.0],
[97.0, 102.0, 107.0, 92.0],
[103.0, 108.0, 113.0, 98.0],
0.1
)).*;
Return fields:
- coverage: Fraction of actuals within intervals (target: 1 - alpha)
- violation_rate: Fraction of actuals outside intervals
- mean_width: Average interval width (narrower = sharper)
- winkler_score: Combined width + penalty for violations (lower = better)
- n_observations: Number of observations evaluated
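The following Python sketch shows one way these fields could be computed. It assumes the standard Winkler definition (interval width plus 2 / alpha times the distance by which the actual falls outside), averages the score over observations, and counts values on the boundary as covered. None of these details are confirmed by the source, and `evaluate_intervals` is a hypothetical name.

```python
def evaluate_intervals(actuals, lower, upper, alpha):
    """Coverage, violation rate, mean width, and mean Winkler score."""
    n = len(actuals)
    covered = 0
    total_width = 0.0
    total_winkler = 0.0
    for y, lo, hi in zip(actuals, lower, upper):
        width = hi - lo
        penalty = 0.0
        if y < lo:
            penalty = (2.0 / alpha) * (lo - y)  # actual below the interval
        elif y > hi:
            penalty = (2.0 / alpha) * (y - hi)  # actual above the interval
        else:
            covered += 1
        total_width += width
        total_winkler += width + penalty
    coverage = covered / n
    return {
        "coverage": coverage,
        "violation_rate": 1.0 - coverage,
        "mean_width": total_width / n,
        "winkler_score": total_winkler / n,
        "n_observations": n,
    }

# Same bounds as the SQL example, but the last actual (99.0) exceeds its upper bound (98.0).
result = evaluate_intervals(
    actuals=[100.0, 105.0, 110.0, 99.0],
    lower=[97.0, 102.0, 107.0, 92.0],
    upper=[103.0, 108.0, 113.0, 98.0],
    alpha=0.1,
)
print(result)
```

In this sketch three intervals score 6.0 (width only) and the violated one scores 6.0 + 20 × 1.0 = 26.0, so the mean Winkler score is 11.0 while the mean width stays at 6.0.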
Table Macros
ts_conformal_by
High-level macro that performs conformal prediction on grouped backtest results.
ts_conformal_by(
backtest_results VARCHAR,
group_col COLUMN,
actual_col COLUMN,
forecast_col COLUMN,
point_forecast_col COLUMN,
params STRUCT
) → TABLE
Params Options:
| Key | Type | Default | Description |
|---|---|---|---|
| alpha | DOUBLE | 0.1 | Miscoverage rate |
| method | VARCHAR | 'split' | 'split' or 'asymmetric' |
Example:
SELECT * FROM ts_conformal_by(
'backtest_results',
product_id,
actual,
forecast,
point_forecast,
{'alpha': 0.1, 'method': 'split'}
);
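As a rough mental model of grouped conformal prediction, the sketch below calibrates one score per group and applies it to that group's forecasts. This is an assumption about what "grouped" means here; the source does not say whether scores are computed per group or pooled. The row layouts and the name `conformal_by_group` are hypothetical, and only the symmetric 'split' method is shown.

```python
import math
from collections import defaultdict

def conformity_score(residuals, alpha):
    """Finite-sample quantile of absolute residuals (standard split-conformal rank)."""
    scores = sorted(abs(r) for r in residuals)
    n = len(scores)
    rank = math.ceil(round((n + 1) * (1 - alpha), 9))
    return scores[rank - 1] if rank <= n else math.inf

def conformal_by_group(backtest_rows, forecast_rows, alpha):
    """backtest_rows: (group, actual, forecast); forecast_rows: (group, point_forecast)."""
    residuals = defaultdict(list)
    for group, actual, forecast in backtest_rows:
        residuals[group].append(actual - forecast)
    # One conformity score per group (assumed behavior).
    scores = {g: conformity_score(r, alpha) for g, r in residuals.items()}
    return [(g, p, p - scores[g], p + scores[g]) for g, p in forecast_rows]

backtest = [
    ("A", 10.0, 9.0), ("A", 12.0, 13.0), ("A", 11.0, 9.0), ("A", 10.0, 10.5),
    ("B", 100.0, 95.0), ("B", 110.0, 112.0), ("B", 105.0, 102.0), ("B", 98.0, 99.0),
]
future = [("A", 11.0), ("B", 104.0)]
for row in conformal_by_group(backtest, future, alpha=0.2):
    print(row)
# ('A', 11.0, 9.0, 13.0)
# ('B', 104.0, 99.0, 109.0)
```

Per-group scores keep a noisy series from widening the intervals of a stable one, at the cost of needing enough backtest residuals in every group.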
ts_conformal_calibrate
Calibrates a conformity score from backtest residuals.
ts_conformal_calibrate(
backtest_results VARCHAR,
actual_col COLUMN,
forecast_col COLUMN,
params STRUCT
) → TABLE(conformity_score DOUBLE, coverage DOUBLE, n_residuals INTEGER)
Returns: Single row table with:
- conformity_score: Computed quantile for intervals
- coverage: Theoretical coverage probability
- n_residuals: Number of residuals used
Example:
SELECT * FROM ts_conformal_calibrate(
'backtest_results',
actual,
forecast,
{'alpha': 0.05}
);
ts_conformal_apply_by
Applies a pre-computed conformity score to grouped forecast results.
ts_conformal_apply_by(
forecast_results VARCHAR,
group_col COLUMN,
forecast_col COLUMN,
conformity_score DOUBLE
) → TABLE
Example:
-- Two-step workflow: calibrate then apply
CREATE TABLE calibration AS
SELECT * FROM ts_conformal_calibrate('backtest', actual, forecast, {'alpha': 0.1});
SELECT * FROM ts_conformal_apply_by(
'future_forecasts',
product_id,
yhat,
(SELECT conformity_score FROM calibration)
);
ts_interval_width_by
Computes mean interval width for grouped forecast results.
ts_interval_width_by(
results VARCHAR,
group_col COLUMN,
lower_col COLUMN,
upper_col COLUMN
) → TABLE(group_col, mean_width DOUBLE, n_intervals BIGINT)
Example:
SELECT * FROM ts_interval_width_by(
'forecast_results',
product_id,
lower,
upper
);
Complete Workflow
-- Step 1: Generate backtest forecasts
CREATE TABLE cv_folds AS
SELECT * FROM ts_cv_folds_by('sales_data', store_id, date, revenue, 5, 7, MAP{});
CREATE TABLE backtest_results AS
SELECT * FROM ts_cv_forecast_by('cv_folds', store_id, date, revenue, 'AutoETS',
MAP{'seasonal_period': '7'});
-- Step 2: Calibrate conformity score (90% coverage)
CREATE TABLE calibration AS
SELECT * FROM ts_conformal_calibrate(
'backtest_results',
y,
yhat,
{'alpha': 0.1}
);
-- Step 3: Generate future forecasts
CREATE TABLE future_forecasts AS
SELECT * FROM ts_forecast_by(
'sales_data', store_id, date, revenue,
'AutoETS', 7, '1d', MAP{'seasonal_period': '7'}
);
-- Step 4: Apply conformal intervals
SELECT * FROM ts_conformal_apply_by(
'future_forecasts',
store_id,
yhat,
(SELECT conformity_score FROM calibration)
);
-- Step 5: Evaluate interval quality
SELECT (ts_conformal_evaluate(
LIST(y ORDER BY ds),
LIST(yhat_lower ORDER BY ds),
LIST(yhat_upper ORDER BY ds),
0.1
)).*
FROM backtest_results;
When to Use Conformal Prediction
| Scenario | Recommendation |
|---|---|
| Need guaranteed coverage | Use conformal prediction |
| Non-normal residuals | Use asymmetric conformal |
| Comparing model uncertainty | Use ts_mean_interval_width |
| Production intervals | Calibrate on backtest, apply to future |
| Small calibration set | Use conformal (valid for any n) |
| Evaluating interval quality | Use ts_conformal_evaluate for Winkler scores |
Comparison with Parametric Intervals
| Aspect | Parametric | Conformal |
|---|---|---|
| Distribution assumption | Requires normality | Distribution-free |
| Coverage guarantee | Asymptotic | Finite-sample valid |
| Interval shape | Always symmetric | Can be asymmetric |
| Calibration data | Not required | Required |
| Computation | Analytical | Empirical quantiles |
The practical advantage of conformal prediction is significant for S&OP planning: parametric intervals assume normal residuals, an assumption that rarely holds for real-world demand data, which exhibits skewness, heavy tails, and heteroscedasticity. Conformal prediction sidesteps these assumptions entirely. The 4-step production workflow -- backtest with cross-validation, calibrate a conformity score, generate future forecasts, apply conformal intervals -- runs as five SQL statements (the backtest step takes two) and produces intervals with mathematically guaranteed coverage for any forecast model in the AnoFox library.
Frequently Asked Questions
What does "distribution-free" mean in practice?
Distribution-free means conformal prediction does not assume your forecast errors follow a normal (Gaussian) distribution. It uses the empirical distribution of calibration residuals to construct intervals. This makes it valid for skewed, heavy-tailed, or otherwise non-standard error distributions that are common in real-world demand forecasting.
How many calibration residuals do I need?
Conformal prediction is theoretically valid for any sample size, but practical accuracy improves with more calibration data. For 90% coverage (alpha=0.1), at least 20-30 residuals give reasonable results. For 95% coverage (alpha=0.05), aim for 50+ residuals. Use ts_conformal_evaluate to verify that empirical coverage matches the target.
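The sample-size guidance can be related to the standard split-conformal rank, ceil((n + 1)(1 - alpha)). The sketch below assumes that rank rule (the source does not state which rule AnoFox uses) and takes alpha as a decimal string so the arithmetic is exact. All function names are hypothetical.

```python
import math
from fractions import Fraction

def conformal_rank(n, alpha):
    """Order-statistic rank ceil((n + 1)(1 - alpha)), computed exactly."""
    return math.ceil((n + 1) * (1 - Fraction(alpha)))

def min_calibration_size(alpha):
    """Smallest n for which the rank does not exceed n (finite interval)."""
    n = 1
    while conformal_rank(n, alpha) > n:
        n += 1
    return n

def expected_coverage(n, alpha):
    """rank / (n + 1): marginal coverage under exchangeability with no ties."""
    return Fraction(conformal_rank(n, alpha), n + 1)

print(min_calibration_size("0.1"))   # 9
print(min_calibration_size("0.05"))  # 19
print(expected_coverage(20, "0.1"))  # 19/21
print(expected_coverage(30, "0.05")) # 30/31
```

Below these minimum sizes the guarantee can only be met by an unbounded interval. Above them, expected coverage overshoots the target by at most 1 / (n + 1), so a larger calibration set yields intervals closer to the nominal level.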
When should I use asymmetric vs. symmetric conformal prediction?
Use symmetric conformal (ts_conformal_predict) when your forecast errors are roughly symmetric around zero. Use asymmetric conformal (ts_conformal_predict_asymmetric) when errors are skewed, which is common in demand forecasting where over-predictions and under-predictions have different magnitudes. Asymmetric intervals produce tighter, more realistic bounds for skewed data.
What is the Winkler score and how do I interpret it?
The Winkler score is a combined metric that penalizes both interval width and coverage violations. A narrow interval that covers all actuals gets a low (good) score. Violations incur a penalty proportional to how far the actual falls outside the interval. Lower Winkler scores indicate better interval quality. Use it to compare different models' prediction intervals.
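For reference, the usual definition of the Winkler score for a single observation y with interval bounds l and u at miscoverage rate alpha is shown below. It is stated here as the standard textbook form; the source does not confirm that AnoFox uses exactly this expression or how it aggregates across observations.

```latex
W_\alpha(y, \ell, u) = (u - \ell)
  + \frac{2}{\alpha}\,(\ell - y)\,\mathbf{1}[y < \ell]
  + \frac{2}{\alpha}\,(y - u)\,\mathbf{1}[y > u]
```

For example, with the interval [90, 110] and alpha = 0.1, an actual of 105 scores 20 (width only), while an actual of 112 scores 20 + 20 × 2 = 60.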