Changepoint Detection
Find structural breaks and regime shifts in your time series data using Bayesian Online Changepoint Detection.
ts_detect_changepoints_by
Detect structural breaks in grouped time series. Also aliased as ts_detect_changepoints.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
source | VARCHAR | Yes | - | Source table name |
group_col | COLUMN | Yes | - | Group column (unquoted) |
date_col | COLUMN | Yes | - | Date column (unquoted) |
value_col | COLUMN | Yes | - | Value column (unquoted) |
params | MAP | No | MAP | Configuration |
Params MAP Options:
| Option | Type | Default | Description |
|---|---|---|---|
hazard_lambda | DOUBLE | 250.0 | Hazard rate. Lower = more changepoints detected |
Output
Returns one row per data point:
| Column | Type | Description |
|---|---|---|
group_col | (input) | Series identifier (name preserved) |
date_col | TIMESTAMP | Date (name preserved) |
is_changepoint | BOOLEAN | Detected changepoint? |
changepoint_probability | DOUBLE | Probability (0-1) |
Row preservation: output rows = input rows. NULL dates, groups with < 2 points, or errors get is_changepoint=false, changepoint_probability=NULL.
Examples
-- Detect changepoints across products
SELECT
product_id,
date,
is_changepoint,
changepoint_probability
FROM ts_detect_changepoints_by(
'sales_by_product',
product_id,
date,
sales,
MAP{'hazard_lambda': '250'}
)
WHERE is_changepoint = true;
-- Count changepoints per series
SELECT product_id, COUNT(*) FILTER (WHERE is_changepoint) AS n_changepoints
FROM ts_detect_changepoints_by('sales', product_id, date, quantity, MAP{'hazard_lambda': '100'})
GROUP BY product_id;
Sensitivity Tuning
hazard_lambda | Sensitivity | Use Case |
|---|---|---|
| 50-100 | High | Find subtle changes |
| 250 (default) | Medium | General purpose |
| 500-1000 | Low | Only major regime changes |
Handling Changepoints in Forecasting
When changepoints are detected, you can use them to improve forecast accuracy.
Option 1: Filter to Recent Regime
-- Find last changepoint
WITH last_change AS (
SELECT MAX(date) as change_date
FROM ts_detect_changepoints_by('sales_data', product_id, date, sales, MAP{})
WHERE is_changepoint = true
)
-- Forecast using only data after last changepoint
CREATE TABLE recent_data AS
SELECT * FROM sales_data WHERE date > (SELECT change_date FROM last_change);
SELECT * FROM ts_forecast_by(
'recent_data', product_id, date, sales,
'AutoETS', 30, '1d', MAP{}
);
Option 2: Segment and Forecast by Regime
-- Identify regimes
CREATE TABLE regimes AS
WITH changes AS (
SELECT product_id, date
FROM ts_detect_changepoints_by('sales_data', product_id, date, sales, MAP{})
WHERE is_changepoint = true
)
SELECT
s.*,
SUM(CASE WHEN s.date IN (SELECT date FROM changes WHERE changes.product_id = s.product_id) THEN 1 ELSE 0 END)
OVER (PARTITION BY s.product_id ORDER BY s.date) as regime
FROM sales_data s;
-- Forecast each regime separately
SELECT * FROM ts_forecast_by(
'regimes', regime, date, sales,
'AutoETS', 30, '1d', MAP{}
);
Complete Changepoint Analysis
-- Comprehensive changepoint analysis
WITH changepoints AS (
SELECT
product_id,
date,
is_changepoint,
changepoint_probability
FROM ts_detect_changepoints_by(
'sales_data',
product_id,
date,
sales,
MAP{'hazard_lambda': '250'}
)
)
SELECT
product_id,
COUNT(*) FILTER (WHERE is_changepoint) AS n_changepoints,
MIN(date) FILTER (WHERE is_changepoint) AS first_changepoint,
MAX(date) FILTER (WHERE is_changepoint) AS last_changepoint,
ROUND(AVG(changepoint_probability) FILTER (WHERE is_changepoint), 3) AS avg_probability
FROM changepoints
GROUP BY product_id;