Regularized Models
AnoFox provides two regularized regression methods: Ridge (L2 penalty) and Elastic Net (combined L1+L2 penalty). Ridge shrinks coefficients toward zero without eliminating them, making it the right choice when VIF exceeds 5 for any predictor. Elastic Net adds L1 regularization for automatic feature selection, controlled by the l1_ratio parameter (0 = pure Ridge, 1 = pure Lasso), with convergence tunable via max_iterations (default: 1,000) and tolerance (default: 1e-4). Both methods support all four SQL integration patterns, including GROUP BY for per-segment models.
Regularized regression refers to a family of techniques that add penalty terms to the loss function to prevent overfitting and handle multicollinearity. The penalty shrinks coefficient estimates toward zero, trading a small increase in bias for a larger reduction in variance.
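The bias-variance trade is easiest to see in the one-predictor case, where the penalized minimizer has a closed form. The sketch below is plain Python for illustration (not AnoFox internals): it shows how increasing alpha pulls the coefficient toward zero.

```python
# Illustration only, not AnoFox internals: ridge with one predictor, no intercept.
# Ridge minimizes  sum((y - b*x)^2) + alpha * b^2,
# whose closed-form minimizer is  b = sum(x*y) / (sum(x^2) + alpha).

def ridge_1d(x, y, alpha):
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + alpha)

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]          # exact relation y = 2x

print(ridge_1d(x, y, 0.0))   # alpha = 0 recovers OLS: 2.0
print(ridge_1d(x, y, 14.0))  # moderate penalty shrinks it: 1.0
print(ridge_1d(x, y, 1e6))   # alpha -> infinity drives b toward 0
```

Note that the coefficient is shrunk but never exactly zero for finite alpha, which is the defining difference from the L1 penalty.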
Ridge - L2 Regularization
Ridge regression adds an L2 penalty (sum of squared coefficients) to the OLS loss function, shrinking all coefficients toward zero without eliminating any. The regularization strength is controlled by the alpha parameter: higher values produce more shrinkage.
Variants
- Scalar Fit: anofox_stats_ridge_fit(y, x, alpha, [options]) -> STRUCT
- Aggregate Fit: anofox_stats_ridge_fit_agg(y, x, alpha, [options]) -> STRUCT
- Window Predict: anofox_stats_ridge_fit_predict(y, x, [options]) OVER (...) -> STRUCT
- Batch Predict: anofox_stats_ridge_predict_agg(y, x, [options]) -> LIST(STRUCT)
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| y | LIST(DOUBLE) / DOUBLE | Yes | - | Target values |
| x | LIST(LIST(DOUBLE)) / LIST(DOUBLE) | Yes | - | Predictor matrix |
| alpha | DOUBLE | Yes | - | Regularization strength (0.01-10.0) |
| options | MAP | No | - | Configuration options |
Options MAP:
| Option | Type | Default | Description |
|---|---|---|---|
| fit_intercept | BOOLEAN | true | Include intercept term |
| compute_inference | BOOLEAN | false | Compute inference statistics |
| confidence_level | DOUBLE | 0.95 | Confidence level |
Example
```sql
SELECT
    region,
    (model).r_squared AS fit,
    (model).coefficients[2] AS price_elasticity
FROM (
    SELECT
        region,
        anofox_stats_ridge_fit_agg(
            sales,
            [price, promotion],
            0.5,
            MAP {'compute_inference': 'true'}
        ) AS model
    FROM regional_sales
    GROUP BY region
);
```
When to use Ridge:
- VIF > 5 for any predictor
- More predictors than observations
- Coefficients unstable across samples
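The VIF rule of thumb above can be checked directly. The helper below is hypothetical (not an AnoFox function) and uses the identity VIF = 1 / (1 - R²), where R² comes from regressing one predictor on the others; with a single other predictor, R² is just the squared correlation.

```python
# Hypothetical helper, not part of AnoFox: VIF for one predictor against one other.
# VIF = 1 / (1 - R^2); for a single explanatory variable, R^2 = squared correlation.

def vif_pair(x1, x2):
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    sxy = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    sxx = sum((a - m1) ** 2 for a in x1)
    syy = sum((b - m2) ** 2 for b in x2)
    r2 = sxy * sxy / (sxx * syy)        # squared correlation = R^2
    return 1.0 / (1.0 - r2)

price     = [1.0, 2.0, 3.0, 4.0, 5.0]
promotion = [1.1, 1.9, 3.2, 3.8, 5.1]   # nearly collinear with price
noise     = [2.0, 1.0, 3.0, 1.0, 2.0]   # unrelated to price

print(vif_pair(price, promotion))  # far above 5 -> Ridge is warranted
print(vif_pair(price, noise))      # close to 1 -> plain OLS is fine
```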
Elastic Net - Combined L1+L2
Elastic Net combines L1 (Lasso) and L2 (Ridge) penalties, enabling both coefficient shrinkage and automatic feature selection. The l1_ratio parameter controls the balance: 0 is pure Ridge, 1 is pure Lasso, and values in between provide a mixture.
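For a single standardized predictor, the elastic net update has a closed form built on the soft-thresholding operator, which makes the role of l1_ratio concrete: the L1 share sets a threshold that can zero a coefficient outright, while the L2 share only scales it down. The sketch below is a plain-Python illustration of that textbook update, not AnoFox's solver.

```python
# Textbook single-coefficient elastic net update (illustration, not AnoFox's solver).
# Given the unpenalized estimate z for a standardized predictor:
#   b = soft_threshold(z, alpha * l1_ratio) / (1 + alpha * (1 - l1_ratio))

def enet_update(z, alpha, l1_ratio):
    t = alpha * l1_ratio                     # L1 threshold: can zero b exactly
    denom = 1.0 + alpha * (1.0 - l1_ratio)   # L2 shrinkage: scales b, never zeroes
    s = max(abs(z) - t, 0.0)
    return (s if z >= 0 else -s) / denom

z = 0.3   # a small unpenalized coefficient
print(enet_update(z, 0.5, 1.0))  # pure Lasso: eliminated (0.0)
print(enet_update(z, 0.5, 0.0))  # pure Ridge: shrunk but kept (0.2)
print(enet_update(z, 0.5, 0.7))  # mostly L1: threshold 0.35 > 0.3, so eliminated
```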
Variants
- Scalar Fit: anofox_stats_elasticnet_fit(y, x, alpha, l1_ratio, [options]) -> STRUCT
- Aggregate Fit: anofox_stats_elasticnet_fit_agg(y, x, alpha, l1_ratio, [options]) -> STRUCT
- Window Predict: anofox_stats_elasticnet_fit_predict(y, x, [options]) OVER (...) -> STRUCT
- Batch Predict: anofox_stats_elasticnet_predict_agg(y, x, [options]) -> LIST(STRUCT)
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| y | LIST(DOUBLE) / DOUBLE | Yes | - | Target values |
| x | LIST(LIST(DOUBLE)) / LIST(DOUBLE) | Yes | - | Predictor matrix |
| alpha | DOUBLE | Yes | - | Overall regularization (0.01-10.0) |
| l1_ratio | DOUBLE | Yes | - | L1/L2 balance (0=Ridge, 1=Lasso) |
| options | MAP | No | - | Configuration options |
Options MAP:
| Option | Type | Default | Description |
|---|---|---|---|
| fit_intercept | BOOLEAN | true | Include intercept term |
| max_iterations | INTEGER | 1000 | Convergence limit |
| tolerance | DOUBLE | 1e-4 | Convergence threshold |
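max_iterations and tolerance correspond to the usual coordinate-descent stopping rule: cycle through the coefficients until the largest single-coefficient change falls below the threshold, or the iteration cap is hit. The sketch below is one common formulation of that loop in plain Python, shown for intuition; AnoFox's internal solver may differ in detail.

```python
# One common coordinate-descent formulation of elastic net (no intercept),
# shown for intuition about max_iterations / tolerance; not AnoFox's exact solver.

def enet_cd(X, y, alpha, l1_ratio, max_iterations=1000, tolerance=1e-4):
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(max_iterations):
        max_delta = 0.0
        for j in range(p):
            xj = [row[j] for row in X]
            # partial residual: y minus the fit from all features except j
            r = [y[i] - sum(X[i][k] * b[k] for k in range(p) if k != j)
                 for i in range(n)]
            z = sum(xj[i] * r[i] for i in range(n)) / n
            denom = sum(v * v for v in xj) / n + alpha * (1.0 - l1_ratio)
            s = max(abs(z) - alpha * l1_ratio, 0.0)   # soft-thresholding (L1 part)
            new = (s if z >= 0 else -s) / denom
            max_delta = max(max_delta, abs(new - b[j]))
            b[j] = new
        if max_delta < tolerance:   # converged before exhausting max_iterations
            break
    return b

X = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.5], [4.0, -0.5]]
y = [2.0, 4.0, 6.0, 8.0]            # y = 2 * first column
print(enet_cd(X, y, alpha=0.01, l1_ratio=0.5))  # b[0] near 2, b[1] zeroed out
```

A loose tolerance (or a low iteration cap) trades accuracy for speed; the defaults above mirror the option defaults in the table.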
Example
```sql
SELECT anofox_stats_elasticnet_fit_agg(
    y,
    [x1, x2, x3, x4, x5],
    0.5,  -- alpha: regularization strength
    0.7   -- l1_ratio: 70% L1, 30% L2
) AS model
FROM high_dim_data;
```
When to use Elastic Net:
- High-dimensional data (many predictors)
- Feature selection needed (sparse solutions)
- Correlated predictors where variable selection is still needed (pure Lasso tends to pick one of a correlated group arbitrarily; the L2 term stabilizes the choice)