AnoFox provides five correlation methods spanning linear (Pearson), rank-based (Spearman, Kendall), nonlinear (distance correlation), and reliability (Intraclass Correlation Coefficient, ICC, with three model types). Distance correlation has a property that Pearson lacks: for variables with finite first moments, it equals zero if and only if the two variables are statistically independent, whereas Pearson's r detects only linear relationships. ICC supports one-way, two-way random, and two-way mixed models for inter-rater reliability assessment. Every function with a documented output returns the coefficient and the sample size; Pearson, Spearman, and Kendall also return a p-value, and Pearson additionally returns a t-statistic and confidence interval bounds.
Correlation is a statistical measure that quantifies the strength and direction of the relationship between two variables. For Pearson, Spearman, and Kendall, a value of +1 indicates a perfect positive relationship, -1 a perfect negative relationship, and 0 the absence of the kind of relationship the method measures (a Pearson r of 0 rules out a linear relationship, not every relationship). Distance correlation instead ranges from 0 to 1. Different correlation methods capture different types of relationships: linear (Pearson), monotonic (Spearman, Kendall), or general dependence (distance correlation).
Linear Correlation
Pearson Correlation
Measures the strength and direction of the linear relationship between two variables and tests its significance.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
| options | MAP | No | - | Configuration options |
Output
| Field | Type | Description |
|---|---|---|
| r | DOUBLE | Pearson correlation (-1 to 1) |
| p_value | DOUBLE | p-value |
| t_statistic | DOUBLE | t-statistic |
| ci_lower | DOUBLE | CI lower bound |
| ci_upper | DOUBLE | CI upper bound |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_pearson_agg(x, y) as result
FROM data;
Interpretation:
| Absolute r value | Strength |
|---|---|
| 0.0 - 0.3 | Weak |
| 0.3 - 0.7 | Moderate |
| 0.7 - 1.0 | Strong |
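The thresholds above can be applied directly in a query. The following is a minimal sketch, not a confirmed usage pattern: it assumes the function returns a STRUCT whose fields match the Output table (so that `result.r` works), that it can be combined with `GROUP BY` like other aggregates, and that a hypothetical table `measurements(group_id, x, y)` exists.

```sql
-- Hypothetical table: measurements(group_id VARCHAR, x DOUBLE, y DOUBLE).
-- Assumption: the aggregate returns a STRUCT with the fields listed under Output.
WITH per_group AS (
    SELECT
        group_id,
        anofox_stats_pearson_agg(x, y) AS result
    FROM measurements
    GROUP BY group_id
)
SELECT
    group_id,
    result.r       AS r,
    result.p_value AS p_value,
    result.n       AS n,
    -- Label the strength by |r|; a value exactly on a boundary falls into
    -- the higher band here, a choice the table itself leaves open.
    CASE
        WHEN result.r IS NULL      THEN NULL
        WHEN abs(result.r) < 0.3   THEN 'Weak'
        WHEN abs(result.r) < 0.7   THEN 'Moderate'
        ELSE 'Strong'
    END AS strength
FROM per_group;
```

Taking the absolute value matters because a strongly negative correlation such as -0.8 is just as strong as +0.8.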
Rank Correlations
Spearman Rank Correlation
Measures monotonic association by correlating ranks, which makes it more robust to outliers than Pearson.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
| options | MAP | No | - | Configuration options |
Output
| Field | Type | Description |
|---|---|---|
| rho | DOUBLE | Spearman's rho |
| p_value | DOUBLE | p-value |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_spearman_agg(x, y) as result
FROM data;
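To see what robustness to outliers means in practice, the sketch below runs Pearson and Spearman on the same inline toy data. The data values are made up for illustration; they increase strictly, but the last point is an extreme outlier. The explicit casts to DOUBLE are a precaution, since the parameter tables specify DOUBLE inputs.

```sql
-- Illustrative toy data (not from the AnoFox documentation):
-- y grows steadily with x, except for one extreme value at x = 10.
WITH data AS (
    SELECT CAST(x AS DOUBLE) AS x, CAST(y AS DOUBLE) AS y
    FROM (VALUES
        (1, 2.1), (2, 3.9), (3, 6.2), (4, 8.1), (5, 9.8),
        (6, 12.3), (7, 13.9), (8, 16.2), (9, 18.0), (10, 95.0)
    ) AS t(x, y)
)
SELECT
    anofox_stats_pearson_agg(x, y)  AS pearson,
    anofox_stats_spearman_agg(x, y) AS spearman
FROM data;
```

Because the toy data are strictly increasing, the ranks of x and y agree perfectly, so Spearman's rho is 1 by definition, while the outlier pulls Pearson's r below 1.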
Kendall's Tau
Rank correlation based on concordant and discordant pairs; handles ties well.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
| options | MAP | No | - | Configuration options |
Output
| Field | Type | Description |
|---|---|---|
| tau | DOUBLE | Kendall's tau |
| p_value | DOUBLE | p-value |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_kendall_agg(x, y) as result
FROM data;
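Ties are typical of ordinal data such as survey scores. The sketch below uses a hypothetical inline `survey` data set with integer scores on a 1 to 5 scale, cast to DOUBLE to match the documented parameter types, and computes Kendall's tau next to Spearman's rho for comparison. Which tau variant is computed (for example, tau-b) is not stated in this document, so no specific output is claimed here.

```sql
-- Hypothetical ordinal data with many tied scores (1-5 scale).
WITH survey AS (
    SELECT
        CAST(satisfaction AS DOUBLE) AS satisfaction,
        CAST(loyalty AS DOUBLE)      AS loyalty
    FROM (VALUES
        (1, 1), (2, 1), (2, 2), (3, 2), (3, 3),
        (3, 3), (4, 3), (5, 4), (5, 5)
    ) AS t(satisfaction, loyalty)
)
SELECT
    anofox_stats_kendall_agg(satisfaction, loyalty)  AS kendall,
    anofox_stats_spearman_agg(satisfaction, loyalty) AS spearman
FROM survey;
```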
Nonlinear Correlation
Distance Correlation
Detects nonlinear relationships using the distance correlation measure.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| x | DOUBLE | Yes | - | First variable |
| y | DOUBLE | Yes | - | Second variable |
Output
| Field | Type | Description |
|---|---|---|
| dcor | DOUBLE | Distance correlation (0 to 1) |
| dcov | DOUBLE | Distance covariance |
| n | BIGINT | Sample size |
Example
SELECT anofox_stats_distance_cor_agg(x, y) as result
FROM data;
Key property: distance correlation is 0 if and only if X and Y are independent, whereas a Pearson correlation of 0 only rules out a linear relationship.
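A symmetric parabola makes this property concrete. The sketch below generates y = x² for integer x from -10 to 10 using DuckDB's built-in `range` table function and runs both measures on it. The CTE name `parabola` and the column alias `i` are illustrative choices.

```sql
-- y is fully determined by x, but the relationship is not linear.
WITH parabola AS (
    SELECT
        CAST(i AS DOUBLE)     AS x,
        CAST(i * i AS DOUBLE) AS y
    FROM range(-10, 11) AS t(i)   -- integers -10 .. 10
)
SELECT
    anofox_stats_pearson_agg(x, y)      AS pearson,
    anofox_stats_distance_cor_agg(x, y) AS distance
FROM parabola;
```

Because x is symmetric around zero, the sample covariance of x and x² is exactly zero, so Pearson's r is 0 (up to floating-point error) even though y depends entirely on x. Distance correlation should be clearly positive for this data.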
Reliability Measures
Intraclass Correlation (ICC)
Measures reliability, or agreement, among raters who score the same subjects.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| value | DOUBLE | Yes | - | Rating value |
| rater_id | INTEGER | Yes | - | Rater identifier |
| subject_id | INTEGER | Yes | - | Subject identifier |
| options | MAP | No | - | Model configuration |
Options MAP:
| Option | Type | Default | Description |
|---|---|---|---|
| model | VARCHAR | two_way_random | one_way, two_way_random, two_way_mixed |
Example
SELECT anofox_stats_icc_agg(
    value,
    rater_id,
    subject_id,
    MAP {'model': 'two_way_random'}
) as result
FROM rating_data;
Interpretation:
| ICC value | Reliability |
|---|---|
| < 0.5 | Poor |
| 0.5 - 0.75 | Moderate |
| 0.75 - 0.9 | Good |
| > 0.9 | Excellent |
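The ICC function takes its input in long format, with one row per (subject, rater) rating. The sketch below builds a small, made-up `rating_data` table for a fully crossed design with four subjects each scored by the same three raters, then evaluates the three documented `model` values side by side. The table contents are purely illustrative, and the layout of the returned result is not documented in this section, so each result is left as a single value.

```sql
-- Illustrative long-format data: one row per (subject, rater) rating.
CREATE TABLE rating_data (
    subject_id INTEGER,
    rater_id   INTEGER,
    value      DOUBLE
);

INSERT INTO rating_data VALUES
    (1, 1, 7.0), (1, 2, 8.0), (1, 3, 7.5),
    (2, 1, 4.0), (2, 2, 5.0), (2, 3, 4.5),
    (3, 1, 9.0), (3, 2, 9.5), (3, 3, 9.0),
    (4, 1, 2.0), (4, 2, 3.0), (4, 3, 2.5);

-- Compare the three documented model types on the same ratings.
SELECT
    anofox_stats_icc_agg(value, rater_id, subject_id,
                         MAP {'model': 'one_way'})        AS icc_one_way,
    anofox_stats_icc_agg(value, rater_id, subject_id,
                         MAP {'model': 'two_way_random'}) AS icc_two_way_random,
    anofox_stats_icc_agg(value, rater_id, subject_id,
                         MAP {'model': 'two_way_mixed'})  AS icc_two_way_mixed
FROM rating_data;
```

As a general guideline, a one-way model suits designs where each subject may be rated by a different set of raters, a two-way random model treats the raters as a sample from a larger population, and a two-way mixed model treats them as the only raters of interest.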
Choosing a Correlation Method
| Scenario | Recommended |
|---|---|
| Linear relationship, normal data | Pearson |
| Outliers present | Spearman |
| Ordinal data | Spearman or Kendall |
| Many ties | Kendall |
| Nonlinear relationship | Distance Correlation |
| Rater agreement | ICC |