Categorical Tests
AnoFox provides 9 functions for categorical data analysis: 5 independence and goodness-of-fit tests (chi-square independence, chi-square goodness of fit, G-test, Fisher's exact for 2x2 tables, and McNemar's for paired before/after data) and 4 effect size measures (Cramér's V, the phi coefficient, the contingency coefficient, and Cohen's kappa for inter-rater agreement). Fisher's exact test is the correct choice for small-sample 2x2 tables, where the chi-square approximation breaks down. Cohen's kappa is reported on a -1 to 1 scale, where values of 0.8 and above indicate almost perfect agreement between raters.
Categorical tests are statistical methods for analyzing data where variables take on discrete categories rather than continuous values. These tests determine whether observed frequencies differ significantly from expected frequencies (goodness-of-fit), whether two categorical variables are related (independence), or whether raters agree on their classifications (agreement).
Independence Tests
Chi-Square Test of Independence
Tests whether two categorical variables are associated.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| row_var | INTEGER | Yes | - | Row category |
| col_var | INTEGER | Yes | - | Column category |
| options | MAP | No | - | `yates_correction` (default: false) |
Output
| Field | Type | Description |
|---|---|---|
| chi2 | DOUBLE | Chi-square statistic |
| p_value | DOUBLE | p-value |
| df | BIGINT | Degrees of freedom |
| cramers_v | DOUBLE | Effect size |
Example
SELECT anofox_stats_chisq_test_agg(
row_var,
col_var
) as result
FROM contingency_data;
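The statistic the aggregate reports can be sketched outside SQL. The following Python snippet is a minimal illustration with made-up counts, assuming NumPy and SciPy are available; it is not the AnoFox implementation, but it computes the same chi-square statistic and Cramér's V from a contingency table:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x3 contingency table: rows = row_var levels, cols = col_var levels
table = np.array([[20, 30, 25],
                  [15, 25, 35]])

# correction=False mirrors the default yates_correction: false
chi2, p_value, df, expected = chi2_contingency(table, correction=False)

# Cramér's V: chi-square scaled by sample size and table dimensions
n = table.sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))
```

For a 2x3 table the degrees of freedom are (2 - 1) * (3 - 1) = 2, matching the `df` field in the output.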
Chi-Square Goodness of Fit
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| observed | INTEGER | Yes | - | Observed counts |
| expected | DOUBLE | Yes | - | Expected counts |
Example
SELECT anofox_stats_chisq_gof_agg(
observed,
expected
) as result
FROM frequency_data;
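As a cross-check of what the goodness-of-fit statistic measures, here is a hedged Python sketch with invented counts (SciPy assumed available; not the AnoFox code path):

```python
from scipy.stats import chisquare

observed = [18, 22, 28, 32]          # observed counts per category
expected = [25.0, 25.0, 25.0, 25.0]  # expected counts under a uniform null

# chi2 = sum((O - E)^2 / E), with k - 1 degrees of freedom
stat, p_value = chisquare(f_obs=observed, f_exp=expected)
```

Here stat = (49 + 9 + 9 + 49) / 25 = 4.64 with 3 degrees of freedom.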
G-Test (Log-Likelihood Ratio)
Log-likelihood ratio alternative to the chi-square test. Prefer it when a likelihood-based statistic is wanted; for small 2x2 samples, Fisher's exact test remains the better choice.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| row_var | INTEGER | Yes | - | Row category |
| col_var | INTEGER | Yes | - | Column category |
Example
SELECT anofox_stats_g_test_agg(
row_var,
col_var
) as result
FROM data;
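The G statistic is G = 2 * sum(O * ln(O / E)) over all cells. A minimal Python sketch (invented counts, SciPy assumed; SciPy exposes the G-test through its chi-square routine):

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[10, 20],
                  [30, 40]])

# lambda_="log-likelihood" switches the statistic from chi-square to G
g, p_value, df, _ = chi2_contingency(table, lambda_="log-likelihood", correction=False)
```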
Fisher's Exact Test
Exact independence test for 2x2 tables; valid even when expected cell counts are small.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| row_var | INTEGER | Yes | - | Row category |
| col_var | INTEGER | Yes | - | Column category |
| options | MAP | No | - | `alternative` hypothesis setting |
Example
SELECT anofox_stats_fisher_exact_agg(
row_var,
col_var
) as result
FROM data;
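The exact p-value comes from the hypergeometric distribution rather than a chi-square approximation. A Python sketch with deliberately tiny invented counts (SciPy assumed; not the AnoFox implementation):

```python
from scipy.stats import fisher_exact

# Small-count 2x2 table where the chi-square approximation is unreliable
table = [[3, 1],
         [1, 3]]

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
```

For this table the two-sided p-value is 34/70 (about 0.486): no evidence of association despite the apparent pattern, which is exactly the situation where a chi-square test on such small counts would be misleading.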
McNemar's Test
Tests for a change in paired categorical data (before/after measurements on the same subjects).
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| var1 | INTEGER | Yes | - | Before measurement |
| var2 | INTEGER | Yes | - | After measurement |
| options | MAP | No | - | Configuration options |
Example
SELECT anofox_stats_mcnemar_agg(
var1,
var2
) as result
FROM paired_categorical;
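McNemar's statistic depends only on the discordant pairs, i.e. subjects whose classification changed. A minimal Python sketch of the uncorrected version (counts invented, SciPy assumed):

```python
from scipy.stats import chi2

# Discordant pair counts from the paired 2x2 table:
# b = changed from 0 to 1, c = changed from 1 to 0
b, c = 15, 5

# Uncorrected McNemar statistic, chi-square distributed with 1 df
stat = (b - c) ** 2 / (b + c)
p_value = chi2.sf(stat, df=1)
```

With b = 15 and c = 5 the statistic is (15 - 5)^2 / 20 = 5.0.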
Effect Size Measures
Quantify the magnitude of associations.
Cramér's V
Effect size for chi-square test.
SELECT anofox_stats_cramers_v_agg(row_var, col_var) as v
FROM data;
Interpretation:
| V value | Effect Size |
|---|---|
| 0.0 - 0.1 | Negligible |
| 0.1 - 0.3 | Small |
| 0.3 - 0.5 | Medium |
| > 0.5 | Large |
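The formula behind the interpretation table is V = sqrt(chi2 / (n * (min(r, c) - 1))), where r and c are the table dimensions. A small Python sketch with invented counts (SciPy/NumPy assumed; not the AnoFox implementation):

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[30, 10],
                  [10, 30]])

chi2, _, _, _ = chi2_contingency(table, correction=False)
n = table.sum()
v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))  # 0.5: the medium/large boundary
```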
Phi Coefficient
Effect size for 2x2 tables.
SELECT anofox_stats_phi_coefficient_agg(row_var, col_var) as phi
FROM data;
Contingency Coefficient
Pearson's contingency coefficient, a chi-square-based association measure scaled between 0 and 1 (its maximum depends on the table dimensions).
SELECT anofox_stats_contingency_coef_agg(row_var, col_var) as c
FROM data;
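For a 2x2 table both measures reduce to closed forms. The sketch below (hypothetical counts, Python standard library only) computes phi directly from the cell counts, and Pearson's contingency coefficient C = sqrt(chi2 / (chi2 + n)) from the implied chi-square:

```python
import math

# Hypothetical 2x2 cell counts: [[a, b], [c, d]]
a, b, c, d = 30, 10, 10, 30
n = a + b + c + d

# Phi: signed association measure for 2x2 tables
phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

# For 2x2 tables chi2 = n * phi^2, so C follows directly
chi2 = n * phi ** 2
contingency_c = math.sqrt(chi2 / (chi2 + n))
```

Unlike Cramér's V, phi is signed, so it also indicates the direction of the association in a 2x2 table.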
Cohen's Kappa
Inter-rater agreement measure.
Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| rater1 | INTEGER | Yes | - | First rater's classification |
| rater2 | INTEGER | Yes | - | Second rater's classification |
Output
| Field | Type | Description |
|---|---|---|
| kappa | DOUBLE | Cohen's kappa (-1 to 1) |
| p_value | DOUBLE | p-value |
| se | DOUBLE | Standard error |
| ci_lower | DOUBLE | CI lower bound |
| ci_upper | DOUBLE | CI upper bound |
Example
SELECT anofox_stats_cohen_kappa_agg(
rater1,
rater2
) as result
FROM rating_data;
Interpretation:
| Kappa | Agreement |
|---|---|
| < 0 | Less than chance |
| 0.0 - 0.2 | Slight |
| 0.2 - 0.4 | Fair |
| 0.4 - 0.6 | Moderate |
| 0.6 - 0.8 | Substantial |
| 0.8 - 1.0 | Almost perfect |
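Kappa corrects the observed agreement for the agreement expected by chance from each rater's marginal label frequencies. A self-contained Python sketch with invented labels (it computes only the point estimate, not the p-value and confidence interval the aggregate also reports):

```python
from collections import Counter

rater1 = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
rater2 = ["yes", "no",  "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]

n = len(rater1)
observed = sum(x == y for x, y in zip(rater1, rater2)) / n  # raw agreement

# Chance agreement from each rater's marginal label frequencies
m1, m2 = Counter(rater1), Counter(rater2)
expected = sum(m1[k] * m2[k] for k in m1) / n ** 2

kappa = (observed - expected) / (1 - expected)
```

Here the raw agreement is 0.8, chance agreement is 0.52, and kappa is about 0.58: "Moderate" per the table above, noticeably weaker than the raw agreement alone would suggest.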
Choosing a Test
| Scenario | Recommended |
|---|---|
| 2+ categories, large samples | Chi-Square |
| 2x2 table, small samples | Fisher's Exact |
| Before/after paired data | McNemar's |
| Prefer likelihood-based | G-Test |
| Quantify association | Cramér's V |
| Rater agreement | Cohen's Kappa |